Hi Moe,

Thanks for the hints.

Here's a patch set that implements option 3). It is branched from
9a48840da4feb7a5810b3024886423b38cdb3bb7 and also available on my GitHub:
https://github.com/nbigaouette/slurm

You will also find attached a patch (0001-Ignore-temp-files.patch) which
tells git to ignore built files.

Options 1) and 2) require a deeper understanding of the code than I have
for now. Reading /etc/slurm/gres.conf each time a job is run is definitely
not ideal, but it shouldn't make things noticeably slower, except maybe for
a large number of small jobs (lasting a couple of seconds each, say).

Nicolas


On Mon, Jan 30, 2012 at 1:31 PM, Moe Jette <[email protected]> wrote:

> The gres.conf file is read by the slurmd daemon, while task launch and
> claiming GRES are done by the slurmstepd job step shepherd. Here are some
> options:
> 1. Move the gres_plugin_step_set_env() and gres_plugin_job_set_env() calls
> from slurmstepd to slurmd (this is probably the most efficient solution,
> but could break gres plugins that other people have developed)
> 2. Add a new gres plugin call for the slurmd to set the CUDA environment
> variables based upon file names that it already has and leave the other
> function calls in slurmstepd (more work, will not break any gres plugins
> developed by other people but still very efficient) OR
> 3. Modify slurmstepd to read gres.conf to get the nvidia device numbers
> (the simplest solution, but requires extra overhead for each job launch)
>
>
>
> Quoting Nicolas Bigaouette <[email protected]>:
>
>> Hi Moe, thanks for the indications.
>>
>> On Tue, Jan 24, 2012 at 4:15 PM, Moe Jette <[email protected]> wrote:
>>
>>> In the case of gres/gpu, it just sets the CUDA_VISIBLE_DEVICES
>>> environment variable based upon the position(s) in the bitmap allocated
>>> to the job or step.
>>>
>>> Probably the simplest way to get the correct environment variable would
>>> be to modify node_config_load() to cache device numbers and then use
>>> those device numbers rather than the bitmap index to set
>>> CUDA_VISIBLE_DEVICES values in job_set_env() and step_set_env()
>>
>> This is exactly what I'm trying to do. I'm familiarizing myself with the
>> code and experimenting with some things. Unfortunately, I don't see how
>> information can be "transferred" from node_config_load() to
>> job_set_env(). node_config_load() only takes the file entries as input
>> arguments and has no output variables. A variable global to the
>> gres_gpu.c file does not work either: it seems job_set_env() is executed
>> in a different process than node_config_load() and as such does not
>> share memory with it. It might not be exactly this situation, but the
>> memory is definitely not shared, and thus job_set_env() cannot access
>> variables set by node_config_load().
>>
>> So either there is a simple way to share that information which I have
>> not found, or the information will have to be passed through function
>> arguments. But then that would change the API...
>>
>> I hope I'm just missing something obvious somewhere! How is
>> node_config_load() supposed to configure anything if job_set_env() can't
>> have access to that information?
>>
>> Thanks
>>
>> Nicolas
>>

Attachment: 0001-Load-node-s-gres-configuration-before-setting-its-en.patch
Description: Binary data

Attachment: 0002-Keep-node-s-gres-devices-list-in-cache.patch
Description: Binary data

Attachment: 0003-Use-the-cached-gres-devices-list.patch
Description: Binary data

Attachment: 0001-Ignore-temp-files.patch
Description: Binary data
