Re: [slurm-dev] gres.conf's "File=" flag ignore

Nicolas Bigaouette Sun, 29 Jan 2012 09:26:21 -0800

Hi Moe, thanks for the pointers.

On Tue, Jan 24, 2012 at 4:15 PM, Moe Jette <je...@schedmd.com> wrote:


> In the case of gres/gpu, it just sets the CUDA_VISIBLE_DEVICES environment
> variable based upon the position(s) in the bitmap allocated to the job or
> step.
>

> Probably the simplest way to get the correct environment variable would be
> to modify node_config_load() to cache device numbers and then use those
> device numbers rather than the bitmap index to set CUDA_VISIBLE_DEVICES
> values in job_set_env() and step_set_env()
>

This is exactly what I'm trying to do. I'm familiarizing myself with the
code and experimenting with it. Unfortunately, I don't see how
information can be transferred from node_config_load() to job_set_env().
node_config_load() only takes the file entries as input arguments and has
no output variables. Also, a variable global to the gres_gpu.c
file does not work because, it seems, job_set_env() is executed in a
different process than node_config_load() and so does not share its memory.
It might not be exactly this situation, but the memory is definitely not
shared, and thus job_set_env() cannot access variables set by
node_config_load().
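
To make the goal concrete, here is a minimal standalone sketch of the
mapping Moe describes. It is not the gres_gpu.c code and does not use the
plugin API. The helper names (dev_num_from_path, build_visible_devices) are
hypothetical, and it assumes the number to expose is the numeric suffix of
each "File=" path (for example /dev/nvidia3 gives 3).

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return the trailing decimal number of a device
 * path such as "/dev/nvidia3", or -1 if the path has no numeric suffix. */
static int dev_num_from_path(const char *path)
{
    assert(path != NULL);
    size_t len = strlen(path);
    size_t i = len;

    /* Walk backwards over the trailing digits. */
    while (i > 0 && isdigit((unsigned char)path[i - 1]))
        i--;
    if (i == len)
        return -1;
    return atoi(path + i);
}

/* Hypothetical helper: build a CUDA_VISIBLE_DEVICES value from the
 * allocated bitmap positions (alloc_idx), translating each position
 * through the cached device-number table (dev_nums) instead of using
 * the position itself. Returns 0 on success, -1 on a bad index or if
 * the output buffer is too small. */
static int build_visible_devices(const int *dev_nums, int dev_cnt,
                                 const int *alloc_idx, int alloc_cnt,
                                 char *out, size_t out_len)
{
    size_t used = 0;

    if (out_len == 0)
        return -1;
    out[0] = '\0';
    for (int i = 0; i < alloc_cnt; i++) {
        int idx = alloc_idx[i];
        if (idx < 0 || idx >= dev_cnt)
            return -1;
        int n = snprintf(out + used, out_len - used, "%s%d",
                         i ? "," : "", dev_nums[idx]);
        if (n < 0 || (size_t)n >= out_len - used)
            return -1;          /* output would be truncated */
        used += (size_t)n;
    }
    return 0;
}
```

With "File=" entries /dev/nvidia1, /dev/nvidia3 and /dev/nvidia5 and bitmap
positions 1 and 2 allocated, this yields "3,5", whereas using the bitmap
positions directly would yield "1,2".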

So either there is a simple way to share that information that I have
not found, or the information will have to be passed through function
arguments. But then that would change the API...
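
If the two functions really do run in separate processes, the table would
have to be serialized on one side and rebuilt on the other, whatever the
transport ends up being. The following is only a generic sketch of that
idea, not Slurm's actual mechanism. The function names and the wire format
(a count followed by that many 32-bit big-endian device numbers) are
assumptions made for illustration.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical: pack cnt device numbers into buf as
 * [cnt][dev0][dev1]..., each a 32-bit big-endian integer.
 * Returns the number of bytes written, or 0 if buf is too small. */
static size_t pack_dev_nums(const uint32_t *devs, uint32_t cnt,
                            unsigned char *buf, size_t buf_len)
{
    assert(buf != NULL);
    size_t need = ((size_t)cnt + 1) * sizeof(uint32_t);
    uint32_t v;

    if (need > buf_len)
        return 0;
    v = htonl(cnt);
    memcpy(buf, &v, sizeof v);
    for (uint32_t i = 0; i < cnt; i++) {
        v = htonl(devs[i]);
        memcpy(buf + ((size_t)i + 1) * sizeof v, &v, sizeof v);
    }
    return need;
}

/* Hypothetical: inverse of pack_dev_nums(). Fills devs with at most
 * max_cnt entries. Returns the number of entries read, or -1 if the
 * buffer is malformed or holds more entries than max_cnt. */
static int unpack_dev_nums(const unsigned char *buf, size_t buf_len,
                           uint32_t *devs, uint32_t max_cnt)
{
    uint32_t v, cnt;

    if (buf_len < sizeof v)
        return -1;
    memcpy(&v, buf, sizeof v);
    cnt = ntohl(v);
    if (cnt > max_cnt || ((size_t)cnt + 1) * sizeof v > buf_len)
        return -1;
    for (uint32_t i = 0; i < cnt; i++) {
        memcpy(&v, buf + ((size_t)i + 1) * sizeof v, sizeof v);
        devs[i] = ntohl(v);
    }
    return (int)cnt;
}
```

The side that parses gres.conf would call pack_dev_nums() and hand the
bytes to the other process, which would call unpack_dev_nums() before
setting the environment variable.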

I hope I'm just missing something obvious somewhere! How is
node_config_load() supposed to configure anything if job_set_env() cannot
access that information?

Thanks

Nicolas
