Well,
I'm almost there, but would appreciate one last tip.

I was (partially) wrong in my previous explanation of why I was
not getting CUDA_VISIBLE_DEVICES at prolog time. If I submit a job
asking for 2 nodes with 1 GPU card each, the variable is defined
on node 2 (again, at prolog time) but not on node 1.
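
One way to confirm this per node is to have the prolog log what it
sees. The following is only a minimal sketch, not anything shipped
with Slurm: the function name describe_env and the optional log-file
argument are made up for illustration, and it assumes SLURM_JOB_ID is
exported to the prolog environment.

```python
#!/usr/bin/env python3
"""Prolog diagnostic sketch: report which variables this node's prolog can see."""
import os
import socket
import sys

# Variables of interest. CUDA_SET_COMPUTE_MODE is the user-defined switch
# mentioned in this thread; SLURM_JOB_ID is assumed to be exported to the prolog.
WATCHED = ("SLURM_JOB_ID", "CUDA_VISIBLE_DEVICES", "CUDA_SET_COMPUTE_MODE")


def describe_env(environ, host):
    """Return one log line with each watched variable, or '<unset>' if missing."""
    fields = [f"{name}={environ.get(name, '<unset>')}" for name in WATCHED]
    return f"{host}: " + " ".join(fields)


if __name__ == "__main__":
    line = describe_env(os.environ, socket.gethostname())
    if len(sys.argv) > 1:
        # Hypothetical usage: pass a log file path as the first argument.
        with open(sys.argv[1], "a") as log:
            log.write(line + "\n")
    else:
        print(line)
```

Run from the prolog on every allocated node with a log path on a
shared filesystem, this gives one line per node, which makes a
difference between node 1 and node 2 easy to see.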

This suggests to me that some "build_env" is run only if it is a "new
node"... which is something I remember seeing while browsing the
code over the last few days.

Does any of you remember where that was?

Thanks in advance,
Albert

On 25/11/13 20:01, Albert Solernou wrote:
> 
> Hi all,
> I may need some extra help.
> 
> I successfully modified req.c to pass to the "prolog environment" a 
> user environment variable that I defined, say CUDA_SET_COMPUTE_MODE. 
> However, I am still missing CUDA_VISIBLE_DEVICES.
> 
> When slurmd goes through _rpc_batch_job it runs the prolog. However, 
> CUDA_VISIBLE_DEVICES is not there yet (the slurm_msg_t that the 
> function handles does not have this variable within the 
> req->environment). It will be later, when slurmd passes through 
> _rpc_launch_tasks that $CUDA_VISIBLE_DEVICES is set (in its req->env), 
> but now it is too late.
> 
> Could you give me some hints on how to get CUDA_VISIBLE_DEVICES in 
> req.c:_rpc_batch_job? That would definitely speed things up.
> 
> Thanks in advance,
> Albert
> 
> 
> 
> 
> On Wed 20 Nov 2013 14:15:12 GMT, Albert Solernou wrote:
>>
>> Thanks for the quick answer, Moe.
>>
>> I'll try that and let you know.
>>
>> Best,
>> Albert
>>
>> On Wed 20 Nov 2013 14:09:12 GMT, [email protected] wrote:
>>>
>>> Your easiest option would be to modify the Slurm code to export
>>> whatever additional environment variables that you want, which should
>>> be pretty simple. See the function _build_env() in
>>> src/slurmd/slurmd/req.c. If you make changes and send us the patch, we
>>> can include it in the canonical code base.
>>>
>>> Moe Jette
>>> SchedMD LLC
>>>
>>> On 2013-11-20 05:05, Albert Solernou wrote:
>>>> Hi,
>>>> I'd like to write a prolog script that changes the GPU compute mode of
>>>> the allocated GPU card(s). This change can only be done by root. My
>>>> initial idea was that the prolog script would use an environment variable
>>>> as a switch.
>>>>
>>>> The problem that I face is:
>>>>  - prolog or prologctld have a reduced set of environment variables.
>>>> Specifically, they miss "CUDA_VISIBLE_DEVICES" assigned by the GRes
>>>> plugin, as well as any user environment flag.
>>>>
>>>>
>>>> Is there an easy workaround? Will I have to patch the current GRes
>>>> plugin or tinker with a new SPANK plugin?
>>>>
>>>> Any help is welcome!
>>>>
>>>> Regards,
>>>> Albert
>>
>> --
>> ---------------------------------
>>   Dr. Albert Solernou
>>   Research Associate
>>   Oxford Supercomputing Centre,
>>   University of Oxford
>>   Tel: +44 (0)1865 610631
>> ---------------------------------
> 
> 

