Hi Moe,
Thanks for the reply. Moving the call to gres_plugin_job_set_env() is
definitely non-trivial; however, I can imagine a possible workaround:

 _rpc_launch_tasks() in req.c runs the prolog under this condition:
  if (!slurm_cred_jobid_cached(conf->vctx, req->job_id))
(line 1073 in req.c, v2.6.3)
My understanding is that _rpc_batch_job() runs the prolog on its own
because, by the time _rpc_launch_tasks() is reached, this condition
evaluates to false. If that is the case, and there is no other reason to
run the prolog early in _rpc_batch_job(), then we should be able to find
a new condition so that the prolog is only called from
_rpc_launch_tasks().

Could you confirm my assumption about the prolog calls? Do you have any
suggestions on how this new condition should be formulated?

Thanks,
Albert



On 27/11/13 23:08, Moe Jette wrote:
> 
> This is definitely a non-trivial change. The call to the function
> gres_plugin_job_set_env() would need to be moved from the slurmstepd
> process to the slurmd daemon (before the prolog runs) and then that
> environment variable would need to be passed to the prolog.
> 
> Moe Jette
> SchedMD LLC
> 
> Quoting Albert Solernou <[email protected]>:
> 
>>
>> Hi all,
>> I may need some extra help.
>>
>> I successfully modified req.c to pass a user-defined environment
>> variable, say CUDA_SET_COMPUTE_MODE, into the "prolog environment".
>> However, I am still missing CUDA_VISIBLE_DEVICES.
>>
>> When slurmd goes through _rpc_batch_job it runs the prolog. However,
>> CUDA_VISIBLE_DEVICES is not there yet (the slurm_msg_t that the
>> function handles does not carry this variable in req->environment).
>> It is only later, when slurmd passes through _rpc_launch_tasks, that
>> CUDA_VISIBLE_DEVICES is set (in its req->env), but by then it is too
>> late.
>>
>> Could you give me some hints on how to get CUDA_VISIBLE_DEVICES in
>> req.c:_rpc_batch_job? That would definitely speed things up.
>>
>> Thanks in advance,
>> Albert
>>
>>
>>
>>
>> On Wed 20 Nov 2013 14:15:12 GMT, Albert Solernou wrote:
>>>
>>> Thanks for the quick answer, Moe.
>>>
>>> I'll try that and let you know.
>>>
>>> Best,
>>> Albert
>>>
>>> On Wed 20 Nov 2013 14:09:12 GMT, [email protected] wrote:
>>>>
>>>> Your easiest option would be to modify the Slurm code to export
>>>> whatever additional environment variables that you want, which should
>>>> be pretty simple. See the function _build_env() in
>>>> src/slurmd/slurmd/req.c. If you make changes and send us the patch, we
>>>> can include it in the canonical code base.
>>>>
>>>> Moe Jette
>>>> SchedMD LLC
>>>>
>>>> On 2013-11-20 05:05, Albert Solernou wrote:
>>>>> Hi,
>>>>> I'd like to write a prolog script that changes the GPU compute mode
>>>>> of the allocated GPU card(s). This change can only be done by root.
>>>>> My initial idea was that the prolog script would use an environment
>>>>> variable as a switch.
>>>>>
>>>>> The problems that I face are:
>>>>>  - Prolog and PrologSlurmctld see only a reduced set of environment
>>>>> variables. Specifically, they miss CUDA_VISIBLE_DEVICES, assigned
>>>>> by the GRes plugin, as well as any user-defined environment
>>>>> variables.
>>>>>
>>>>>
>>>>> Is there an easy workaround? Will I have to patch the current GRes
>>>>> plugin or tinker with a new SPANK plugin?
>>>>>
>>>>> Any help is welcome!
>>>>>
>>>>> Regards,
>>>>> Albert
>>>
>>> -- 
>>> ---------------------------------
>>>   Dr. Albert Solernou
>>>   Research Associate
>>>   Oxford Supercomputing Centre,
>>>   University of Oxford
>>>   Tel: +44 (0)1865 610631
>>> ---------------------------------
>>
>>
> 

