Hi all,
could this condition be "if (req->job_step_id == 0)" instead of
"if (first_job_run)" in "_rpc_launch_tasks"? Then the call to the prolog
in "_rpc_batch_job" could be removed.

Apparently this works OK for me, but I don't know whether it would affect
any abnormal job.

Could you confirm whether this would work in every case?

Best,
Albert

On 28/11/13 12:14, Albert Solernou wrote:
> 
> Hi Moe,
> Thanks for the reply. Moving the call to gres_plugin_job_set_env() is
> definitely non-trivial, however I imagine a possible workaround:
> 
>  _rpc_launch_tasks in req.c runs the prolog under the condition
>   "if (!slurm_cred_jobid_cached(conf->vctx, req->job_id))"
> (line 1073 in req.c v. 2.6.3).
> I understand that "_rpc_batch_job" runs the prolog on its own because,
> by the time "_rpc_launch_tasks" is reached, this condition evaluates to
> false. If that is the case, and there is no other reason to run the
> "_rpc_batch_job" prolog earlier, then we'd be able to find a new
> condition so that the prolog is only called from "_rpc_launch_tasks".
> 
> Could you confirm my assumption about the prolog calls? Do you have any
> suggestion on how this new condition should be formulated?
> 
> Thanks,
> Albert
> 
> 
> 
> On 27/11/13 23:08, Moe Jette wrote:
>>
>> This is definitely a non-trivial change. The call to the function
>> gres_plugin_job_set_env() would need to be moved from the slurmstepd
>> process to the slurmd daemon (before the prolog runs) and then that
>> environment variable would need to be passed to the prolog.
>>
>> Moe Jette
>> SchedMD LLC
>>
>> Quoting Albert Solernou <[email protected]>:
>>
>>>
>>> Hi all,
>>> I may need some extra help.
>>>
>>> I successfully modified req.c to pass to the "prolog environment" a
>>> user environment variable that I defined, say CUDA_SET_COMPUTE_MODE.
>>> However, I am still missing CUDA_VISIBLE_DEVICES.
>>>
>>> When slurmd goes through _rpc_batch_job it runs the prolog. However,
>>> CUDA_VISIBLE_DEVICES is not there yet (the slurm_msg_t that the
>>> function handles does not have this variable within
>>> req->environment). It is only later, when slurmd passes through
>>> _rpc_launch_tasks, that CUDA_VISIBLE_DEVICES is set (in its
>>> req->env), but by then it is too late.
>>>
>>> Could you give me some hints on how to get CUDA_VISIBLE_DEVICES in
>>> req.c:_rpc_batch_job? That would definitely speed things up.
>>>
>>> Thanks in advance,
>>> Albert
>>>
>>>
>>>
>>>
>>> On Wed 20 Nov 2013 14:15:12 GMT, Albert Solernou wrote:
>>>>
>>>> Thanks for the quick answer, Moe.
>>>>
>>>> I'll try that and let you know.
>>>>
>>>> Best,
>>>> Albert
>>>>
>>>> On Wed 20 Nov 2013 14:09:12 GMT, [email protected] wrote:
>>>>>
>>>>> Your easiest option would be to modify the Slurm code to export
>>>>> whatever additional environment variables that you want, which should
>>>>> be pretty simple. See the function _build_env() in
>>>>> src/slurmd/slurmd/req.c. If you make changes and send us the patch, we
>>>>> can include it in the canonical code base.
>>>>>
>>>>> Moe Jette
>>>>> SchedMD LLC
>>>>>
>>>>> On 2013-11-20 05:05, Albert Solernou wrote:
>>>>>> Hi,
>>>>>> I'd like to write a prolog script that changes the GPU compute
>>>>>> mode of the allocated GPU card(s). This change can only be done by
>>>>>> root. My initial idea was that the prolog script would use an
>>>>>> environment variable as a switch.
>>>>>>
>>>>>> The problems that I face are:
>>>>>>  - prolog or prologctld have a reduced set of environment
>>>>>> variables. Specifically, they miss "CUDA_VISIBLE_DEVICES",
>>>>>> assigned by the GRes plugin, as well as any user environment
>>>>>> variable.
>>>>>>
>>>>>>
>>>>>> Is there an easy workaround? Will I have to patch the current GRes
>>>>>> plugin or tinker with a new SPANK plugin?
>>>>>>
>>>>>> Any help is welcome!
>>>>>>
>>>>>> Regards,
>>>>>> Albert
>>>>
>>>
>>>
>>
> 

-- 
---------------------------------
  Dr. Albert Solernou
  Research Associate
  Oxford Supercomputing Centre,
  University of Oxford
  Tel: +44 (0)1865 610631
---------------------------------
