Well, I'm almost there, but I would appreciate one last tip. I was (partially) wrong in my previous explanation of why I was not getting CUDA_VISIBLE_DEVICES at prolog time. If I submit a job asking for 2 nodes with 1 GPU card each, the variable turns out to be defined on node 2 (again at prolog time), but not on node 1.
This suggests to me that some "build_env" step is run only if it is a "new node", which is something I remember seeing while browsing the code these days. Does any of you remember where that was?

Thanks in advance,
Albert

On 25/11/13 20:01, Albert Solernou wrote:
>
> Hi all,
> I may need some extra help.
>
> I successfully modified req.c to pass to the "prolog environment" a
> user environment variable that I defined, say CUDA_SET_COMPUTE_MODE.
> However, I am still missing CUDA_VISIBLE_DEVICES.
>
> When slurmd goes through _rpc_batch_job it runs the prolog. However,
> CUDA_VISIBLE_DEVICES is not there yet (the slurm_msg_t that the
> function handles does not have this variable within the
> req->environment). It is only later, when slurmd passes through
> _rpc_launch_tasks, that $CUDA_VISIBLE_DEVICES is set (in its req->env),
> but by then it is too late.
>
> Could you give me some hints on how to get CUDA_VISIBLE_DEVICES in
> req.c:_rpc_batch_job? That would definitely speed things up.
>
> Thanks in advance,
> Albert
>
> On Wed 20 Nov 2013 14:15:12 GMT, Albert Solernou wrote:
>>
>> Thanks for the quick answer, Moe.
>>
>> I'll try that and let you know.
>>
>> Best,
>> Albert
>>
>> On Wed 20 Nov 2013 14:09:12 GMT, [email protected] wrote:
>>>
>>> Your easiest option would be to modify the Slurm code to export
>>> whatever additional environment variables that you want, which should
>>> be pretty simple. See the function _build_env() in
>>> src/slurmd/slurmd/req.c. If you make changes and send us the patch, we
>>> can include it in the canonical code base.
>>>
>>> Moe Jette
>>> SchedMD LLC
>>>
>>> On 2013-11-20 05:05, Albert Solernou wrote:
>>>> Hi,
>>>> I'd like to write a prolog script that changes the GPU compute mode of
>>>> the allocated GPU card(s). This change can only be done by root. My
>>>> initial idea was that the prolog script would use an environment variable
>>>> as a switch.
>>>>
>>>> The problem that I face is:
>>>> - prolog or prologctld have a reduced set of environment variables.
>>>> Specifically, they miss "CUDA_VISIBLE_DEVICES" assigned by the GRes
>>>> plugin, as well as any user environment flag.
>>>>
>>>>
>>>> Is there an easy workaround? Will I have to patch the current GRes
>>>> plugin or to tinker with a new spank plugin?
>>>>
>>>> Any help is welcome!
>>>>
>>>> Regards,
>>>> Albert
>>
>> --
>> ---------------------------------
>> Dr. Albert Solernou
>> Research Associate
>> Oxford Supercomputing Centre,
>> University of Oxford
>> Tel: +44 (0)1865 610631
>> ---------------------------------

--
---------------------------------
Dr. Albert Solernou
Research Associate
Oxford Supercomputing Centre,
University of Oxford
Tel: +44 (0)1865 610631
---------------------------------
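
For reference, here is a minimal sketch of the prolog logic discussed in this thread, written in Python. It is not code from Slurm and it makes several assumptions: that both CUDA_SET_COMPUTE_MODE (the user-defined switch) and CUDA_VISIBLE_DEVICES actually reach the prolog environment, which is exactly what the patch to req.c is meant to achieve; that CUDA_SET_COMPUTE_MODE carries a compute mode name; and that CUDA_VISIBLE_DEVICES is a comma-separated list of numeric device indices. The function names are hypothetical. The mode is applied with `nvidia-smi -i <index> -c <mode>`, which must run as root.

```python
import subprocess

# Compute mode names accepted by `nvidia-smi -c`.
# Assumption: the user passes one of these names in CUDA_SET_COMPUTE_MODE.
VALID_MODES = {"DEFAULT", "EXCLUSIVE_PROCESS", "PROHIBITED"}


def compute_mode_commands(env):
    """Build the nvidia-smi command lines for the GPUs allocated to the job.

    `env` is a dict-like environment (for example os.environ).
    Returns an empty list when the switch is unset or invalid, or when
    CUDA_VISIBLE_DEVICES is missing, so the prolog does nothing in that case.
    """
    mode = env.get("CUDA_SET_COMPUTE_MODE", "").strip().upper()
    devices = env.get("CUDA_VISIBLE_DEVICES", "").strip()
    if mode not in VALID_MODES or not devices:
        return []

    commands = []
    for dev in devices.split(","):
        dev = dev.strip()
        # Assumption: devices are plain numeric indices; anything else is skipped.
        if dev.isdigit():
            commands.append(["nvidia-smi", "-i", dev, "-c", mode])
    return commands


def apply_compute_mode(env):
    """Run the commands; raises CalledProcessError if nvidia-smi fails."""
    for cmd in compute_mode_commands(env):
        subprocess.check_call(cmd)
```

A prolog script would call `apply_compute_mode(os.environ)` after `import os`. Returning an empty command list when CUDA_VISIBLE_DEVICES is absent means that a node where the variable is not defined at prolog time (node 1 in the case above) is silently left unchanged, so logging that case may be worthwhile while debugging.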
