On 22.12.2017 at 23:55, [email protected] wrote:
> In the message dated: Thu, 21 Dec 2017 23:17:52 +0100,
> The pithy ruminations from Reuti on
> <Re: [gridengine users] resource types -- changing BOOL to INT but keeping
> qsub unchanged> were:
> => Hi,
> =>
> => On 21.12.2017 at 22:46, [email protected] wrote:
> =>
> => > I'm considering changing "gpu" to an INT (set to the number of GPUs/node),
> => > making it a consumable resource, and updating our JSV (in perl) so that
> => > if the job is submitted as
> => >
> => >     qsub -l gpu foobar
> => >
> => > it will be altered to the equivalent of
> => >
> => >     qsub -l gpu=1 foobar
> => >
> => > to keep things easy for users.
> => >
> => > Any suggestions about this plan?
> =>
> => Even with "-w n" you will face a "missing value for request", I fear, as
> => it's AFAIK checked before the JSV is called*. In the past I had the idea
> => of changing the default value for an integer request without a number to
> => one (it's quite easy to find in the source where the BOOL without a value
> => is expanded), but it was denied.
> =>
>
> Well, I tried the changes:
>
>     qconf -sc | grep gpu
>     gpu    cuda    INT    <=    YES    JOB    0    1000
>
> And submitted a job:
>
>     qsub -l gpu ./smi.qsub
>
> And it seems to have been accepted by qsub (note the change to "gpu=1" from
> our JSV):
>
>     qstat -j 737215|grep gpu
>     hard resource_list:         gpu=1,h_vmem=4g,h_stack=256m
>
> Perhaps the "missing value for request" check only applies to certain
> SGE versions? I left out mentioning that we're running SoGE 8.1.6.

Aha, interesting. This might have been changed in SoGE. As you have this running: can you please output the value of gpu before you assigned a value? Was it just 0, or already set to 1 as default?

-- Reuti

> => But: do you need to know which GPU will be used? Univa GE has a named
>
> Yeah, that was going to be another post.
>
> => resource. With SGE it might help to have one queue with one slot per GPU,
> => and from the name (i.e. suffix) of the granted queue name you know which
> => GPU you have to use.
>
> True, but even with that info, there doesn't seem to be any universal
> way to tell an arbitrary GPU job which GPU to use -- they all default
> to device 0.
>
> Our likely solution will be to install 1 GPU/node, except for a few nodes
> with multiple GPUs where any job requesting that node gets all GPUs,
> and the job is expected to manage the multiple devices.
>
> Thanks,
>
> Mark
>
> =>
> => -- Reuti
> =>
> => *) The "-w e" check will even be performed twice: once before the JSV
> => and once after. In my opinion this is not optimal, as it prevents
> => submitting a completely malformed request and putting things in order
> => inside the JSV. Sure, one problem is the fields which are fed to the JSV:
> => how to express a missing integer value (besides the IEEE ways like NaN
> => and the like)?
> =>
> => >
> => > Thanks,
> => >
> => > Mark
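
For readers following the thread, the rewrite Mark describes (a bare "gpu" request becoming "gpu=1") could be sketched roughly as below. This is not the site's Perl JSV and it does not speak the JSV protocol; it only illustrates the string handling on a comma-separated hard resource list like the one qstat prints. The function name `normalize_gpu_request` is hypothetical, and whether the bare request reaches a JSV as "gpu" or as "gpu=TRUE" is an assumption, so the sketch accepts both.

```python
def normalize_gpu_request(l_hard, name="gpu", default=1):
    """Rewrite a value-less (or boolean-style) request for `name`
    into an explicit integer request, leaving other entries alone.

    l_hard is assumed to be a comma-separated resource list such as
    "gpu,h_vmem=4g,h_stack=256m" (hypothetical input format).
    """
    rewritten = []
    for entry in l_hard.split(","):
        entry = entry.strip()
        if not entry:
            continue  # skip empty fragments such as a trailing comma
        # partition() yields value == "" when there is no "=" at all
        key, _sep, value = entry.partition("=")
        if key == name and value.lower() in ("", "true"):
            # Assumption: a missing or boolean value means "one GPU"
            entry = f"{name}={default}"
        rewritten.append(entry)
    return ",".join(rewritten)


if __name__ == "__main__":
    print(normalize_gpu_request("gpu,h_vmem=4g,h_stack=256m"))
```

An explicit count such as "gpu=2" passes through unchanged, so users who already request a number are not affected. In a real JSV the same logic would sit inside the verification callback and write the result back to the hard resource list.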
