Re: [gridengine users] preventing certain jobs from being suspended (subordinated)

2019-09-05 Thread Reuti


> On 05.09.2019 at 13:57, Tina Friedrich wrote:
> 
> We had this problem lots, and I can't quite remember how I solved it - I 
> think it might've been either a JSV or a qsub wrapper that shoves all 
> GPU jobs into the superordinate queue.
> 
> Now that I'm thinking about this again - does the subordinate queue 
> setting accept 'queue@@hostgroup' syntax like everything else? Don't 
> remember if I ever tried that.

Yes, one can limit it to be available on certain machines only:

subordinate_list  NONE,[@intel2667v4=short]

-- Reuti


> Tina
> 
> On 04/09/2019 21:52, Reuti wrote:
>> 
>> On 04.09.2019 at 21:58, berg...@merctech.com wrote:
>> 
>>> Our SoGE (8.1.6) configuration has essentially two queues: one for "all"
>>> jobs and one for "short jobs". The all.q is subordinate to the short.q,
>>> and short jobs can suspend a job in the general queue. At the moment, the
>>> all.q has nodes with & without GPU resources (not ideal, not permanent,
>>> probably to be replaced in the future with multiple queues, but it's
>>> what we have now).
>>> 
>>> Our GPU jobs do not stop or free resources when suspended (OK, the CPU
>>> portion may respond correctly to SIGSTOP, but the GPU portion keeps
>>> running).
>>> 
>>> Is there any way, with our current number of queues, to exempt jobs
>>> using a GPU resource complex (-l gpu) from being suspended by short jobs?
>> 
>> Not that I'm aware of. Almost 10 years ago I had a similar idea:
>> 
>> https://arc.liv.ac.uk/trac/SGE/ticket/735
>> 
>> -- Reuti
>> 
>> ___
>> users mailing list
>> users@gridengine.org
>> https://gridengine.org/mailman/listinfo/users
>> 
> 



Re: [gridengine users] preventing certain jobs from being suspended (subordinated)

2019-09-05 Thread Tina Friedrich
We had this problem lots, and I can't quite remember how I solved it - I 
think it might've been either a JSV or a qsub wrapper that shoves all 
GPU jobs into the superordinate queue.

Now that I'm thinking about this again - does the subordinate queue 
setting accept 'queue@@hostgroup' syntax like everything else? Don't 
remember if I ever tried that.

Tina
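
For illustration only, the qsub-wrapper idea mentioned above might look roughly like the Python sketch below. It is not the wrapper that was actually used. The complex name "gpu" and the queue name "short.q" are taken from this thread; the function names and the rule of leaving jobs alone when they already carry an explicit -q are assumptions made for the example.

```python
GPU_COMPLEX = "gpu"        # complex name from the thread's "-l gpu"
TARGET_QUEUE = "short.q"   # superordinate queue named in the thread


def requests_gpu(args):
    """Return True if any '-l' option in args requests the GPU complex."""
    for i, arg in enumerate(args):
        if arg == "-l" and i + 1 < len(args):
            # A resource list looks like "h_rt=1:00:00,gpu=1" or just "gpu".
            for item in args[i + 1].split(","):
                if item.split("=", 1)[0].strip() == GPU_COMPLEX:
                    return True
    return False


def route_gpu_job(args):
    """Return the qsub arguments, adding '-q TARGET_QUEUE' for GPU jobs.

    Jobs that already name a queue with '-q' are left untouched
    (an assumption of this sketch, not a rule from the thread).
    """
    args = list(args)
    if requests_gpu(args) and "-q" not in args:
        return ["-q", TARGET_QUEUE] + args
    return args
```

A real wrapper would pass the result of route_gpu_job(sys.argv[1:]) on to the actual qsub binary. This sketch only inspects the command line: it does not see "#$ -l gpu" directives embedded in the job script, and it cannot tell qsub options from arguments meant for the job script itself. A JSV sees the fully parsed request and avoids both limitations.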

On 04/09/2019 21:52, Reuti wrote:
> 
> On 04.09.2019 at 21:58, berg...@merctech.com wrote:
> 
>> Our SoGE (8.1.6) configuration has essentially two queues: one for "all"
>> jobs and one for "short jobs". The all.q is subordinate to the short.q,
>> and short jobs can suspend a job in the general queue. At the moment, the
>> all.q has nodes with & without GPU resources (not ideal, not permanent,
>> probably to be replaced in the future with multiple queues, but it's
>> what we have now).
>>
>> Our GPU jobs do not stop or free resources when suspended (OK, the CPU
>> portion may respond correctly to SIGSTOP, but the GPU portion keeps
>> running).
>>
>> Is there any way, with our current number of queues, to exempt jobs
>> using a GPU resource complex (-l gpu) from being suspended by short jobs?
> 
> Not that I'm aware of. Almost 10 years ago I had a similar idea:
> 
> https://arc.liv.ac.uk/trac/SGE/ticket/735
> 
> -- Reuti
> 
> 
