Hi,

On 08.05.2014 at 14:32, Roberto Nunnari wrote:

> On 06.05.2014 at 21:58, Reuti wrote:
>> Hi,
>> 
>> On 06.05.2014 at 18:45, Roberto Nunnari wrote:
>> 
>>> I'm running a small cluster using Oracle Grid Engine 6.2u7
>>> 
>>> At times it happens that one user submits a job that requires several 
>>> resources (-pe, -l mem_free, etc).
>>> 
>>> For instance, user A submits a job X requiring 32 slots out of 100 
>>> available.
>>> The other users keep submitting serial jobs, filling up all the slots and 
>>> always having more jobs waiting in the queue.
>>> 
>>> The serial jobs get ahead of job X and are scheduled as soon as one 
>>> slot is available, so job X waits in the queue indefinitely and never 
>>> gets to run until no more serial jobs are submitted and 32 slots are 
>>> available.
>>> 
>>> I would like the scheduler to also consider how long the job has been 
>>> waiting in the queue, and possibly also each user's historical 
>>> resource usage, as returned by qacct -o username
>>> 
>>> What are the possible solutions to solve this problem?
>> 
>> You can also look into Resource Reservation, so that the parallel job 
>> collects the necessary resources while waiting:
> 
> That looks more promising. I guess the user has to know that he has to use 
> the -R y flag, right?

Correct. Hence the idea to use a JSV to attach it to parallel jobs only in 
your setup. Besides this, "-R y" can also be used to collect the required 
memory for a serial job, as it could face the same issue while waiting for a 
large amount of memory to become free.
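
To illustrate the JSV idea, here is a minimal sketch in Python. It assumes the
line-based JSV protocol as described in the jsv(1) man page (START, PARAM and
BEGIN coming in; STARTED, PARAM and RESULT going out) and the parameter names
pe_name and R. The function names verify, handle and main are made up for this
example, so please check it against your installation before relying on it.

```python
#!/usr/bin/env python3
"""Minimal JSV sketch: request a reservation (-R y) for parallel jobs only."""
import sys


def verify(params):
    """Return the reply lines for one job, given its submit parameters."""
    # Assumption: a job submitted with -pe carries a non-empty "pe_name".
    if params.get("pe_name") and params.get("R") != "y":
        return ["PARAM R y", "RESULT STATE CORRECTION reservation enabled"]
    return ["RESULT STATE ACCEPT"]


def handle(lines):
    """Yield the replies for a stream of incoming protocol lines."""
    params = {}
    for raw in lines:
        parts = raw.strip().split(None, 2)
        if not parts:
            continue
        command = parts[0]
        if command == "START":
            # A new job verification begins: forget the previous job.
            params = {}
            yield "STARTED"
        elif command == "PARAM" and len(parts) >= 2:
            params[parts[1]] = parts[2] if len(parts) == 3 else ""
        elif command == "BEGIN":
            # All parameters have been sent: decide and answer.
            yield from verify(params)
        elif command == "QUIT":
            return


def main():
    """Speak the protocol on stdin/stdout, flushing after every reply."""
    for reply in handle(sys.stdin):
        print(reply, flush=True)

# Call main() at the end of the file when installing this as the JSV script.
```

Serial jobs pass through unchanged, while any job with a parallel environment
gets "-R y" added. Such a script could then be activated either per submission
with the -jsv option of qsub or cluster-wide via jsv_url in the global
configuration (qconf -mconf).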


> What happens if I set max_reservation to 32 and the user submits a job (using 
> -R y) requiring 64 slots?

The max_reservation is not a limit on any requested resource, but the number 
of jobs for which reservations are considered. You can also have a look at 
`man sched_conf` about this entry for more details. If you face this issue only 
intermittently, you could start by setting it to a small value like 4 or so.

-- Reuti
_______________________________________________
users mailing list
[email protected]
https://gridengine.org/mailman/listinfo/users