We run a render farm that submits many array jobs, each with a large
number of tasks.  There seem to be plenty of ways to throttle array
jobs and share resources across users, but I'm looking for a way to
guarantee a minimum allocation rather than cap a maximum.

An example:

10 machines with X slots each.
15 users (a,b,c,d,...o) each submit an array job with 1000 tasks, and
every task requires X slots, i.e. a whole machine.  We'll assume these
array jobs are submitted close enough together that they all overlap.

At the moment, the fair share/functional policy interleaves everyone's
tasks, so 'a' might have tasks 1-300 run, then sit waiting for a long
time before 301-600 run, and then perhaps wait again before the
remaining tasks run.

The behaviour we're looking for is:

Once an array job starts running, it should not stop running until
it has finished: at least one task from that array job should be
running at any given time.  So in the example above, the 11th array
job to start (determined by priority) would not begin until one of the
first 10 had finished.  Each running job would be guaranteed at least
X slots at any given time (X being whatever the PE submission option
requests).
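
To make the desired behaviour concrete, here is a rough toy model in
Python.  It is not Grid Engine code or configuration, only a sketch of
the policy I'm after.  The function name simulate and the assumptions
that every task takes exactly one time step and occupies one whole
machine are invented for illustration.

```python
from collections import deque


def simulate(num_machines, num_jobs, tasks_per_job):
    """Toy model of the desired policy (assumption: every task runs for
    exactly one time step and occupies one whole machine).

    Returns {job: (first_step, last_step)} for each array job.
    """
    pending = deque(range(num_jobs))  # jobs not yet started, priority order
    remaining = {}                    # active job -> tasks still to run
    spans = {}                        # job -> [first_step, last_step]
    step = 0
    while pending or remaining:
        # Admit a new job only while every active job can keep a machine.
        while pending and len(remaining) < num_machines:
            job = pending.popleft()
            remaining[job] = tasks_per_job
            spans[job] = [step, None]

        # Guaranteed minimum: one machine for each active job.
        alloc = {job: 1 for job in remaining}

        # Any spare machines go to the active jobs in priority order.
        spare = num_machines - len(alloc)
        for job in sorted(remaining):
            extra = min(spare, remaining[job] - alloc[job])
            alloc[job] += extra
            spare -= extra

        # Run one time step and retire jobs that have no tasks left.
        for job, count in alloc.items():
            remaining[job] -= count
            if remaining[job] == 0:
                spans[job][1] = step
                del remaining[job]
        step += 1
    return {job: tuple(span) for job, span in spans.items()}


if __name__ == "__main__":
    # The example from this post: 10 machines, 15 users, 1000 tasks each.
    for job, (first, last) in simulate(10, 15, 1000).items():
        print(f"job {job:2d}: steps {first}..{last}")
```

In this model the first 10 jobs all start at step 0 and each always has
a task running, while jobs 11 to 15 are held back until a machine can
be dedicated to them.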

When not using array jobs, you can just use the FIFO scheduler, but I
haven't yet found a way to achieve the same thing with array tasks.

If anyone's got any ideas it'd be much appreciated,

-- 
Stephen

http://lensframephoto.com