Howdy,
I'm wondering if anyone in the SGE community has any tips on how to accomplish
the following on SGE 6.2u5p2.
We would like to improve the balance between the competing application
profiles currently active on the cluster. In particular, we need to balance
queue wait times between jobs that require many cores via MPI and those that
use only a single core.
Currently, MPI and SMP jobs tend to wait longer than single-slot serial jobs
(the wait time goes up with the number of slots requested). We have a fair
share policy in place, but even with that, serial users starve out MPI users
who request a large number of slots, especially large-memory MPI jobs (e.g. 64
slots at 13GB per slot).
We currently require users to request h_rt and vf as mandatory resource
requests. Based on our analysis of qacct, the vast majority of cluster jobs are
serial, run in less than an hour, and use less than 1GB of RAM per core.
This means that these jobs, once identified, can be used very effectively to
back-fill resource gaps left by larger MPI jobs.
We have a resource quota set in place to prevent slot oversubscription of the
compute nodes:
{
name slotcap
description Keep slots equal to processor cores for all exec hosts
enabled TRUE
limit hosts {*} to slots=$num_proc
}
Here's the proposal that was passed down to me and I'm looking for suggestions
on how to implement it:
* create a short.q to accept jobs with run times under 2 hours and
2G per slot memory.
This one is easy enough: create a queue, short.q, and change h_rt in the queue
definition from INFINITY to 02:00:00 (hh:mm:ss) and vf from INFINITY to 2G.
Limit the PE list to smp.
* create a largempi.q to accept 64-core, large-memory MPI jobs that
have a max runtime of 6 hours
Similar to above: largempi.q, with h_rt set to 06:00:00 (hh:mm:ss) and vf set
to 13G. Limit the PE list to the MPI PEs.
But how do I make sure that jobs request a minimum of 8 slots, to keep out
serial jobs (i.e. no PE requested) and small parallel jobs?
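
As a sanity check on the h_rt values for the two queues: SGE time values are
either plain seconds or hours:minutes:seconds, so a value like 00:02:00 reads
as two minutes, not two hours. A small Python sketch of the conversion (the
helper names sge_time and sge_seconds are made up for illustration, they are
not part of SGE):

```python
def sge_time(hours, minutes=0, seconds=0):
    """Format a duration as an hh:mm:ss string."""
    total = hours * 3600 + minutes * 60 + seconds
    h, rem = divmod(total, 3600)
    m, s = divmod(rem, 60)
    return f"{h:02d}:{m:02d}:{s:02d}"


def sge_seconds(spec):
    """Convert an hh:mm:ss string back to seconds."""
    h, m, s = (int(part) for part in spec.split(":"))
    return h * 3600 + m * 60 + s


print(sge_time(2))              # 02:00:00, the 2 hour limit for short.q
print(sge_time(6))              # 06:00:00, the 6 hour limit for largempi.q
print(sge_seconds("00:02:00"))  # 120, i.e. only two minutes
```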
* assign both queues to a common hardware pool that satisfies the
resource needs of both job types (in our case, we'll use 22 nodes that each
have 48GB and 12 slots)
Create a hostgroup containing the 22 compute nodes and assign that hostgroup to
both short.q and largempi.q using the "hostlist" attribute.
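
A rough back-of-the-envelope check of how the two job types fit this pool,
as a Python sketch. It assumes the full 48GB per node can be requested as vf
(in practice somewhat less will be free) and that slots are packed as tightly
as memory allows; the function names are only for illustration.

```python
import math

NODES = 22
SLOTS_PER_NODE = 12
MEM_PER_NODE_GB = 48


def slots_per_node(mem_per_slot_gb):
    """Slots of this memory size that fit on one node (core- and memory-limited)."""
    return min(SLOTS_PER_NODE, MEM_PER_NODE_GB // mem_per_slot_gb)


def nodes_needed(slots, mem_per_slot_gb):
    """Minimum number of nodes a job of this shape has to spread across."""
    per_node = slots_per_node(mem_per_slot_gb)
    if per_node == 0:
        raise ValueError("a single slot does not fit on a node")
    return math.ceil(slots / per_node)


print(NODES * SLOTS_PER_NODE)   # 264 slots in the pool
print(slots_per_node(13))       # 3 large-memory slots per node
print(nodes_needed(64, 13))     # 22 nodes for one 64-slot, 13G-per-slot job
print(slots_per_node(2))        # 12 short.q slots per node at 2G each
```

Under these assumptions memory, not cores, is the limit for the large jobs: a
node holds only 3 slots at 13G, so a single 64-slot job has to touch all 22
nodes, and any node already filled with short jobs cannot host part of it.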
* set a user limit of 100 slots in short.q to prevent a single user
from taking over the queue
Create an RQS for this:
{
name short_queue_limits
description Limit max slots for short.q
enabled TRUE
limit users {*} queues short.q to slots=100
}
Now, what am I missing to have jobs submitted to largempi.q get priority, and
to ensure that the serial jobs won't squeeze out the large-memory parallel
jobs?
Thanks,
Mike