"Wiegand, Paul" <wieg...@ist.ucf.edu> writes:

> But I'd like to add another queue (partition) that has a lower
> priority and is preemptable, which users can use even if the account
> has hit its limit for the month.

I'm not sure this is the "correct" way, but this is how we do it.  Note:
we are using a slightly old version of slurm (14.03), and newer/better
options might have arrived since then.

The trick is to put the limits on a qos, and not on the account.

Each of our accounts has an associated default qos, and the qos has
limits like GrpCPUs, GrpMem and GrpCPUMins.  For instance, an account
"math" would get its own qos "math" as the default qos.  (In some
cases, if several (sub)accounts should share the same resources,
perhaps with different fair shares, they get the same default qos.)
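
Roughly, the setup looks something like this (only a sketch: the
account name and the limit values are made up, and on 14.03 you would
use the Grp* options named above instead of GrpTRES*):

  # Create a per-account qos and put the group limits on it
  sacctmgr add qos math
  sacctmgr modify qos name=math set GrpTRES=cpu=512 GrpTRESMins=cpu=1000000

  # Make it the default (and allowed) qos for the account
  sacctmgr modify account name=math set QOS=math DefaultQOS=math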

Then we have a partition called "lowpri", and a qos "lowpri", without
any limits and with lower priority than the other qos'es.  All other
qos'es are allowed to preempt jobs in the "lowpri" qos.  When a user
wants to run a lowpri job, she specifies --qos=lowpri (in addition to
the usual --account).  In a job_submit plugin, we then automatically
set the partition to "lowpri".
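
The qos side of that is roughly (again just a sketch; the qos names
and priority values are examples, and preemption must also be enabled
in slurm.conf):

  # The lowpri qos: no limits, lowest priority, preemptable by the
  # ordinary qos'es
  sacctmgr add qos lowpri
  sacctmgr modify qos name=lowpri set Priority=0
  sacctmgr modify qos name=math set Priority=10 Preempt=lowpri

  # Allow the account to use the lowpri qos as well
  sacctmgr modify account name=math set QOS+=lowpri

  # slurm.conf: preemption is decided by qos
  PreemptType=preempt/qos
  PreemptMode=REQUEUE

(PreemptMode can also be CANCEL or SUSPEND,GANG, depending on what you
want to happen to the preempted jobs.)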

Since the lowpri jobs are running in a different qos than the one which
has limits, they will not count when checking the usage against the
limit.
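
For the job_submit step mentioned above, here is a minimal sketch,
assuming the Lua variant of the plugin (job_submit/lua); if you use
the C plugin, the logic is the same:

  -- job_submit.lua: route --qos=lowpri jobs to the lowpri partition
  function slurm_job_submit(job_desc, part_list, submit_uid)
     if job_desc.qos == "lowpri" then
        job_desc.partition = "lowpri"
     end
     return slurm.SUCCESS
  end

  function slurm_job_modify(job_desc, job_rec, part_list, modify_uid)
     return slurm.SUCCESS
  end

  -- tell slurm the script loaded OK
  return slurm.SUCCESS

With that in place, the user just runs e.g.

  sbatch --account=math --qos=lowpri job.sh

and the job ends up in the lowpri partition, outside the qos that
carries the account's limits.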

(Actually, a separate partition is not strictly needed; the qos alone
is enough.  We have the partition because our lowpri jobs are allowed
to run on special nodes (like hugemem or accelerator nodes) that
normal jobs are not allowed to use.)
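
If you do want the separate partition, the slurm.conf side could look
something like this (node names are made up, and AllowQos may not
exist in a version as old as 14.03):

  # Only the lowpri qos may use the lowpri partition, which also
  # includes the special nodes
  PartitionName=normal  Nodes=c[001-100]                  Default=YES
  PartitionName=lowpri  Nodes=c[001-100],hugemem[01-04]   AllowQos=lowpri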

I hope this made sense to you. :)

-- 
Regards,
Bjørn-Helge Mevik, dr. scient,
Department for Research Computing, University of Oslo
