On Tue, 2016-06-07 at 05:43:19 -0700, Steffen Grunewald wrote:
>
> Good afternoon,
>
> I'm looking for a (simple) set of preemption rules for the following planned
> setup:
>
> - three partitions: "urgent", "normal", "cycles", all covering the same set
> of nodes
> - "cycles" p. runs at Priority=1, jobs can be preempted (CANCEL or REQUEUE)
> - "normal" p. at Priority=10, can*not* preempt jobs, jobs can*not* be
> preempted
> - "urgent" p. AllowAccounts=gods Priority=100 can preempt "cycles" jobs *only*
>
> Which (global) PreemptType and (individual) PreemptMode settings would
> provide this?
> I'm running into conflicts endlessly... Anyone running a similar setup and
> willing to share a couple of slurm.conf lines?
I found something (on hpckp.org) that works (almost) completely outside of
slurm.conf, involving QOS and PreemptType=preempt/qos:
# sacctmgr show qos format=Name,Preempt,PreemptMode
      Name    Preempt PreemptMode
---------- ---------- -----------
    normal                cluster
    cycles                cluster
    urgent     cycles      cancel
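For reference, a minimal sketch of how that QOS table and the matching
slurm.conf lines might fit together. The node list node[01-16], the use of
AllowQos to tie each partition to its QOS, and Default=YES are my assumptions
for illustration only; I have not tested this exact combination.

```
# slurm.conf fragment (sketch; node names are placeholders)
PreemptType=preempt/qos
PreemptMode=CANCEL     # cluster-wide default, used by QOS with PreemptMode=cluster

PartitionName=cycles Nodes=node[01-16] Priority=1   AllowQos=cycles
PartitionName=normal Nodes=node[01-16] Priority=10  AllowQos=normal Default=YES
PartitionName=urgent Nodes=node[01-16] Priority=100 AllowQos=urgent AllowAccounts=gods

# QOS side, done once with sacctmgr (the "normal" QOS exists by default):
#   sacctmgr add qos cycles
#   sacctmgr add qos urgent
#   sacctmgr modify qos urgent set Preempt=cycles PreemptMode=cancel
```

Since "normal" appears in no QOS's Preempt list and has an empty one itself,
it should neither preempt nor be preempted.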
I hope that this will keep the priorities in order and cancel only the right
jobs. The accounting also needs more detail: how do I keep the "gods" from
running all their stuff in the "urgent" partition?
> As a bonus: since "normal" jobs are expected to spawn continuation jobs at
> the end of their run-time, would it be possible to delay jobs to be run in
> the "cycles" partition until nodes have been idle for more than 1-5 minutes?
> (Dynamically set the partition state to INACTIVE? How?)
Is there a way to access the timestamp of a node's latest state change, or the
age of the current state (!=DRAIN)?
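
On the "How?": a partition's state can be changed at run time with
"scontrol update PartitionName=... State=...". Below is a rough sketch of the
decision logic only. The function name desired_state, the 120-second
threshold, and the idea of an external poller recording idle_since are all
hypothetical; where that timestamp would come from is exactly the open
question above.

```shell
#!/bin/sh
# Hypothetical helper: pick the state for the "cycles" partition.
#   $1  idle_since  epoch seconds since which the nodes have been idle
#                   (0 = nodes are currently busy)
#   $2  now         current time in epoch seconds
#   $3  min_idle    required idle time in seconds
desired_state() {
    idle_since=$1
    now=$2
    min_idle=$3
    if [ "$idle_since" -gt 0 ] && [ $((now - idle_since)) -ge "$min_idle" ]; then
        echo UP
    else
        echo INACTIVE
    fi
}

# Applying the result (needs admin rights; left commented out on purpose):
# scontrol update PartitionName=cycles \
#     State="$(desired_state "$idle_since" "$(date +%s)" 120)"
```

If I read the slurm.conf man page correctly, INACTIVE also rejects new
submissions to the partition, so State=DOWN (jobs may be queued but are not
started) might be closer to what is wanted here.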
- S