We've been using a backfill priority partition for people doing HTC work. Requeue is set on it, so jobs from the higher-priority partitions can preempt the backfill jobs and take over the nodes.

You can do this for your interactive nodes as well if you want. We dedicate hardware to interactive work and use partition-based QoS limits to cap usage.
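
For anyone who wants to try this, here is a rough sketch of what such a setup might look like in slurm.conf. It is not our actual configuration: the partition names, node ranges, and the QOS name are made up for illustration, and the QOS would have to be created separately with sacctmgr before the partition can reference it.

```
# slurm.conf fragment (illustrative sketch; all names and node lists are hypothetical)

# Let jobs in higher-PriorityTier partitions preempt jobs in lower ones,
# and requeue the preempted jobs instead of cancelling them.
PreemptType=preempt/partition_prio
PreemptMode=REQUEUE
JobRequeue=1

# High-priority partition: its jobs are never preempted.
PartitionName=priority    Nodes=node[01-10] PriorityTier=10 PreemptMode=OFF

# Low-priority backfill partition for HTC work on the same nodes:
# its jobs are requeued when the priority partition needs the hardware.
PartitionName=htc_requeue Nodes=node[01-10] PriorityTier=1  PreemptMode=REQUEUE

# Dedicated interactive hardware, with usage capped by a partition QOS
# (the QOS "interactive_limits" is an assumed name).
PartitionName=interactive Nodes=node[11-12] PriorityTier=10 QOS=interactive_limits
```

Requeued jobs go back into the queue and start over, so this works best for short or checkpointable jobs.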

-Paul Edmon-


On 05/08/2018 10:08 AM, Renfro, Michael wrote:
That’s the first limit I placed on our cluster, and it has generally worked out 
well (never used a job limit). A single account can get 1000 CPU-days in 
whatever distribution they want. I’ve just added a root-only ‘expedited’ QOS 
for times when the cluster is mostly idle, but a few users have jobs that run 
past the TRES limit. But I really like the idea of a preemptable QOS that the 
users can put their extra jobs into on their own.


