I think that my initial question was too complex/detailed. Let me ask a
more open-ended one. Do folks have any strategies they'd like to share on
partition setups that favor paying customers while also allowing for usage
of spare resources by non-paying users? Thanks!
On Fri, 15 Jul 2016 at 3:56pm, Joshua Baker-LePain wrote:
We currently run a moderately sized (5000+ cores) cluster using SGE. We're
looking to move to Slurm and have a test setup, but I have some questions
about how best to implement/improve on our current setup.
Our setup is a co-op model. We have users who "own shares" of the cluster as
well as non-contributing users. We try to guarantee contributing users
access to their "share" of the cluster while also maximizing utilization via
the following setup:
o There are 3 queues on each node, and on each node each queue has a
number of slots equal to the number of real cores on the node (nodes
with hyperthreading have that feature turned on)
o Our "lab" queue is for contributing users. Jobs in this queue run
un-niced, and each lab has a number of slots in this queue equal to
their share of the cluster.
o Our "long" queue is for all users. Jobs in this queue run "nice -19".
o We also have a "short" queue for quick jobs. These jobs run at "nice
-10" and are limited to 30 minutes.
o We use np_load_avg on the queues to control oversubscription. A node
full of lab queue jobs will not launch jobs in the other queues.
However, a node full of long queue jobs can still launch lab queue
jobs, up until both lab and long queues on that node are full.
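
To make the oversubscription rule in the last bullet concrete, here is a toy
Python model of the per-node launch decision. It is only a sketch: the Node
class and can_launch method are hypothetical names made up for illustration,
it checks slot counts directly rather than evaluating np_load_avg thresholds
the way SGE really does, it assumes long jobs are blocked only once the lab
queue is completely full, and it leaves out the short queue.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """Hypothetical model of one node: each queue has `cores` slots."""
    cores: int
    lab_used: int = 0   # slots occupied by lab queue jobs
    long_used: int = 0  # slots occupied by long queue jobs

    def can_launch(self, queue: str) -> bool:
        """Return True if one more job from `queue` may start on this node."""
        lab_full = self.lab_used >= self.cores
        long_full = self.long_used >= self.cores
        if queue == "lab":
            # Lab jobs only need a free lab slot, even if long is full.
            return not lab_full
        if queue == "long":
            # Assumption: long jobs are held back once lab fills the node.
            return not lab_full and not long_full
        raise ValueError(f"unknown queue: {queue}")


# A node full of long jobs still accepts lab jobs (oversubscription).
busy_with_long = Node(cores=4, long_used=4)
print(busy_with_long.can_launch("lab"))   # True
print(busy_with_long.can_launch("long"))  # False

# A node full of lab jobs launches nothing else.
busy_with_lab = Node(cores=4, lab_used=4)
print(busy_with_lab.can_launch("long"))   # False
```

The asymmetry between the two branches is the behavior I would like to keep:
paying users can always get onto a node that is only busy with long jobs, but
not the other way around.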
As a starting point for our new setup, I'm trying to somewhat replicate this.
Is gang scheduling what I'm looking for? Do folks have issues with jobs
continually being suspended and resumed?
Any pointers or hints would be much appreciated. And feel free to ask for
clarification and/or tell me I'm completely on the wrong track. Thanks.
--
Joshua Baker-LePain
QB3 Shared Cluster Sysadmin
UCSF