We use a setup with multiple overlapping partitions, as follows:
"Paying" customers get a high-priority allocation account, replenished
monthly, which can submit to the high-priority partition.
"Non-paying" customers can apply (to a faculty committee) for allocation
accounts which can submit to the standard (standard-priority) partition.
("Paying" customers also get a standard-priority allocation account with
which they can effectively "borrow" from future/past months within the
quarter at standard priority.)
And finally, all customers can submit (without charges to any allocation
account) to a scavenger partition. Jobs in this partition have no maximum
walltime, but are subject to preemption by jobs in the standard or
high-priority partitions. This partition also has the lowest priority.
We use QOSes to further prioritize jobs based on width and walltime.
The "paying" customers also have access to some QOSes with
longer walltime limits.
For the most part, all of the above partitions consist of the same compute
nodes.
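In slurm.conf terms the skeleton looks roughly like the sketch below. The
node list, partition names, priority tiers, time limits, and AllowQos value
are all illustrative rather than our actual config; the point is just that
the partitions overlap on the same nodes and only scavenger is preemptable:

    # slurm.conf (sketch) - preemption driven by partition priority
    PreemptType=preempt/partition_prio
    PreemptMode=REQUEUE

    # highest tier, restricted to the "paying" QOS; cannot be preempted
    PartitionName=high      Nodes=node[001-100] PriorityTier=10 MaxTime=14-00:00:00 AllowQos=high PreemptMode=OFF
    # standard tier for anyone with an allocation; cannot be preempted
    PartitionName=standard  Nodes=node[001-100] PriorityTier=5  MaxTime=7-00:00:00  Default=YES   PreemptMode=OFF
    # lowest tier, no walltime cap, requeued when the higher tiers need the nodes
    PartitionName=scavenger Nodes=node[001-100] PriorityTier=1  MaxTime=UNLIMITED   PreemptMode=REQUEUE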
So jobs of "paying" customers can cut ahead of jobs of "non-paying" and
"scavenger"
in the queue, and will bump "scavenger" jobs. They will have to wait for
standard
priority jobs that are already running, but these have stricter walltime limits
(although can still run for several days). Jobs on the standard priority partition
will cut ahead of scavenger jobs in the queue, and bump running scavenger jobs
if
needed. And those that can live with being preempted, etc. can submit scavenger
jobs to use up any spare CPU cycles w/out having their allocations charged.
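As a concrete (again illustrative, not verbatim) example of the QOS and
scavenger pieces: the longer-walltime QOSes are ordinary sacctmgr objects
that only the "paying" accounts/users are given access to, and scavenger
usage is just an sbatch against that partition. The QOS name, limits, and
user name below are made up:

    # define a longer-walltime QOS and hand it to a paying user
    sacctmgr add qos long
    sacctmgr modify qos long set MaxWall=14-00:00:00 Priority=100
    sacctmgr modify user someuser set qos+=long

    # preemptable scavenger job; requeued if it gets bumped
    sbatch --partition=scavenger --requeue myjob.sh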
On Mon, 25 Jul 2016, Christopher Samuel wrote:
On 26/07/16 06:48, Joshua Baker-LePain wrote:
I think that my initial question was too complex/detailed. Let me ask a
more open-ended one. Do folks have any strategies they'd like to share
on partition setups that favor paying customers while also allowing for
usage of spare resources by non-paying users? Thanks!
Not yet, but it's something I'm trying to grapple with at the moment.
For now the dedicated nodes are, well, dedicated to those people.
However, when they're back from their travels I want to talk about
having an overlapping partition across all nodes for short running jobs.
That partition will have a lower priority and so should only keep their
nodes busy when they're not using them. The upper time bound on that
partition will necessarily be the longest that they are willing to wait
for a job to start (and could possibly be changed on the fly).
If we can arrange that then we can arrange for all jobs to be submitted
to all partitions (Slurm will prune any forbidden ones for us) and so
(hopefully) everyone will win.
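Roughly (partition names and limits here are placeholders): jobs would list
both partitions and Slurm picks whichever can start them first,

    sbatch --partition=owners,short --time=2:00:00 myjob.sh

and the short partition's cap could be adjusted on the fly with something
like "scontrol update PartitionName=short MaxTime=4:00:00".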
cheers,
Chris
--
Christopher Samuel Senior Systems Administrator
VLSCI - Victorian Life Sciences Computation Initiative
Email: sam...@unimelb.edu.au Phone: +61 (0)3 903 55545
http://www.vlsci.org.au/ http://twitter.com/vlsci
Tom Payerle
IT-ETI-EUS paye...@umd.edu
4254 Stadium Dr (301) 405-6135
University of Maryland
College Park, MD 20742-4111