SLURM Devs,
This is probably a FAQ whose answer is "nope", but my search-fu has
failed me. We recently ran into a scheduling question, which I'll
describe as a generic thought experiment because I don't want to have
to remember all the details of the real names of our QOSes, etc.
Namely, on our cluster, let's say we have three ways to run:
1. --partition=limited
2. --qos=high
3. Default
Number one is a partition that only a few users can submit to. It is a
dedicated chunk of the cluster, but one can only run 3 jobs in it at a
time.
Number two is a QOS with a high priority in the general (default)
partition of the machine. This might have a limit on the number of jobs
(let's say 6, though I don't know if there actually is a limit) so
people don't abuse it.
Number three is when you just sbatch and get whatever the default is.
Obviously, #1 is the gold standard (run there until you hit the limit),
#2 is the next best, and #3 is the least attractive.
Now, we have a situation where an experiment needs to run, say, 12 jobs
that take 3 hours each. If we had our druthers, we'd submit all 12 to #1
and all 12 would launch at once. Can't do that. You only get 3 in. So
now go to #2, and you only get 6 in (assuming the general cluster
partition isn't full). If you hit the limit in #2, then fall over to #3.
I think you get what I want. I'd love to have a single sbatch call that
says:
Take this job and submit such that it runs under #1, #2, #3, and
whatever can take it first wins.
In our case, I can see 3 perhaps getting in right away into #1, a few
more a bit later in #2 and then the next ones maybe when #1 is free
again, or perhaps #3... I know that --constraint has a nice OR operator,
but I'm not sure anything else does.
Now, one way we can think to do this (since I don't know if you can do
the above) is to submit 12 jobs to *each* queue-config possibility and
then underneath, have a lockfile-managed script that holds a MasterList
of all the possible jobs. If someone manages to get an allocation, that
one pops a job off the MasterList, now there are 11 left, and so on.
Once the MasterList is empty (i.e., all jobs have run or are running),
you could then clean up all the queued jobs that will never run anything
useful (and if they do get an allocation, the empty MasterList would
just make them return the allocation immediately).
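
For concreteness, the "pop a job off the MasterList" step could look
roughly like the sketch below. This is only an illustration, not our
actual script: the function name pop_job and the one-job-per-line file
layout are made up for the example, and it assumes the MasterList lives
on a filesystem where flock() is honored across all the compute nodes
(which is not a given on every shared filesystem).

```python
import fcntl
import os


def pop_job(masterlist_path):
    """Atomically remove and return the first entry of the MasterList.

    Hypothetical helper: the MasterList is assumed to be a plain text
    file with one job description per line. Returns None when the list
    is empty, which tells the caller to give its allocation back.
    """
    with open(masterlist_path, "r+") as f:
        # Exclusive lock so two allocations never claim the same job.
        # Assumption: flock() works on the filesystem holding the file.
        fcntl.flock(f, fcntl.LOCK_EX)
        try:
            jobs = [line for line in f.read().splitlines() if line.strip()]
            if not jobs:
                return None
            first, rest = jobs[0], jobs[1:]
            # Rewrite the file in place with the remaining entries.
            f.seek(0)
            f.truncate()
            f.write("".join(job + "\n" for job in rest))
            f.flush()
            os.fsync(f.fileno())
            return first
        finally:
            fcntl.flock(f, fcntl.LOCK_UN)
```

Each of the 36 queued jobs would call pop_job() as its first step once
it gets an allocation, run whatever entry it receives, and simply exit
if it gets None back.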
We have experience with this lock and masterlist (for other purposes),
so we can do it, but as I said, it'd be nice if we could make one big
meta sbatch call, because it's nicer to have only 12 jobs in the queue
instead of 36 :)
Matt
--
Matt Thompson, SSAI, Sr Scientific Programmer/Analyst
NASA GSFC, Global Modeling and Assimilation Office
Code 610.1, 8800 Greenbelt Rd, Greenbelt, MD 20771
Phone: 301-614-6712 Fax: 301-614-6246
http://science.gsfc.nasa.gov/sed/bio/matthew.thompson