I will just add that Slurm has wrappers for the common PBS user
commands, recognizes the #PBS options in the batch scripts, and can
set PBS environment variables as well.
With the appropriate Slurm plugins and packages, the conversion should be relatively transparent to users.
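A minimal sketch of what that compatibility can look like in practice (assuming Slurm's optional Torque wrapper package from contribs/torque is installed; the script contents here are invented for illustration):

```shell
# job.pbs -- an unmodified PBS batch script. sbatch scans for
# #PBS directives, so no rewrite of the script is required:
#
#   #!/bin/bash
#   #PBS -N myjob
#   #PBS -l walltime=01:00:00
#   ./run_simulation
#
# Submit with the familiar PBS command (a wrapper around sbatch):
qsub job.pbs

# ...or submit directly with sbatch, which honors the #PBS lines:
sbatch job.pbs

# Query with the wrapped PBS status command:
qstat
```

Users keep their existing scripts and muscle memory while the site migrates the scheduler underneath them.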
Moe Jette
SchedMD
Quoting Chris Read <[email protected]>:
After trying a few job managers and Slurm configurations, we've settled
on a single partition with a handful of QOS defined. We use the QOS
for time limits and for giving specific classes of jobs higher priority.
Two QOS get most of the jobs, and we use fairshare to keep everyone happy.
We have everything in a single group right now, but from what I can
see it should scale out nicely to multiple groups too.
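A layout of this shape might be sketched with sacctmgr roughly as follows (the QOS names, walltimes, and priority values here are made up for illustration, not the actual site config):

```shell
# A handful of QOS carrying the time limits and relative priorities
# (illustrative names/values):
sacctmgr add qos short  MaxWall=04:00:00   Priority=100
sacctmgr add qos long   MaxWall=72:00:00   Priority=10
sacctmgr add qos urgent MaxWall=04:00:00   Priority=1000

# One partition; jobs select a QOS at submit time, e.g.:
#   sbatch --qos=short job.sh

# Fairshare comes from the multifactor priority plugin, configured
# in slurm.conf along the lines of:
#   PriorityType=priority/multifactor
#   PriorityWeightFairshare=10000
```

The single-partition approach keeps scheduling policy in the accounting database (QOS and shares) rather than spread across many partition definitions.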
On the flip side, though, I have no experience at all with PBS...
On Thu, Jan 9, 2014 at 9:34 AM, Bill Wichser <[email protected]> wrote:
After years and years of PBS use, it is time to modernize. After speaking
with a few of the developers at SC13, we have decided to make the switch on
two new clusters soon to be deployed: rather than installing Torque/Moab on
them, we will attempt Slurm instead.
Naturally, things are quite different. I've managed to implement the
job_submit.lua script to emulate a routing queue similar to PBS keyed on job
request time.
But instead of simply converting my current setup as-is, maybe there is a
better way: for instance, a single partition with QOS defined for the various
time-limit lengths. Obviously there are many ways to skin the cat, the same
as with other resource managers/schedulers.
What I am hoping to find is some solid advice from you folks who are
running Slurm. I need fairshare for groups, limits for users, and caps on
the total number of jobs of a given length T, and that's about it for now.
While this could all be implemented in a variety of ways, is there something
I should be aware of in the overall layout to make this easier down the
road, or should I just continue to port things over from my years of doing PBS?
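One hedged sketch of how the accounting side of those requirements could be expressed (the account names, share values, and limits below are invented for illustration):

```shell
# Per-group fairshare via accounts (illustrative names/values):
sacctmgr add account physics Fairshare=40
sacctmgr add account chem    Fairshare=20

# Attach a user to a group's account:
sacctmgr add user bill Account=physics

# Limits on a group: total running jobs for the account, and a
# per-user cap within it:
sacctmgr modify account physics set GrpJobs=500 MaxJobs=100

# Capping "total jobs of length T" can be hung off a QOS that
# carries both the walltime and a group job limit:
sacctmgr add qos week MaxWall=7-00:00:00 GrpJobs=50
```

The general pattern is that partitions describe hardware, while accounts and QOS carry the policy (shares, limits, and time classes).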
Sincerely,
Bill Wichser