On 06/14/2013 01:33 PM, Alan V. Cowles wrote:
> Hey fellow Slurm users:
>
> I have just a quick question to see what other Slurm users are using for
> their queue size. We had the default queue of 10,000 jobs in place, and
> a user submitted 18,000, which has basically made the entire system
> unusable for anybody else for the past 3 days, as many of her jobs run
> for 24+ hours. We plan to mitigate this by increasing the queue size so
> that no "reasonable" number of jobs is refused, while enabling a
> fair-share priority model so that abusive users get trumped by
> non-abusive users when their jobs are queued. With our cluster being
> quite small, at 640 total slots, we didn't want to set the queue size
> to ~10,000,000 or something stupid... though that would likely solve
> our issue with jobs not getting into the queue. Even at the default
> value of 10,000 we are "oversubscribed" on queue vs. slots to the tune
> of 16 to 1. Thoughts, observations or suggestions from other SLURM
> users would be very much appreciated.
>
> AC

Hi Alan,

We use sacctmgr to set a limit of 7000 jobs for each user.
This is the "MaxSubmitJobs" limit.
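
For reference, a per-user submit limit along those lines can be set
with something like the following (the user name is just a placeholder):

    sacctmgr modify user where name=someuser set MaxSubmitJobs=7000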

We allow a total of 100000 jobs: MaxJobCount=100000
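
That one lives in slurm.conf:

    # slurm.conf (slurmctld configuration)
    MaxJobCount=100000

(Changing it needs a restart of slurmctld, not just a reconfigure,
as far as I remember.)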

We do not use fairshare. Instead we use a simple FIFO algorithm,
together with a cron script that gives a few queued jobs for
each user a high-priority boost, and then increases that high
priority a little further each time it runs. (Queue-time
increments thus effectively start anew when a job is elevated
to the high-priority level: SLURM does not add any further
queue-time increments once you have set a priority with
scontrol, and for this purpose that is a good thing.)
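
A very stripped-down sketch of such a cron job could look like the
Python below. The baseline, increment and per-user count are made-up
numbers, and it leaves out the per-user core and core-hour caps
mentioned further down; it only shows the squeue/scontrol mechanics:

    #!/usr/bin/env python3
    # Toy version of the priority-boost cron job: pick a couple of
    # pending jobs per user and push them above a "high priority"
    # baseline; already-boosted jobs get a further small increment
    # on every run.
    import subprocess
    from collections import defaultdict

    BOOST_BASE = 50000     # made-up baseline for the high-priority level
    BOOST_STEP = 1000      # made-up extra increment added on each cron run
    JOBS_PER_USER = 2      # made-up number of boosted pending jobs per user

    # Pending jobs as "jobid user priority" lines.
    out = subprocess.check_output(
        ["squeue", "-h", "-t", "PENDING", "-o", "%i %u %Q"], text=True)

    pending = defaultdict(list)
    for line in out.splitlines():
        jobid, user, prio = line.split()
        pending[user].append((int(prio), jobid))

    for user, jobs in pending.items():
        # Keep boosting each user's few best-placed pending jobs.
        for prio, jobid in sorted(jobs, reverse=True)[:JOBS_PER_USER]:
            new_prio = max(prio, BOOST_BASE) + BOOST_STEP
            subprocess.check_call(
                ["scontrol", "update", "JobId=" + jobid,
                 "Priority=" + str(new_prio)])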

In that way our batch queue is a two-step thing: basically you
get priority according to how long you have queued, but most
jobs that get started have also received the high-priority
boost.

So when you have a user that submits 7000 jobs, only a few
of those at a time will have gone through the priority boost.
In this way no single user can monopolize the batch queue.

We give at least one job per user this high-priority level.
As long as the user's high-priority jobs do not add up to more
than 64 cores, and at the same time do not add up to more than
2688 core hours, we allow more of them to be elevated to the
high-priority level. Those limits work for us on systems with
a few thousand cores.
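
(To put the core-hour cap in perspective: 2688 core hours is, for
instance, 64 cores running for 42 hours, or 112 single-core jobs
with a 24-hour limit.)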

The cron script works in some ways like an external scheduler
(such as Maui or Moab), but is probably too slow for a cluster
with a very large number of nodes.

Best wishes,
-- Lennart Karlsson, UPPMAX, Uppsala University, Sweden
    http://uppmax.uu.se
