[slurm-dev] Re: slurm job priorities depending on the number of each user's jobs

Cooper, Trevor Wed, 07 Oct 2015 16:25:22 -0700

Chris (et al.),

This is what we tried, but we found that it doesn't cope well with the case 
where one (or more) of the bf_max_job_user jobs being considered is an array job.


In that case Slurm starts array tasks until the user hits a QOS-enforced 
resource limit (like GrpCPUs or MaxCPUsPerUser).

With array jobs it's still possible (within the resource limitations) to have 
one or more users consume more than the expected share of resources when 
considering a 'job' as the unit of measure.
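
To make the arithmetic concrete, here is a toy model of the effect described above. It is not Slurm's scheduler: the function name, the (tasks, cpus_per_task) tuples and the numbers are hypothetical, and it assumes that an array job counts as one job toward the per-user cap while each of its tasks consumes CPUs until a per-user CPU limit is reached.

```python
def cpus_started(jobs, max_jobs_per_user, cpu_limit):
    """Toy model: CPUs one user ends up holding after a single pass.

    jobs is a list of (tasks, cpus_per_task) tuples in queue order.
    A plain job has tasks == 1; an array job has tasks > 1.
    Assumption: each entry counts once toward max_jobs_per_user.
    """
    used = 0
    for tasks, cpus_per_task in jobs[:max_jobs_per_user]:
        for _ in range(tasks):
            # Stop as soon as the next task would exceed the CPU limit.
            if used + cpus_per_task > cpu_limit:
                return used
            used += cpus_per_task
    return used


# User A queues five ordinary single-CPU jobs.
plain_user = cpus_started([(1, 1)] * 5, max_jobs_per_user=5, cpu_limit=256)

# User B queues five array jobs of 100 single-CPU tasks each.
array_user = cpus_started([(100, 1)] * 5, max_jobs_per_user=5, cpu_limit=256)

print(plain_user, array_user)
```

Under these assumptions both users are within the same five-job cap, yet the array user is bounded only by the CPU limit.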

-- Trevor

> On Oct 6, 2015, at 8:08 PM, Christopher Samuel <[email protected]> wrote:
> 
> 
> On 06/10/15 00:18, Dr. Markus Stöhr wrote:
> 
>> If such a bunch of jobs has highest priority, nearly all of them might
>> start simultaneously, not allowing the start of jobs of other users.
> 
> Our solution to this is:
> 
> 1) all jobs go into backfill (defer)
> 2) backfill can only start 5 of each user's jobs at a time (bf_max_job_user=5)
> 3) go through the whole queue (bf_max_job_start=10000)
> 4) continue backfill where you left off (bf_continue)
> 5) we limit the number of cores an account can use on a cluster (grpcpus)
> 
> The first 4 are all SchedulerParameters in slurm.conf, the last
> is set on accounts via sacctmgr.
> 
> How's that?
> 
> All the best,
> Chris
> -- 
> Christopher Samuel        Senior Systems Administrator
> VLSCI - Victorian Life Sciences Computation Initiative
> Email: [email protected] Phone: +61 (0)3 903 55545
> http://www.vlsci.org.au/      http://twitter.com/vlsci
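
For reference, the five settings listed in the quoted message might be written roughly as follows. This is a sketch, not the poster's actual configuration: the account name and the core count are hypothetical placeholders, and the parameter values are simply the ones quoted above.

```
# slurm.conf: the four scheduler options from the quoted list
SchedulerParameters=defer,bf_max_job_user=5,bf_max_job_start=10000,bf_continue

# Association limits are only enforced if accounting enforcement includes limits
AccountingStorageEnforce=limits

# Run once as an administrator; "example_acct" and 512 are placeholders
sacctmgr modify account where name=example_acct set GrpCPUs=512
```

Newer Slurm releases express the same account limit through GrpTRES (for example GrpTRES=cpu=512), so check the sacctmgr man page for the installed version.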
