We've been in production with SLURM 14.03.10 for a few weeks now and have
found our backfill configuration to be lacking.  Currently there are 1500
jobs in a pending state each requesting 4 CPUs and only 4 hours of
walltime.  We have approximately 1500 idle CPUs in total, and after looking
at the other pending jobs I cannot see why these jobs are not being
backfilled.  My theory is that our backfill limits are too low, so the same
jobs keep being reevaluated for backfill while the others are ignored
during each cycle.

All 1500 pending jobs belong to one user, and the reason shown in squeue is
(Resources), (Priority), or, for the vast majority, (None).  This is our
current SchedulerParameters:

SchedulerParameters=bf_max_job_user=35,bf_max_job_test=400,bf_interval=60,sched_interval=120,default_queue_depth=10,partition_job_depth=100,bf_window=7200,bf_resolution=1800,bf_continue,max_sched_time=4,defer,preempt_strict_order
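
To make the individual limits in that line easier to read, here is a minimal
Python sketch that splits the value into option/setting pairs.  The helper
name parse_scheduler_parameters is hypothetical and not part of Slurm; the
string is copied from the line above.  If bf_max_job_user behaves as the
slurm.conf man page describes, it caps how many of one user's jobs backfill
will attempt per cycle, so it is printed next to bf_max_job_test.

```python
def parse_scheduler_parameters(value):
    """Split a SchedulerParameters value into a dict of option -> setting.

    Hypothetical helper for illustration only; not a Slurm API.
    """
    params = {}
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue
        key, sep, val = item.partition("=")
        if not sep:
            # Bare flags such as bf_continue or defer carry no value.
            params[key] = True
        elif val.isdigit():
            params[key] = int(val)
        else:
            params[key] = val
    return params


# Value copied from the SchedulerParameters line in this post.
current = (
    "bf_max_job_user=35,bf_max_job_test=400,bf_interval=60,"
    "sched_interval=120,default_queue_depth=10,partition_job_depth=100,"
    "bf_window=7200,bf_resolution=1800,bf_continue,max_sched_time=4,"
    "defer,preempt_strict_order"
)

params = parse_scheduler_parameters(current)
for name in ("bf_max_job_user", "bf_max_job_test", "bf_interval"):
    print(f"{name:20s} {params[name]}")
```

With one user owning nearly all pending jobs, the per-user limit (35) is
much smaller than the overall test limit (400), which may be worth checking
first.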

My goal was to keep one user from overloading slurmctld with backfill
requests but to still allow efficient backfill when the cluster has idle
CPUs.  I had been using max_job_test=100 and noticed backfill was never
taking place so increased to 300.  This worked for about a week and now we
see no backfill taking place.  This is the sdiag output which unsure how to
turn into useful information:

$ sdiag
*******************************************************
sdiag output at Fri Nov 21 12:32:01 2014
Data since      Thu Nov 20 18:00:00 2014
*******************************************************
Server thread count: 3
Agent queue size:    0

Jobs submitted: 471
Jobs started:   338
Jobs completed: 320
Jobs canceled:  81
Jobs failed:    0

Main schedule statistics (microseconds):
        Last cycle:   77565
        Max cycle:    117189
        Total cycles: 907
        Mean cycle:   52777
        Mean depth cycle:  211
        Cycles per minute: 0
        Last queue length: 1898

Backfilling stats
        Total backfilled jobs (since last slurm start): 9529
        Total backfilled jobs (since last stats cycle start): 271
        Total cycles: 1108
        Last cycle when: Fri Nov 21 12:31:36 2014
        Last cycle: 502468
        Max cycle:  507446
        Mean cycle: 278070
        Last depth cycle: 1926
        Last depth cycle (try sched): 136
        Depth Mean: 1908
        Depth Mean (try depth): 96
        Last queue length: 1926
        Queue length mean: 1908
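
One way to turn the backfill numbers above into something readable is the
following Python sketch.  It assumes, per the "(microseconds)" label on the
main schedule section and the sdiag man page, that backfill cycle times are
also in microseconds, and that "try sched" counts only the jobs backfill
actually attempted to place.  The helper name parse_stats is hypothetical;
the sample lines are copied from the output above.

```python
# Lines copied from the "Backfilling stats" section of the sdiag output.
SDIAG_BACKFILL = """\
Backfilling stats
        Total cycles: 1108
        Last cycle: 502468
        Max cycle:  507446
        Mean cycle: 278070
        Last depth cycle: 1926
        Last depth cycle (try sched): 136
        Last queue length: 1926
"""


def parse_stats(text):
    """Collect 'label: integer' lines into a dict (hypothetical helper)."""
    stats = {}
    for line in text.splitlines():
        label, sep, value = line.strip().rpartition(":")
        value = value.strip()
        # Skip headings and any line whose value is not a plain integer.
        if sep and value.isdigit():
            stats[label.strip()] = int(value)
    return stats


stats = parse_stats(SDIAG_BACKFILL)

# Assumption: cycle times are reported in microseconds.
last_cycle_s = stats["Last cycle"] / 1_000_000
mean_cycle_s = stats["Mean cycle"] / 1_000_000

# Share of the jobs seen in the last cycle that backfill tried to schedule.
tried_fraction = (
    stats["Last depth cycle (try sched)"] / stats["Last depth cycle"]
)

print(f"last backfill cycle: {last_cycle_s:.2f} s")
print(f"mean backfill cycle: {mean_cycle_s:.2f} s")
print(f"jobs tried in last cycle: {tried_fraction:.1%} of those seen")
```

Under those assumptions, the last cycle took about half a second and tried
136 of the 1926 jobs it looked at.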

The user with 1500 pending jobs has a fairshare value of 0.000000.  Is it
the case that this person's jobs are considered last for backfill because
of their priority?  (The sdiag man page seems to hint that the cycle goes
through jobs in priority order.)

The system running slurmctld is a virtual machine with 4 CPUs and 4GB of
memory.  I'd be interested to know others' experiences with tuning backfill,
especially in the context of not overloading slurmctld.

Thanks,
- Trey

=============================

Trey Dockendorf
Systems Analyst I
Texas A&M University
Academy for Advanced Telecommunications and Learning Technologies
Phone: (979)458-2396
Email: [email protected]
Jabber: [email protected]
