Hi,

After increasing the log level I could see lots of messages like:

backfill: completed yielding locks

Also, sdiag said that the backfilling cycle was taking around 160 seconds.

I'll try changing the SchedulerParameters as suggested and see if this helps.

Thanks,
Andrew.

________________________________
From: Tim Carlson [[email protected]]
Sent: Wednesday, May 22, 2013 5:51 PM
To: slurm-dev
Subject: [slurm-dev] Re: Problems when using sched/backfill

We have a similar setup, and these are our current settings. Without tuning
these parameters, you are in a world of hurt doing backfill with your job mix.

SchedulerParameters=default_queue_depth=50,bf_interval=120,bf_window=300,bf_max_job_user=60

The bf_max_job_user is key for us.
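
A minimal Python sketch (not part of the original posts) that splits the SchedulerParameters line above into its individual options, so each value can be read on its own. The function name parse_scheduler_parameters is made up for illustration. The comments give my reading of the slurm.conf man page and are worth checking against the version you run.

```python
def parse_scheduler_parameters(line):
    """Split a SchedulerParameters line from slurm.conf into a dict."""
    key, _, value = line.strip().partition("=")
    if key != "SchedulerParameters":
        raise ValueError("not a SchedulerParameters line")
    params = {}
    for item in value.split(","):
        if not item:
            continue  # tolerate an empty value or a trailing comma
        name, _, setting = item.partition("=")
        # Numeric settings become ints; options without a value become True.
        params[name] = int(setting) if setting.isdigit() else (setting or True)
    return params


params = parse_scheduler_parameters(
    "SchedulerParameters=default_queue_depth=50,bf_interval=120,"
    "bf_window=300,bf_max_job_user=60"
)

# Meaning of each option, as I understand the slurm.conf man page:
#   default_queue_depth  jobs the main scheduler tests per scheduling event
#   bf_interval          seconds between backfill iterations
#   bf_window            minutes into the future that backfill looks ahead
#   bf_max_job_user      jobs per user that backfill will attempt to start
for name, setting in params.items():
    print(f"{name:20} {setting}")
```

Note that bf_interval is in seconds while bf_window is in minutes, which is easy to mix up when tuning the two together.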


On Tue, May 21, 2013 at 3:10 PM, Carles Fenoy <[email protected]> wrote:

Hi all,
Use sdiag to see whether the backfilling is too slow. If it is, tune the
scheduler parameters. There is a bf_max_jobs or something like this that limits
the number of jobs evaluated and considerably decreases the scheduling time.
Regards,
Carles Fenoy
Barcelona Supercomputing Center

On 21/05/2013 23:15, "Bjørn-Helge Mevik" <[email protected]> wrote:



If you increase the log level, for instance set

SlurmctldDebug=debug
DebugFlags=Backfill

you might get more information about what happens.  If it is the
backfilling that takes too long, you should see messages about backfill
"yielding locks".  If I recall correctly, the backfill scheduler used to
time out after MessageTimeout/2 seconds, but looking at the code for
2.5.6 this seems to have changed.
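
A rough Python sketch (not part of the original posts) for counting how often the "yielding locks" message shows up once the debug settings above are in place. The function name count_yielding_locks is made up, and the sample lines are invented for illustration. Only the message text "backfill: completed yielding locks" comes from this thread, and real log lines will carry extra fields such as timestamps.

```python
def count_yielding_locks(lines):
    """Count log lines in which the backfill scheduler reports yielding locks."""
    return sum(
        1 for line in lines
        if "backfill" in line and "yielding locks" in line
    )


# Invented sample input; only the backfill message text is from this thread.
sample = [
    "backfill: completed yielding locks",
    "some unrelated log line",
    "backfill: completed yielding locks",
]
print(count_yielding_locks(sample))
```

With a real log, the same function can be given an open file object, for example the file that SlurmctldLogFile points to in slurm.conf. A count that keeps climbing would fit the picture of a backfill cycle that is too long to finish in one pass.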

Keep us posted about what you find.  I'm planning to switch to 2.5.6
tomorrow, and have from time to time had problems getting the
backfilling to be fast enough.

--
Regards,
Bjørn-Helge Mevik, dr. scient,
Department for Research Computing, University of Oslo


