Hello Joan,

Joan Arbona <[email protected]> writes:
> Hello all,
>
> We have realized that in our cluster the backfill plugin is not working as
> we expected. When a user submits jobs using a smaller set of nodes, they
> always get running before jobs with a larger set of nodes, even if the
> latter have higher priority.
>
> Our cluster has:
>
> - 1 partition of 40 nodes called THIN. 5 of them are requested by a
>   reservation every day, so they are unusable.
> - Default MaxTime of the THIN partition is 3 days (4320 minutes)
> - Fairshare priority scheme
> - Backfill scheduler
> - Backfill parameters are all set to default.
>
> Let's assume the following circumstances:
>
> 1. User A submits jobs of 10 nodes regularly, say two or three times a day.
>    Those nodes are exclusive to him. He does not specify any time, so the
>    jobs' max time is 3 days.
> 2. User B submitted one job of 30 nodes on the 26th of October. This job is
>    waiting for user A's jobs to finish. B's jobs have higher priority than
>    A's.
>
> The following table shows the output of smap:
>
> .........333333333322222222221111111111.
> (those numbers are the JOBIDs in the table below)
>
> JOBID PARTITION USER NAME   ST TIME       NODES NODELIST
> 1     thin      A    gromac R  1-00:11:19 10    foner[132-141]
> 2     thin      A    gromac R  21:33:49   10    foner[122-131]
> 3     thin      A    gromac R  13:31:49   10    foner[112-121]
> 4     thin      B    DART_c PD 00:00:00   30    waiting...
> 5     thin      A    gromac PD 00:00:00   10    waiting...
>
> Theoretically, due to backfill, when user A finishes any of his running jobs
> (1, 2 or 3), even though job 4 does not fit in the cluster, the scheduler
> should not start job 5. The reason is that job 5 has lower priority than
> job 4, and backfill does not delay jobs with higher priority. It should wait
> until user A's other jobs finish and then start job 4.
>
> Well, this does not happen. As user A is submitting jobs all the time, they
> keep filling all the holes that his finishing jobs leave, because job 4
> doesn't fit (it needs 30 nodes, not 10). Then, job 4 will never start until
> user A stops sending jobs.
>
> I have tried it in a test environment using sleeps. I have realized that I
> get the same behavior when submitting jobs with a larger Slurm max time
> (--time) than the duration of the command (sleep time). Also, I have tried
> to adjust parameters like bf_window, which is set to one day by default,
> without luck.
>
> Does anybody know why this happens? Why does the backfill principle of not
> delaying jobs with higher priority not apply in this case? Is there a way
> to solve this?
>
> Thanks,
> Joan
>
> Attaching slurm.conf and the output of squeue:
>
> squeue --start
>
> JOBID PARTITION NAME     USER ST START_TIME          NODES NODELIST(REASON)
> 5     thin      gromacs_ A    PD 2014-11-06T12:06:19 10    (Priority)
> 4     thin      DART_cyc B    PD 2014-11-06T22:45:49 30    (Resources)
>
> In fact, job 4's start_time has been changing all the time as user A's jobs
> get running. Maybe backfill can't calculate the start_time accurately?

One thing you might need to look at is the value of the scheduler parameter
'bf_window'. The default value is 1440 minutes (1 day), but it should probably
be as large as your MaxTime, i.e.

SchedulerParameters=bf_window=4320

See 'man slurm.conf' for more details.

Cheers,

Loris

--
This signature is currently under construction.
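For reference, a minimal sketch of how the suggested change might look in
slurm.conf. Only the bf_window=4320 value comes from the advice above; the
SchedulerType line is shown for context and is an assumption about the
cluster's existing configuration:

```
# slurm.conf (fragment) -- hypothetical surrounding context.
# bf_window should cover the longest possible job, here MaxTime = 3 days.
SchedulerType=sched/backfill
SchedulerParameters=bf_window=4320
```

After editing slurm.conf, the running daemons typically pick up the change
with 'scontrol reconfigure'.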
