Thanks, I repeated my tests more carefully and it finally worked. As you
said, it was because bf_window was not configured properly.
   
   Greetings,
   Joan
   
   On 04/11/14 14:19, Loris Bennett wrote:
Hello Joan,

Joan Arbona <[email protected]> writes:
Hello all,

We have realized that the backfill plugin in our cluster is not working as we
expected. Jobs that request a smaller set of nodes always start running before
jobs that request a larger set of nodes, even when the larger jobs have higher
priority.

Our cluster has:

- 1 partition of 40 nodes called THIN. 5 of them are requested by a reservation
every day, so they are unusable.
- Default Max Time of THIN partition is 3 days (4320 minutes)
- Fairshare priority scheme
- Backfill scheduler
- Backfill parameters are all set to default.

Let's assume the following situation:

1. User A submits 10-node jobs regularly, let's say two or three times a
day. Those nodes are exclusive to him. He does not specify a time limit, so each
job's max time is 3 days.
2. User B submitted one 30-node job on the 26th of October. This job is waiting
for user A's jobs to finish. B's jobs have higher priority than A's.

The following table shows the output of smap:

.........333333333322222222221111111111. (the digits are the JOBIDs in the table
below)

 JOBID PARTITION USER NAME    ST  TIME        NODES NODELIST
 1     thin      A    gromac  R   1-00:11:19  10    foner[132-141]
 2     thin      A    gromac  R   21:33:49    10    foner[122-131]
 3     thin      A    gromac  R   13:31:49    10    foner[112-121]
 4     thin      B    DART_c  PD  00:00:00    30    waiting...
 5     thin      A    gromac  PD  00:00:00    10    waiting...

In theory, thanks to backfill, when one of user A's running jobs (1, 2 or 3)
finishes, the scheduler should not start job 5, even though job 4 does not yet
fit in the cluster. The reason is that job 5 has lower priority than job 4, and
backfill must not delay the start time of higher-priority jobs. The scheduler
should wait until A's other jobs finish and then start job 4.

Well, this does not happen. As user A is submitting jobs all the time, his new
jobs fill every hole that his finished jobs leave, because job 4 doesn't fit (it
needs 30 nodes, not 10). So job 4 will never start until user A stops submitting
jobs.

I have tried it in a test environment using sleeps. I get the same behavior
when submitting jobs whose Slurm time limit (--time) is longer than the actual
duration of the command (the sleep time). I have also tried adjusting
parameters such as bf_window, which is set to one day by default, without luck.

Does anybody know why this happens? Why does the backfill principle of not
delaying higher-priority jobs not apply in this case? Is there a way to solve
this?

Thanks,
Joan

Attaching slurm.conf and the output of squeue:

squeue --start
 JOBID PARTITION NAME     USER ST START_TIME          NODES NODELIST(REASON)
 5     thin      gromacs_ A    PD 2014-11-06T12:06:19 10    (Priority)
 4     thin      DART_cyc B    PD 2014-11-06T22:45:49 30    (Resources)

In fact, job 4's start_time keeps changing every time one of user A's jobs
starts running. Maybe backfill can't calculate start_time accurately?
One thing you might need to look at is the value of the scheduler
parameter 'bf_window'.  The default value is 1440 minutes (1 day), but it
should probably be at least as large as your MaxTime, i.e.

SchedulerParameters=bf_window=4320

See 'man slurm.conf' for more details.
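
The arithmetic behind this advice can be sketched in a few lines of Python.
This is only a toy model, not Slurm's actual backfill algorithm: it assumes
that job 4 has to wait for all three running jobs (35 usable nodes, 30 of them
busy), that each running job uses its full 3-day limit, and that the scheduler
can only reserve nodes for a job whose expected start lies inside bf_window.
The helper names are made up for illustration.

```python
TIME_LIMIT = 4320  # partition MaxTime in minutes (3 days), from the post


def elapsed_minutes(t):
    """Convert a Slurm-style elapsed time ([D-]HH:MM:SS) to minutes."""
    days, _, clock = t.rpartition("-")
    hours, minutes, seconds = (int(x) for x in clock.split(":"))
    return (int(days) if days else 0) * 1440 + hours * 60 + minutes + seconds / 60


def worst_case_wait(running, limit):
    """Minutes until the last running job hits its time limit."""
    return max(limit - elapsed_minutes(t) for t in running.values())


def fits_in_window(wait, bf_window):
    """Toy rule: a start time can only be planned if it is inside the window."""
    return wait <= bf_window


# Elapsed times of jobs 1-3, taken from the smap table in the original post.
running = {1: "1-00:11:19", 2: "21:33:49", 3: "13:31:49"}
wait = worst_case_wait(running, TIME_LIMIT)

for bf_window in (1440, 4320):
    state = "inside" if fits_in_window(wait, bf_window) else "outside"
    print(f"bf_window={bf_window}: job 4 start in ~{wait:.0f} min is {state} the window")
```

Under these assumptions job 4's worst-case start is beyond the default one-day
window, so nothing stops job 5 from being started ahead of it; with
bf_window=4320 the start falls inside the window.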

Cheers,

Loris
 -- 
   
     Joan Francesc Arbona
     Ext. 2582
     Centre de Tecnologies de la Informació
     Universitat de les Illes Balears
     
     http://jfdeu.wordpress.com
     https://mallorca.guifi.net
