Hi everyone. This is the second time I have noticed this problem.

If one user submits several jobs requesting a specific node that currently has 
no free resources, then all jobs behind them in the queue stay in pending 
state with Reason=Priority, even though other nodes are sitting idle.

Example:

  JOBID PARTI NAME                     USER ST       TIME PRIOR NODELIST COMMENT
  70148 batch 1G_6_test             dtaliun PD       0:00   364         (null)
  70128 batch 1G_8                  dtaliun PD       0:00   365         (null)
  70127 batch 1G_6                  dtaliun PD       0:00   365         (null)
  70126 batch 1G_5                  dtaliun PD       0:00   365         (null)
  70125 batch 1G_4                  dtaliun PD       0:00   365         (null)
  70124 batch 1G_3                  dtaliun PD       0:00   365         (null)
  70123 batch 1G_2                  dtaliun PD       0:00   365         (null)
  70122 batch 1G_1                  dtaliun PD       0:00   365         (null)
  70096 batch bayesian_ci_G30haplo  dtaliun PD       0:00   386         (null)
  70095 batch bayesian_ci_G30haplo  dtaliun PD       0:00   386         (null)
  69643 batch zapata_ci_G5haplo     dtaliun  R 9-23:24:40   333  calc06 (null)

Job 69643 is running on calc06.
Jobs 70095-70128 have ReqNodeList=calc06 and are all in state PD, 
Reason=Resources (correct).

Job 70148, however, requests no specific node and could start on any other 
node, yet it stays pending:

JobId=70148 Name=1G_6_test
   UserId=dtaliun(1026) GroupId=dtaliun(1026)
   Priority=364 Account=stats QOS=normal
   JobState=PENDING Reason=Priority Dependency=(null)
   Requeue=1 Restarts=0 BatchFlag=1 ExitCode=0:0
   RunTime=00:00:00 TimeLimit=14-00:00:00 TimeMin=N/A
   SubmitTime=2013-02-15T09:19:20 EligibleTime=2013-02-15T09:19:20
   StartTime=2013-02-17T14:31:51 EndTime=Unknown
   PreemptTime=None SuspendTime=None SecsPreSuspend=0
   Partition=batch AllocNode:Sid=calc05:14709
   ReqNodeList=(null) ExcNodeList=(null)
   NodeList=(null)
   NumNodes=1 NumCPUs=1 CPUs/Task=1 ReqS:C:T=*:*:*
   MinCPUsNode=1 MinMemoryCPU=5000M MinTmpDiskNode=0
   Features=(null) Gres=(null) Reservation=(null)
   Shared=OK Contiguous=0 Licenses=(null) Network=(null)
   Command=(null)
   WorkDir=/test

If I raise the priority of job 70148 manually, the job starts, but the logic 
looks broken. With priority/multifactor in play, a single user can block the 
whole cluster just by scheduling some jobs that are waiting on a resource 
that is not available.
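
To make the behaviour I suspect concrete, here is a toy Python model. It is 
not taken from the SLURM source. It only assumes that the scheduler walks the 
pending queue in priority order and, in "strict" mode, stops starting jobs 
once one job cannot be placed. The idle node names calc01 and calc02 are made 
up for the illustration, and only three of the calc06 jobs are included.

```python
# Toy model of priority-order scheduling. NOT SLURM code, only an assumption
# about how a strict priority-order scheduler might behave.

def schedule(pending, idle_nodes, strict=True):
    """Return (started, blocked); blocked maps job id -> pending reason."""
    started, blocked = [], {}
    idle = set(idle_nodes)
    stalled = False  # set once a higher-priority job could not be placed
    # Highest priority first; the sort is stable for equal priorities.
    for job in sorted(pending, key=lambda j: -j["priority"]):
        wanted = job["req_node"]
        candidates = idle if wanted is None else idle & {wanted}
        if not candidates:
            blocked[job["id"]] = "Resources"
            stalled = strict  # strict mode: nothing behind this job may start
        elif stalled:
            blocked[job["id"]] = "Priority"
        else:
            node = min(candidates)
            idle.remove(node)
            started.append((job["id"], node))
    return started, blocked


# Jobs from the listing above that request calc06, plus job 70148.
pending = [
    {"id": 70095, "priority": 386, "req_node": "calc06"},
    {"id": 70096, "priority": 386, "req_node": "calc06"},
    {"id": 70122, "priority": 365, "req_node": "calc06"},
    {"id": 70148, "priority": 364, "req_node": None},
]
idle_nodes = ["calc01", "calc02"]  # hypothetical idle nodes; calc06 is busy

strict_result = schedule(pending, idle_nodes, strict=True)
relaxed_result = schedule(pending, idle_nodes, strict=False)
print("strict :", strict_result)
print("relaxed:", relaxed_result)
```

In strict mode the model leaves 70148 pending with reason Priority although 
two nodes are idle, which is what I see on the cluster. In relaxed mode 70148 
starts on an idle node while the calc06 jobs keep waiting, which is what I 
would have expected.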

I'm running with the builtin scheduler with priority/multifactor:

SchedulerType           = sched/builtin
SelectType              = select/cons_res
PriorityType            = priority/multifactor

under SLURM 2.4.4. I've been looking at the changelog, but it doesn't look like 
anything changed for the builtin scheduler in later versions. Can anybody 
confirm the problem, or does anybody know whether it has been fixed recently?

Thanks.
