Some of my dependent jobs are going into BatchHold and their dependencies are 
missing.

Here I submitted three echo/sleep jobs and then one job that depends on those 
three. The first three completed, but the dependent job is stuck in BatchHold.
I have MinJobAge set to 300. Could this be the issue? It looks like the 
dependent job lost its dependencies: JobState=PENDING Reason=JobHeldAdmin 
Dependency=(null)


1800  C  0  slu  0.00  1.0  -  mhill  -  rrz011  8  00:01:41  Tue Mar 18 10:11:57
1801  C  0  slu  0.00  1.0  -  mhill  -  rrz012  8  00:01:42  Tue Mar 18 10:12:00
1802  C  0  slu  0.00  1.0  -  mhill  -  rrz013  8  00:01:42  Tue Mar 18 10:12:02

These are the jobs that 1803 is dependent on...
[root@rrz-master ~]# scontrol show job 1800
slurm_load_jobs error: Invalid job id specified
[root@rrz-master ~]# scontrol show job 1801
slurm_load_jobs error: Invalid job id specified
[root@rrz-master ~]# scontrol show job 1802
slurm_load_jobs error: Invalid job id specified

This is the dependent job.
[root@rrz-master ~]# scontrol show job 1803
JobId=1803 Name=moab.job.FG0mBi
   UserId=mhill(24177) GroupId=mhill(24177)
   Priority=0 Account=(null) QOS=(null)
   JobState=PENDING Reason=JobHeldAdmin Dependency=(null)
   Requeue=0 Restarts=0 BatchFlag=1 ExitCode=0:0
   RunTime=00:00:00 TimeLimit=04:00:00 TimeMin=N/A
   SubmitTime=2014-03-18T10:10:48 EligibleTime=Unknown
   StartTime=Unknown EndTime=Unknown
   PreemptTime=None SuspendTime=None SecsPreSuspend=0
   Partition=standard AllocNode:Sid=rrz-master:11210
   ReqNodeList=(null) ExcNodeList=(null)
   NodeList=(null)
   NumNodes=1-1 NumCPUs=1 CPUs/Task=1 ReqS:C:T=*:*:*
   MinCPUsNode=1 MinMemoryNode=0 MinTmpDiskNode=0
   Features=(null) Gres=(null) Reservation=(null)
   Shared=OK Contiguous=0 Licenses=(null) Network=(null)
   Command=/opt/MOAB/spool/moab.job.FG0mBi
   WorkDir=/turquoise/users/mhill
   Comment='NACCESSPOLICY=SINGLEJOB??SJID:1494?SID:moab'
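
For anyone who wants to check the same fields on their own held jobs, here is a rough sketch of a helper that pulls a single Key=Value field out of `scontrol show job` output. The function name `job_field` is made up for illustration and is not part of Slurm or Moab. It assumes the value contains no spaces, which holds for JobState, Reason and Dependency above but not for fields like Comment.

```shell
# Hypothetical helper (not a Slurm command): print the value of one
# Key=Value field from `scontrol show job` output read on stdin.
# Assumes the value itself contains no spaces.
job_field() {
    # Put every space-separated token on its own line, then keep only
    # the token that starts with "<key>=" and strip that prefix.
    tr ' ' '\n' | sed -n "s/^$1=//p" | head -n 1
}

# Intended use on a machine where scontrol is available:
#   scontrol show job 1803 | job_field Reason
#   scontrol show job 1803 | job_field Dependency
```

Running it against the output above for job 1803 is how I would confirm that Reason is JobHeldAdmin while Dependency is already (null).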


