I have a job that constantly goes into the deferred state. I run releasehold on it and then run checkjob -v:
-------------------------------------------------------
State: Idle
Creds: user:abpenny group:hamming class:long qos:DEFAULT
WallTime: 00:00:00 of INFINITY
SubmitTime: Mon Jan 31 11:17:11
  (Time Queued Total: 1:00:25:05 Eligible: 00:00:00)

StartDate: -00:36:15 Tue Feb 1 11:06:01
Total Tasks: 128

Req[0] TaskCount: 128 Partition: ALL
Network: [NONE] Memory >= 0 Disk >= 0 Swap >= 0
Opsys: [NONE] Arch: [NONE] Features: [compute]
Exec: '' ExecSize: 0 ImageSize: 0
Dedicated Resources Per Task: PROCS: 1
NodeAccess: SHARED
TasksPerNode: 8 NodeCount: 16

IWD: [NONE] Executable: [NONE]
Bypass: 0 StartCount: 0
PartitionMask: [ALL]
SystemQueueTime: Tue Feb 1 11:06:00

Flags: RESTARTABLE
Messages: cannot create reservation for job '9847' (intital reservation attempt)

PE: 128.00 StartPriority: 36
job can run in partition DEFAULT (648 procs available. 128 procs required)
-------------------------------------------------------

So it says the job can run in the DEFAULT partition... But it gets immediately put in deferred state.

I suspect it is because of the 'cannot create reservation'. How do I troubleshoot why it cannot?

I can run qrun and it will go ahead and start. The nodes are available, the resources are there, there are no restrictions on the queue to stop it from running.

Brian Andrus
ITACS/Research Computing
Naval Postgraduate School
Monterey, California
