I have a job that keeps going into the deferred state.

I run releasehold on it and then checkjob -v, which shows:

 

-------------------------------------------------------
State: Idle
Creds:  user:abpenny  group:hamming  class:long  qos:DEFAULT
WallTime: 00:00:00 of   INFINITY
SubmitTime: Mon Jan 31 11:17:11
  (Time Queued  Total: 1:00:25:05  Eligible: 00:00:00)

StartDate: -00:36:15  Tue Feb  1 11:06:01
Total Tasks: 128

Req[0]  TaskCount: 128  Partition: ALL
Network: [NONE]  Memory >= 0  Disk >= 0  Swap >= 0
Opsys: [NONE]  Arch: [NONE]  Features: [compute]
Exec:  ''  ExecSize: 0  ImageSize: 0
Dedicated Resources Per Task: PROCS: 1
NodeAccess: SHARED
TasksPerNode: 8  NodeCount: 16

IWD: [NONE]  Executable:  [NONE]
Bypass: 0  StartCount: 0
PartitionMask: [ALL]
SystemQueueTime: Tue Feb  1 11:06:00

Flags:       RESTARTABLE

Messages:  cannot create reservation for job '9847' (intital reservation attempt)

PE:  128.00  StartPriority:  36
job can run in partition DEFAULT (648 procs available.  128 procs required)
-------------------------------------------------------

 

So checkjob says the job can run in the DEFAULT partition, but the job is
immediately put back into the deferred state. I suspect the cause is the
'cannot create reservation' message.

 

How do I troubleshoot why the reservation cannot be created? If I run qrun,
the job goes ahead and starts. The nodes are available, the resources are
there, and there are no restrictions on the queue that would stop it from
running.
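
To make the apparent contradiction in the checkjob output explicit, here is a minimal Python sketch that pulls a few fields out of the text quoted above and cross-checks them. The names `SAMPLE` and `summarize_checkjob` are hypothetical, and the regular expressions assume the field layout seen in this one output rather than any documented format, so treat it as an illustration only.

```python
import re

# Excerpt of the checkjob -v output quoted above (wrapped lines rejoined).
SAMPLE = """\
State: Idle
Total Tasks: 128
TasksPerNode: 8  NodeCount: 16
Messages:  cannot create reservation for job '9847' (intital reservation attempt)
job can run in partition DEFAULT (648 procs available.  128 procs required)
"""


def summarize_checkjob(text):
    """Extract a few fields from checkjob -v text and cross-check them.

    Hypothetical helper: the patterns are guessed from the sample output.
    """
    def grab(pattern, cast=str):
        match = re.search(pattern, text, re.MULTILINE)
        return cast(match.group(1)) if match else None

    info = {
        "state": grab(r"^State:\s*(\S+)"),
        "total_tasks": grab(r"^Total Tasks:\s*(\d+)", int),
        "tasks_per_node": grab(r"TasksPerNode:\s*(\d+)", int),
        "node_count": grab(r"NodeCount:\s*(\d+)", int),
        "procs_available": grab(r"\((\d+) procs available", int),
        "procs_required": grab(r"(\d+) procs\s+required", int),
        "message": grab(r"^Messages:\s*(.+)$"),
    }

    # Does the requested geometry (tasks per node x nodes) match the task count?
    info["geometry_ok"] = (
        None not in (info["tasks_per_node"], info["node_count"], info["total_tasks"])
        and info["tasks_per_node"] * info["node_count"] == info["total_tasks"]
    )
    # Does the scheduler report at least as many free procs as the job needs?
    info["enough_procs"] = (
        None not in (info["procs_available"], info["procs_required"])
        and info["procs_available"] >= info["procs_required"]
    )
    # Is a reservation failure reported despite the checks above?
    info["reservation_failed"] = bool(
        info["message"] and "cannot create reservation" in info["message"]
    )
    return info


if __name__ == "__main__":
    for key, value in summarize_checkjob(SAMPLE).items():
        print(f"{key}: {value}")
```

For the output above, the geometry is consistent (8 tasks per node on 16 nodes gives 128 tasks) and the reported free processors exceed the requirement, yet the reservation attempt still fails, which is the situation I am asking about.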

 

Brian Andrus

ITACS/Research Computing

Naval Postgraduate School

Monterey, California

 
