Hi,

I have a job shown as running by 'squeue':

$ squeue -w node086
             JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
           1234567      main   abcdef user1234  R 10-09:32:34      1 node086

However, with 'sinfo' I can see that the node has been powered off (note
the '~' suffix on its state):

$ sinfo
PARTITION AVAIL  TIMELIMIT  NODES  STATE NODELIST
test         up    3:00:00      2  idle~ node[001-002]
main*        up 14-00:00:0      1  idle~ node086
...

This is the second time I have seen this phenomenon since updating to
version 15.08.8 a month ago.

Is this a bug, or can this simply happen if a job crashes in an odd
enough way?

Cheers,

Loris

-- 
Dr. Loris Bennett (Mr.)
ZEDAT, Freie Universität Berlin         Email [email protected]
