Loris Bennett <loris.benn...@fu-berlin.de> writes:

> Hi,
>
> I have a node which is powered on and to which I have sent a job. The
> output of sinfo is
>
>   PARTITION AVAIL  TIMELIMIT  NODES  STATE NODELIST
>   test         up 7-00:00:00      1   mix~ node001
>
> The output of squeue is
>
>   JOBID   PARTITION NAME     USER  ST TIME  NODES NODELIST(REASON)
>   1795993 test      7_single loris CF 24:29 1     node001
>
> I don't understand the node state 'mix~'. If at all, I would only
> expect it to exist very briefly between 'idle~' and 'mix#'. The '~' is
> certainly incorrect, as the node is not in a power-saving state, which
> in our case is powered-off.
>
> This problem may have existed in 16.05.10-2, but currently we are using
> 17.02.7. All other nodes in the cluster apart from one are functioning
> normally.
>
> Does anyone have any idea what we might be doing wrong?

I still don't know what the problem was, but I got the node back into a
sensible state by setting the state to FAIL, rebooting the node, and
then setting the state to RESUME.

Cheers,

Loris

-- 
Dr. Loris Bennett (Mr.)
ZEDAT, Freie Universität Berlin         Email loris.benn...@fu-berlin.de
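
For reference, the recovery sequence described above (FAIL, reboot, RESUME) could be sketched roughly as follows. This is only a sketch under assumptions: the post does not show the exact commands used, so the `scontrol update` invocations are the usual way of changing a node's state rather than a record of what was run. The function names, the `DRY_RUN` switch, and the Reason string are hypothetical, and the reboot method is site-specific and not given in the post. By default the commands are only printed.

```shell
#!/bin/sh
# Sketch only: assumes the state changes were made with "scontrol update".
# DRY_RUN, the function names and the Reason text are illustrative
# placeholders, not taken from the original post.
DRY_RUN="${DRY_RUN:-1}"

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Step 1: mark the node as failed so that no new work is scheduled on it.
# A Reason is supplied because scontrol generally expects one for such states.
mark_failed() {
    run scontrol update NodeName="$1" State=FAIL Reason=stuck_in_mix_powersave
}

# Step 2 (not scripted here): reboot the node by whatever means the site uses.

# Step 3: once the node is back up, return it to service.
resume_node() {
    run scontrol update NodeName="$1" State=RESUME
}
```

A possible usage would be to call `mark_failed node001`, reboot the node, and then call `resume_node node001`, with `DRY_RUN=0` set in the environment to actually execute the commands. Checking the result afterwards with `scontrol show node node001` or `sinfo` should show whether the '~' suffix has gone.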