Hi

I enabled power saving by setting SuspendTime in slurm.conf and then
running scontrol reconf. Slurm then killed the jobs on the excluded
nodes by setting those nodes to DOWN.
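
For context, here is a rough sketch of the kind of power-saving stanza
involved. The SuspendTime value and the two program paths are placeholder
assumptions, not copied from my configuration; only the excluded node
list is taken from the power_save line in the log.

```
# slurm.conf power-saving fragment (illustrative sketch only)
SuspendTime=600                                  # placeholder: idle seconds before a node is suspended
SuspendProgram=/path/to/suspend_script           # hypothetical path
ResumeProgram=/path/to/resume_script             # hypothetical path
SuspendExcNodes=charlie[1-8],mds[1-2],nas[1-4],tape1   # nodes excluded from power saving, as reported in the log
```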

[2014-02-21T14:42:11.284] Processing RPC: REQUEST_RECONFIGURE from uid=0
[2014-02-21T14:42:11.284] debug:  sched: begin reconfiguration
[2014-02-21T14:42:11.284] debug:  Reading slurm.conf file: /usr/etc/slurm.conf
[2014-02-21T14:42:11.314] debug:  No DownNodes
[2014-02-21T14:42:11.318] restoring original state of nodes
...
[2014-02-21T14:42:11.416] _slurm_rpc_reconfigure_controller: completed usec=131611
[2014-02-21T14:42:11.416] debug:  sched: Running job scheduler
[2014-02-21T14:42:11.435] debug:  power_save module, excluded nodes charlie[1-8],mds[1-2],nas[1-4],tape1
[2014-02-21T14:42:11.501] error: Setting node charlie1 state to DOWN
[2014-02-21T14:42:11.502] error: _slurm_rpc_node_registration node=charlie1: Invalid argument
[2014-02-21T14:42:11.502] error: Setting node charlie8 state to DOWN
[2014-02-21T14:42:11.502] requeue job 14614 due to failure of node charlie8
[2014-02-21T14:42:11.504] error: _slurm_rpc_node_registration node=charlie8: Invalid argument
[2014-02-21T14:42:11.536] error: Setting node charlie7 state to DOWN
...

Cheers,
