On 07/29/2015 09:57 PM, Trevor Gale wrote:
I recently rebooted one of my nodes, and when it came back up slurm was running fine, but when I run “sinfo” I see that its state is set to down. When I run “scontrol show node compute0” it says that the reason is “unexpectedly rebooted”.
I have the same problem. According to the slurm.conf man page (Slurm 14.11.8), a node that I reboot using 'scontrol reboot_nodes <node>' should be returned to normal use, but instead it stays down (Reason=Node unexpectedly rebooted).
I should not have to set ReturnToService=2 for this, right?

Thanks,
Robbert

--
Robbert Eggermont
Intelligent Systems
Electr.Eng., Mathematics & Comp.Science
Delft University of Technology
[email protected]
+31 15 27 83234
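
For reference, the setting asked about above would look roughly like this in slurm.conf. This is only a sketch of the option as the slurm.conf man page describes it, not a confirmed fix for the reboot_nodes behaviour reported here. The node name compute0 is taken from the quoted message, and the manual command in the comments is the usual way to clear a DOWN state by hand.

```
# slurm.conf fragment (sketch, not a confirmed fix for this thread)
#
# ReturnToService=0  node stays DOWN until an administrator changes its state
# ReturnToService=1  DOWN node returns to service on registration only if it
#                    was set DOWN for being non-responsive
# ReturnToService=2  DOWN node returns to service on registration with a
#                    valid configuration, whatever the reason it was set DOWN
ReturnToService=2

# Without changing the setting, a node can be returned to service manually:
#   scontrol update NodeName=compute0 State=RESUME
```

The daemons need to pick up the changed slurm.conf (for example via 'scontrol reconfigure') before the new value applies.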
