Greetings!

I have a test Slurm cluster running version 14.11.4 on Scientific Linux
6.6.
I've been having some odd trouble and am hoping for some assistance. I
hope this is the right forum.

Problem number 1:
If I issue 'scontrol reconfigure' from any node, the slurm controller
(slurmctld) dies and needs to be restarted (after which it runs fine).

Problem number 2:
'scontrol show jobs' shows jobs in state RUNNING that don't actually appear
to exist. Some of these are days old. What might be going on here?
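
For reference, here is a minimal Python sketch of one way to list the jobs in question, so they can be compared against what is really running on the nodes. It assumes that 'scontrol -o show jobs' prints one job per line as space-separated Key=Value pairs. The helper names (parse_jobs, running_jobs, running_jobs_from_cluster) are hypothetical, and values that contain spaces are not handled.

```python
import subprocess


def parse_jobs(text):
    """Parse one-line-per-job scontrol output into a list of dicts.

    Assumption: each line is a run of space-separated Key=Value tokens.
    Tokens without '=' (for example fragments of a value that contains
    spaces) are skipped, and the first occurrence of a key wins.
    """
    jobs = []
    for line in text.splitlines():
        fields = {}
        for token in line.split():
            key, sep, value = token.partition("=")
            if sep:
                fields.setdefault(key, value)
        if "JobId" in fields:
            jobs.append(fields)
    return jobs


def running_jobs(text):
    """Return (job id, start time, node list) for every job in state RUNNING."""
    return [
        (job["JobId"], job.get("StartTime", "?"), job.get("NodeList", "?"))
        for job in parse_jobs(text)
        if job.get("JobState") == "RUNNING"
    ]


def running_jobs_from_cluster():
    """Hypothetical helper: run scontrol on this host and filter its output."""
    result = subprocess.run(
        ["scontrol", "-o", "show", "jobs"],
        capture_output=True,
        text=True,
        check=True,
    )
    return running_jobs(result.stdout)
```

Calling running_jobs_from_cluster() on a host where scontrol works gives a list of tuples; each listed node can then be checked by hand for processes belonging to that job.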


-- 
Jon Nelson
Dyn / Senior Software Engineer
p. +1 (603) 263-8029
