On 25 October 2016 at 09:17, Tuo Chen Peng <tp...@nvidia.com> wrote:
> Oh ok, thanks for pointing this out.
>
> I thought the ‘scontrol update’ command was for letting slurmctld pick up
> any change in slurm.conf.
>
> But after reading the manual again, it seems this command instead changes
> settings at runtime, rather than reading changes from slurm.conf.
>
> So is restarting slurmctld the only way to let it pick up changes in
> slurm.conf?
>
> And if I change (2.2) in my plan to
>
> (2.2) restart slurmctld to pick up changes in slurm.conf, then use
> ‘scontrol reconfigure’ to push changes to all nodes
>
> do you see any impact to the running jobs in the cluster?

There shouldn't be any impact on running jobs at all, but of course there are always caveats:

- While slurmctld is restarting, no one will be able to submit any jobs (although it should take ~5 seconds to restart unless you have made an error, in which case it will probably take a minute while you fix and/or roll back, so no one should even notice).
- As an extension of the above, if any job on the queue has a running job as a dependency, and that dependency finishes in the few seconds that slurmctld is down, it could in principle be missed; but I doubt it.
- I can't remember exactly what they do, but if you look over the list, I think some people save the contents of /var/spool/slurmd (which I believe holds the "state" information of all running jobs) before restarting.
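For what it's worth, step (2.2) might look something like the following on the head node. This is only a sketch, and it assumes slurmctld is managed by systemd (the unit name and service manager can vary by site):

```shell
#!/bin/sh
# Sketch of step (2.2); assumes slurmctld runs under systemd and that the
# edited slurm.conf has already been copied to all nodes. Run as root.
set -e

# Restart the controller so it re-reads slurm.conf from disk.
systemctl restart slurmctld

# Wait until the controller answers again before touching the compute
# nodes; this bounds the window during which job submission fails.
until scontrol ping >/dev/null 2>&1; do
    sleep 1
done

# Push the (already distributed) slurm.conf changes out to every slurmd.
scontrol reconfigure
```

`scontrol ping` just polls the controller, so the loop is a cheap way to confirm slurmctld is back before issuing the reconfigure.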
(Note that none of these is a real concern; they are just possibilities.)

L.

------
The most dangerous phrase in the language is, "We've always done it this way."
- Grace Hopper