On 25 October 2016 at 09:17, Tuo Chen Peng <tp...@nvidia.com> wrote:

> Oh ok thanks for pointing this out.
>
> I thought the ‘scontrol update’ command was for letting slurmctld pick up
> any change in slurm.conf.
>
> But after reading the manual again, it seems this command changes
> settings at runtime rather than reading changes from
> slurm.conf.
>
>
>
> So is restarting slurmctld the only way to let it pick up changes in
> slurm.conf?
>
> And if I change (2.2) in my plan to
>
> (2.2) restart slurmctld to pick up changes in slurm.conf, then use ‘scontrol
> reconfigure’ to push changes to all nodes
>
> Do you see any impact on the running jobs in the cluster?
>
>
There shouldn't be any impact on running jobs at all, but of course there
are always caveats:
 - while slurmctld is restarting, no one will be able to submit any jobs
(although a restart should take ~5 seconds, or probably about 1 minute if you
have made an error and need to fix it and/or roll back, so no one should
even notice)
 - as an extension of the above, if any of the jobs in the queue has a
running job as a dependency, and that job finishes in the few seconds that
slurmctld is down, there could in theory be a problem... but I doubt it.
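 - I can't remember exactly what they do, but if you search the list
archives, I think some people save the contents of /var/spool/slurmd (which I
believe holds the "state" information of all running jobs; note that this is
the default SlurmdSpoolDir on the compute nodes, while slurmctld keeps its
own saved state in the directory set by StateSaveLocation in slurm.conf)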

(note that none of these is a real concern; they are just possible)
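
If it helps, here is a rough sketch of that sequence (back up the saved state, restart slurmctld, then run 'scontrol reconfigure') as a small Python dry run. Treat it as an illustration under assumptions rather than a tested procedure: the function names, the backup path and the sample config text are made up for the example, the restart line assumes a systemd host, and it reads the state directory from the StateSaveLocation line of 'scontrol show config' output.

```python
import shlex
import subprocess


def parse_state_save_location(show_config_output):
    """Return the StateSaveLocation value from `scontrol show config` text, or None."""
    for line in show_config_output.splitlines():
        key, sep, value = line.partition("=")
        if sep and key.strip() == "StateSaveLocation":
            return value.strip()
    return None


def build_plan(state_dir, backup_tarball):
    """Return the commands to run, in order, as argument lists."""
    return [
        # 1. Save the controller state before touching anything.
        ["tar", "-czf", backup_tarball, state_dir],
        # 2. Restart the controller so it re-reads slurm.conf
        #    (assumption: slurmctld is managed by systemd).
        ["systemctl", "restart", "slurmctld"],
        # 3. Ask the running daemons to re-read the configuration.
        ["scontrol", "reconfigure"],
    ]


def run_plan(plan, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in plan:
        print(shlex.join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)  # stop at the first failure


if __name__ == "__main__":
    # Hypothetical sample; on a real cluster this text would come from
    # running `scontrol show config`.
    sample = "SlurmctldPort = 6817\nStateSaveLocation = /var/spool/slurmctld\n"
    state_dir = parse_state_save_location(sample)
    # Hypothetical backup path; dry run only, nothing is executed.
    run_plan(build_plan(state_dir, "/root/slurm-state-backup.tar.gz"))
```

By default it only prints the commands, so you can check the order before running anything with dry_run=False.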


L.


------
The most dangerous phrase in the language is, "We've always done it this
way."
- Grace Hopper
