On 06/20/2017 04:32 PM, Loris Bennett wrote:
We do our upgrades while full production is up and running. We just stop the Slurm daemons, dump the database, and copy the statesave directory, just in case. We then do the update and finally restart the Slurm daemons. We only lost jobs once during an upgrade, back around 2.2.6 or so, but that was due to a rather brittle configuration provided by our vendor (the statesave path contained the Slurm version) rather than to Slurm itself, and it was before we had acquired any Slurm expertise ourselves.
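The procedure described above might look roughly like the sketch below. The service names, database name (slurm_acct_db), and paths are assumptions to adjust per site; DRYRUN defaults to 1 here, so the script only prints the commands it would run.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of the upgrade flow: stop daemons, back up the
# database and statesave directory, upgrade, restart. Names and paths
# are site-specific assumptions. DRYRUN=1 (the default) prints commands
# instead of executing them; set DRYRUN=0 to run for real.
set -euo pipefail

DRYRUN=${DRYRUN:-1}
run() { if [ "$DRYRUN" = "1" ]; then echo "+ $*"; else "$@"; fi; }

upgrade() {
  local statesave=/var/spool/slurmctld    # StateSaveLocation (check scontrol show config)
  local backup=/tmp/slurm-upgrade-backup  # choose a real backup location

  # 1. Stop the daemons on the controller (slurmd on the compute nodes
  #    would be stopped separately, e.g. via pdsh).
  run systemctl stop slurmctld
  run systemctl stop slurmdbd

  # 2. Dump the accounting database and copy the statesave directory.
  run mkdir -p "$backup"
  run mysqldump -r "$backup/slurm_acct_db.sql" slurm_acct_db
  run cp -a "$statesave" "$backup/statesave"

  # 3. Install the new Slurm packages here (site-specific), then restart
  #    in the recommended order: slurmdbd, slurmctld, then slurmd on nodes.
  run systemctl start slurmdbd
  run systemctl start slurmctld
}

upgrade
```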
1. When you refer to "daemons", do you mean slurmctld and slurmdbd, as well as slurmd on all compute nodes? AFAIK, the recommended procedure is to upgrade and restart in this order: 1) slurmdbd, 2) slurmctld, 3) slurmd on the nodes.
2. When you mention statesave, I suppose this is what you refer to:

   # scontrol show config | grep -i statesave
   StateSaveLocation       = /var/spool/slurmctld

Thanks,
Ole
