On 25/02/16 10:58, Berryhill, Jerome wrote:

> Thanks for the quick response. Quick follow-up: is it possible to
> upgrade the slurmctld without taking down the cluster?

If you are quick enough - yes. :-)

From the upgrade section I linked to before:

# Be mindful of your configured SlurmdTimeout and SlurmctldTimeout
# values. If the Slurm daemons are down for longer than the
# specified timeout during an upgrade, nodes may be marked DOWN
# and their jobs killed. You can either increase the timeout
# values during an upgrade or ensure that the slurmd daemons on
# compute nodes are not down for longer than SlurmdTimeout.
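
Before an upgrade it helps to know what those two timeouts are
currently set to. A minimal sketch, assuming the settings appear
explicitly in a slurm.conf-style file; show_timeouts is a hypothetical
helper name (not a Slurm command), and the values in the scratch file
are made up for the demo:

```shell
#!/bin/sh
# Print the SlurmdTimeout / SlurmctldTimeout lines from a
# slurm.conf-style file. show_timeouts is a hypothetical helper.
show_timeouts() {
    grep -Ei '^[[:space:]]*(SlurmdTimeout|SlurmctldTimeout)[[:space:]]*=' "$1"
}

# Demo against a scratch file with made-up values (seconds).
conf=$(mktemp)
cat > "$conf" <<'EOF'
ClusterName=example
SlurmctldTimeout=120
SlurmdTimeout=300
EOF

show_timeouts "$conf"
```

On a real system the argument would be the site's slurm.conf. If a
parameter is not set in the file nothing is printed for it and Slurm's
built-in default applies; "scontrol show config" reports the values the
running controller is actually using.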

We always install Slurm from source and we install into:

/usr/local/slurm/$version

and then have a symbolic link:

/usr/local/slurm/latest

which points to the current version we want to use.

We configure Slurm with:

./configure --prefix=/usr/local/slurm/${slurm_ver} \
    --sysconfdir=/usr/local/slurm/etc

This means slurmctld and slurmd look in /usr/local/slurm/etc
for their configuration files (rather than having them in the
version-specific directory).

Then when we are doing an upgrade we can install in advance
and then switch the symlink over when we're ready to migrate.
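
The symlink switch can be sketched as below. This is only an
illustration of the layout described above, run in a scratch directory:
switch_slurm_version is a hypothetical helper name, the version numbers
are placeholders, and the choice of a relative link target is an
assumption rather than a description of the actual setup.

```shell
#!/bin/sh
# Hypothetical helper: repoint <base>/latest at <base>/<version>.
switch_slurm_version() {
    base=$1
    version=$2
    # Refuse to point the link at an install that does not exist.
    if [ ! -d "$base/$version" ]; then
        echo "no such install: $base/$version" >&2
        return 1
    fi
    # -f replaces an existing link; -n stops ln from following the
    # old 'latest' link and creating the new link inside its target.
    ln -sfn "$version" "$base/latest"
}

# Demo in a scratch directory standing in for /usr/local/slurm,
# with placeholder version numbers.
demo=$(mktemp -d)
mkdir -p "$demo/15.08.8" "$demo/16.05.0"

switch_slurm_version "$demo" 15.08.8   # the "old" version
switch_slurm_version "$demo" 16.05.0   # the upgrade
readlink "$demo/latest"
```

Daemons that are already running keep using the old binaries until they
are restarted, so the restart is a separate step, and that is where the
timeouts quoted above matter.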

All the best!
Chris
-- 
 Christopher Samuel        Senior Systems Administrator
 VLSCI - Victorian Life Sciences Computation Initiative
 Email: sam...@unimelb.edu.au Phone: +61 (0)3 903 55545
 http://www.vlsci.org.au/      http://twitter.com/vlsci
