Received from Andy Riebs on Fri, Aug 29, 2014 at 01:48:53PM EDT:
On 08/29/2014 12:13 PM, Lev Givon wrote:
I recently set up slurm 2.6.5 on a cluster of Ubuntu 14.04.1 systems hosting
several NVIDIA GPUs set up as generic resources. When the compute nodes are
rebooted, I noticed that
On 30/08/14 02:14, Lev Givon wrote:
Is there a recommended way (on Ubuntu, at least) to ensure that slurmd isn't
started before any GPU device files appear?
To be honest, my policy for many years has been never to start queuing
system daemons on boot; it's too easy to have a node go bad,
One way to work around this is to set the node definition(s) in
slurm.conf with State=DOWN. That way, manual intervention will be
required when a node is rebooted, allowing the rest of the system to
finish coming up.
Andy
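
For the original question (keeping slurmd from starting before the GPU
device files exist), one possible approach is a small wrapper that polls
for the device files and only then lets the node come into service. The
following is a minimal sketch, not something proposed in this thread: the
function name wait_for_devices, the glob pattern /dev/nvidia[0-9]*, and the
timeout values are all assumptions chosen for illustration.

```python
import glob
import time


def wait_for_devices(pattern, expected, timeout=120.0, interval=1.0):
    """Poll until at least `expected` paths match `pattern`.

    Returns True as soon as enough paths exist, or False once `timeout`
    seconds have passed without reaching that count.
    `pattern` is a hypothetical glob such as "/dev/nvidia[0-9]*".
    """
    deadline = time.monotonic() + timeout
    while True:
        found = glob.glob(pattern)
        if len(found) >= expected:
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

A boot-time script could call wait_for_devices("/dev/nvidia[0-9]*", 2) on a
node with two GPUs and start slurmd only when it returns True. Combined with
the State=DOWN suggestion above, the same check could instead gate a manual
or scripted "scontrol update NodeName=<node> State=RESUME", so a node whose
GPUs never appear simply stays down.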