On 30/08/14 02:14, Lev Givon wrote:

> Is there a recommended way (on Ubuntu, at least) to ensure that slurmd isn't
> started before any GPU device files appear?
To be honest, my policy for many years has been never to start queuing system daemons on boot. It's too easy to have a node go bad, reboot, come back up, take a job, go bad, reboot, take another job, and repeat until there are no jobs left. DIMMs go bad, and IB and accelerator cards go bad and cause NMIs; for us it's not worth the risk.

We rarely reboot nodes other than for a hardware failure or a software upgrade, so if one does go bad we want to find out why before we let it back into the cluster.

All the best,
Chris

--
Christopher Samuel, Senior Systems Administrator
VLSCI - Victorian Life Sciences Computation Initiative
Email: sam...@unimelb.edu.au Phone: +61 (0)3 903 55545
http://www.vlsci.org.au/ http://twitter.com/vlsci