I think modifying the init scripts is likely to be the only way:

When I built my own version of slurm 14.03 on Ubuntu 10.04, I installed both slurm and munge on an NFS filesystem to be sure that slurm.conf was identical across the cluster. This meant that the default init.d scripts would fail, as they always tried to start before the /store/cluster/apps filesystem had been mounted. The way I fixed this was to create an upstart script for munge (which I then use to trigger slurm) that is started by the "remote-filesystems" event AND polls to see whether the directory exists yet, only starting munge once it is a valid path. You can do exactly the same to test for /dev/nvidia0.
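
For the GPU case the check is the same idea, just against the device node instead of a directory. A minimal sketch of that test (assuming the nvidia module creates /dev/nvidia0 as a character device, and reusing the RETRYCOUNT/RETRYDELAY variables from my script below):

  # poll until the device node appears or we run out of retries;
  # -c tests for a character device
  mycount=0
  while [ $mycount -lt $RETRYCOUNT ] && [ ! -c /dev/nvidia0 ] ; do
    mycount=`expr $mycount + 1`
    sleep $RETRYDELAY
  done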

Since all of the polling is done in my munge upstart script and not in slurm, here is my /etc/init/munge.conf. Please note that this is the first upstart script I ever wrote, so I'm not claiming this is the best way, only that it works; I've not even gone back and cleaned it up.

--------------------------------------------------------------------------------
# Munge (My custom build)
#

description "Munge (My custom build for slurm)"

start on remote-filesystems
stop on runlevel [06S]

respawn

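# pre-start runs to completion before the main exec at the bottom of this
# file, so we can safely block here until the NFS mounts appear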
pre-start script
  prefix="/store/cluster/apps/munge/gcc"
  exec_prefix="${prefix}"
  sbindir="${exec_prefix}/sbin"
  sysconfdir="${prefix}/etc"
  localstatedir="${prefix}/var"
  DAEMON="$sbindir/munged"
  RETRYCOUNT=10
  RETRYDELAY=10
  mycount=0

  logger -is -t "$UPSTART_JOB" "checking prefix ${prefix}"
  mkdir -p /var/run/munge
  # check each required NFS path, retrying until it appears or we give up
  for dir in /home/share /store/cluster/apps/munge ; do
    mycount=0
    logger -is -t "$UPSTART_JOB" "checking dir \"$dir\" exists"
    logger -is -t "$UPSTART_JOB" "RETRYCOUNT=$RETRYCOUNT and mycount=$mycount"
    while [ $mycount -lt ${RETRYCOUNT} ] ; do
      mycount=`expr $mycount + 1`
      if [ -d "$dir" ]
      then
        logger -is -t "$UPSTART_JOB" "$dir exists! lets go!"
        break
      else
        logger -is -t "$UPSTART_JOB" "WARNING: Required remote DIR \"$dir\" not yet mounted, waiting ${RETRYDELAY} seconds to retry (attempt ${mycount} of ${RETRYCOUNT})"
        sleep $RETRYDELAY
      fi
    done
    # if the directory still isn't there after all retries, cancel the job
    # ("stop" in pre-start aborts the start; exit 0 avoids a respawn loop)
    if [ ! -d "$dir" ]
    then
      logger -is -t "$UPSTART_JOB" "$dir does not exist, giving up!"
      stop
      exit 0
    fi
  done
end script

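# munged daemonises itself, so tell Upstart to follow the forked process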
expect daemon
exec /store/cluster/apps/munge/gcc/sbin/munged 2>&1
--------------------------------------------------------------------------------

I start slurm once munged has started, using the "start on started munge" upstart directive; a rough sketch of what that job can look like is below.
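
Something like this /etc/init/slurm.conf should do it. This is only a sketch, not my production file (the slurmd path here is illustrative; adjust it for your install), with the /dev/nvidia0 poll added to its pre-start for the GPU case:

  # /etc/init/slurm.conf (sketch)
  description "SLURM slurmd (custom build)"

  # only start once munge is up; stop if munge goes away
  start on started munge
  stop on stopping munge

  respawn

  pre-start script
    # wait for the GPU device node before slurmd tries to stat gres.conf
    mycount=0
    while [ $mycount -lt 10 ] && [ ! -c /dev/nvidia0 ] ; do
      mycount=`expr $mycount + 1`
      sleep 10
    done
    # cancel the start if the device never appeared
    [ -c /dev/nvidia0 ] || { stop; exit 0; }
  end script

  # -D keeps slurmd in the foreground so Upstart can supervise it
  exec /store/cluster/apps/slurm/gcc/sbin/slurmd -D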

Hopefully this is a useful example.

Antony


On 31/08/2014 17:32, Lev Givon wrote:
> Received from Andy Riebs on Fri, Aug 29, 2014 at 01:48:53PM EDT:
>> On 08/29/2014 12:13 PM, Lev Givon wrote:
>>> I recently set up slurm 2.6.5 on a cluster of Ubuntu 14.04.1 systems hosting
>>> several NVIDIA GPUs set up as generic resources. When the compute nodes are
>>> rebooted, I noticed that they attempt to start slurmd before the device files
>>> initialized by the nvidia kernel module appear, i.e., the following message
>>> appears in syslog some number of lines before the GPU kernel driver load
>>> messages:
>>>
>>> slurmd[1453]: fatal: can't stat gres.conf file /dev/nvidia0: No such file or directory
>>>
>>> Is there a recommended way (on Ubuntu, at least) to ensure that slurmd isn't
>>> started before any GPU device files appear?
>> One way to work around this is to set the node definition(s) in
>> slurm.conf with "State=DOWN". That way, manual intervention will be
>> required when a node is rebooted, allowing the rest of the system to
>> finish coming up.
> Not sure how the above suggestion remedies the problem; as things stand,
> I already need to manually start slurmd on the compute nodes after a
> reboot because the absence of the device files prevents the daemon from
> starting.
>
> Perhaps I should have phrased my question differently: is there a
> recommended method on Ubuntu for ensuring that slurmd starts only after the GPU
> device files appear if a GPU generic resource has been defined in a node's SLURM
> configuration? One possibility that I'll try if no other solutions present
> themselves involves modifying the init.d startup script to poll for the device
> files if a GPU resource exists, but I'm curious whether there are any existing
> fixes given that SLURM packages for Ubuntu have already existed for several
> years.