slurmctld is the management process, and since you have access to squeue/sinfo information it is running just fine. You need to check whether slurmd (the agent part) is running on your compute nodes, i.e. VM-[669-671].
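Something along these lines could help with that check. It is only a sketch: the helper names `down_nodes` and `slurmd_status` are made up for illustration and are not part of Slurm, and the column positions assume the default `sinfo` output format shown in your message.

```shell
#!/bin/sh
# Sketch only: these helper names are hypothetical, not Slurm commands.

# Read default-format `sinfo` output on stdin and print the NODELIST
# column of every line whose STATE column starts with "down".
down_nodes() {
    awk 'NR > 1 && $5 ~ /^down/ { print $6 }'
}

# Run on a compute node: report whether a slurmd process exists there.
slurmd_status() {
    if pgrep -x slurmd >/dev/null 2>&1; then
        echo "slurmd is running on $(hostname)"
    else
        echo "slurmd is NOT running on $(hostname)"
    fi
}
```

On the controller you would run `sinfo | down_nodes`, then log in to each listed node and call `slurmd_status`. If slurmd is not running there, start it on that node (assuming the same Debian package is installed, `/etc/init.d/slurm-llnl start`) and look at the slurmd log for errors. If the nodes still show as down once slurmd is up, `scontrol update NodeName='VM-[669-671]' State=RESUME` on the controller may be needed to return them to service.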
-- 
Nikita Burtsev

On Wednesday, August 21, 2013 at 8:13 PM, Sivasangari Nandy wrote:
> I have tried:
>
> /etc/init.d/slurm-llnl start
>
> [ ok ] Starting slurm central management daemon: slurmctld.
> /usr/sbin/slurmctld already running.
>
> And:
>
> scontrol show slurmd
>
> scontrol: error: slurm_slurmd_info: Connection refused
> slurm_load_slurmd_status: Connection refused
>
> Hmm, how should I proceed to fix this problem?
>
> > From: "Danny Auble" <[email protected]>
> > To: "slurm-dev" <[email protected]>
> > Sent: Wednesday, August 21, 2013 15:36:53
> > Subject: [slurm-dev] Re: Required node not available (down or drained)
> >
> > Check your slurmd log. It doesn't appear the slurmd is running.
> >
> > Sivasangari Nandy <[email protected]> wrote:
> > > Hello,
> > >
> > > I'm trying to use Slurm for the first time, and I think I have a
> > > problem with the nodes.
> > > I get this message when I use squeue:
> > >
> > > root@VM-667:~# squeue
> > > JOBID PARTITION    NAME USER ST TIME NODES NODELIST(REASON)
> > >    50 SLURM-deb test.sh root PD 0:00     1 (ReqNodeNotAvail)
> > >
> > > or this one with another squeue:
> > >
> > > root@VM-671:~# squeue
> > > JOBID PARTITION    NAME USER ST TIME NODES NODELIST(REASON)
> > >    50 SLURM-deb test.sh root PD 0:00     1 (Resources)
> > >
> > > sinfo gives me:
> > >
> > > PARTITION AVAIL TIMELIMIT NODES STATE NODELIST
> > > SLURM-de* up    infinite      3 down  VM-[669-671]
> > >
> > > I have already used Slurm once with the same configuration and I
> > > was able to run my job.
> > > But now, the second time, I always get:
> > >
> > > srun: Required node not available (down or drained)
> > > srun: job 51 queued and waiting for resources
> > >
> > > Thanks in advance for your help,
> > > Siva
>
> --
> Sivasangari NANDY - Plate-forme GenOuest
> IRISA-INRIA, Campus de Beaulieu
> 263 Avenue du Général Leclerc
> 35042 Rennes cedex, France
> Tel: +33 (0) 2 99 84 25 69
> Office: D152
