And the log file is not informative:

tail -f /var/log/slurm-llnl/slurmd.log 

... 
[2013-08-26T11:52:16] Slurmd shutdown completing 
[2013-08-26T11:52:56] slurmd version 2.3.4 started 
[2013-08-26T11:52:56] slurmd started on Mon 26 Aug 2013 11:52:56 +0200 
[2013-08-26T11:52:56] Procs=1 Sockets=1 Cores=1 Threads=1 Memory=2012 TmpDisk=9069 Uptime=1122626
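
The log above shows slurmd starting cleanly, so a reasonable next step is probably to look at the node state that slurmctld has recorded. Depending on the configuration, a node that was marked down may stay down until it is resumed by hand. What follows is a minimal sketch, not a definitive procedure. It assumes the default sinfo column layout quoted further down in this thread. The helper name down_nodes and the NODE placeholder are hypothetical and do not come from the original messages.

```shell
# Sketch only: list nodes that sinfo reports as down or drained.
# Assumes the default sinfo column layout:
#   PARTITION AVAIL TIMELIMIT NODES STATE NODELIST

# Read sinfo-style output on stdin and print the NODELIST field of
# every row whose STATE starts with "down" or "drain".
# (down_nodes is a hypothetical helper name.)
down_nodes() {
    awk 'NR > 1 && $5 ~ /^(down|drain)/ { print $6 }'
}

# Only call the Slurm tools when they are installed on this machine.
if command -v sinfo >/dev/null 2>&1; then
    sinfo | down_nodes
    # List the reason Slurm recorded for each down or drained node.
    sinfo -R
fi

# Once slurmd is confirmed to be running, a down node can be returned
# to service by hand (NODE is a placeholder for the node's name):
#   scontrol update NodeName="$NODE" State=RESUME
```

Piping the sinfo output quoted later in this thread through down_nodes would print the VM-[669-671] node list, since that partition row is in the down state.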

----- Original Message -----

> From: "Sivasangari Nandy" <[email protected]>
> To: "slurm-dev" <[email protected]>
> Sent: Monday, 26 August 2013 14:28:28
> Subject: Re: [slurm-dev] Re: Required node not available (down or
> drained)

> Hi,

> I have checked a few things. My slurmctld and slurmd now run on a
> single machine (using just one node), so the test is easier.
> To do that I edited the configuration file: vi
> /etc/slurm-llnl/slurm.conf

> slurmctld and slurmd are both running; here is my ps output:

> root@VM-667:/etc/slurm-llnl# ps -ef | grep slurm
> root 31712 31706 0 11:44 pts/1 00:00:00 tail -f /var/log/slurm-llnl/slurmd.log
> slurm 31990 1 0 11:52 ? 00:00:00 /usr/sbin/slurmctld
> root 32103 1 0 11:52 ? 00:00:00 /usr/sbin/slurmd -c
> root 32125 30346 0 11:53 pts/0 00:00:00 grep slurm

> So I tried srun again, but I still get this error:

> !srun
> srun /omaha-beach/test.sh
> srun: Required node not available (down or drained)
> srun: job 64 queued and waiting for resources

> Do you have any idea what the problem is?
> Thanks,

> Siva

> ----- Original Message -----

> > From: "Nikita Burtsev" <[email protected]>
> > To: "slurm-dev" <[email protected]>
> > Sent: Thursday, 22 August 2013 09:59:52
> > Subject: [slurm-dev] Re: Required node not available (down or
> > drained)

> > You need to have slurmd running on all nodes that will execute
> > jobs, so you should start it with the init script.

> > --
> 
> > Nikita Burtsev

> > On Thursday, August 22, 2013 at 11:55 AM, Sivasangari Nandy wrote:
> 
> > > "check if the slurmd daemon is running with the command
> > > "ps -el | grep slurmd"."

> > > ps -el returns nothing:

> > > root@VM-667:~# ps -el | grep slurmd

> > > > From: "Nikita Burtsev" < [email protected] >
> > > > To: "slurm-dev" < [email protected] >
> > > > Sent: Wednesday, 21 August 2013 18:58:52
> > > > Subject: [slurm-dev] Re: Required node not available (down or
> > > > drained)

> > > > slurmctld is the management process, and since you have access to
> > > > squeue/sinfo information it is running just fine. You need to check
> > > > whether slurmd (the agent part) is running on your nodes, i.e.
> > > > VM-[669-671].

> > > > --
> > > > Nikita Burtsev

> > > > On Wednesday, August 21, 2013 at 8:13 PM, Sivasangari Nandy wrote:

> > > > > I have tried:

> > > > > /etc/init.d/slurm-llnl start

> > > > > [ ok ] Starting slurm central management daemon: slurmctld.
> > > > > /usr/sbin/slurmctld already running.

> > > > > And:

> > > > > scontrol show slurmd

> > > > > scontrol: error: slurm_slurmd_info: Connection refused
> > > > > slurm_load_slurmd_status: Connection refused

> > > > > Hmm, how should I proceed to fix this problem?

> > > > > > From: "Danny Auble" < [email protected] >
> > > > > > To: "slurm-dev" < [email protected] >
> > > > > > Sent: Wednesday, 21 August 2013 15:36:53
> > > > > > Subject: [slurm-dev] Re: Required node not available (down or
> > > > > > drained)

> > > > > > Check your slurmd log. It doesn't appear that slurmd is
> > > > > > running.

> > > > > > Sivasangari Nandy < [email protected] > wrote:

> > > > > > > > > Hello,

> > > > > > > > > I'm trying to use Slurm for the first time, and I think I
> > > > > > > > > have a problem with my nodes.
> > > > > > > > > I get this message when I run squeue:

> > > > > > > > > root@VM-667:~# squeue
> > > > > > > > > JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)
> > > > > > > > > 50 SLURM-deb test.sh root PD 0:00 1 (ReqNodeNotAvail)

> > > > > > > > > or this one with another squeue:

> > > > > > > > > root@VM-671:~# squeue
> > > > > > > > > JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)
> > > > > > > > > 50 SLURM-deb test.sh root PD 0:00 1 (Resources)

> > > > > > > > > sinfo gives me:

> > > > > > > > > PARTITION AVAIL TIMELIMIT NODES STATE NODELIST
> > > > > > > > > SLURM-de* up infinite 3 down VM-[669-671]

> > > > > > > > > I have already used Slurm once with the same configuration,
> > > > > > > > > and I was able to run my job.
> > > > > > > > > But now, the second time, I always get:

> > > > > > > > > srun: Required node not available (down or drained)
> > > > > > > > > srun: job 51 queued and waiting for resources

> > > > > > > > > Thanks in advance for your help,
> > > > > > > > > Siva
> > > > > --
> > > > > Siva sangari NANDY - Plate-forme GenOuest
> > > > > IRISA-INRIA, Campus de Beaulieu
> > > > > 263 Avenue du Général Leclerc
> > > > > 35042 Rennes cedex, France
> > > > > Tel: +33 (0) 2 99 84 25 69
> > > > > Office: D152

-- 
Siva sangari NANDY - Plate-forme GenOuest
IRISA-INRIA, Campus de Beaulieu
263 Avenue du Général Leclerc
35042 Rennes cedex, France
Tel: +33 (0) 2 99 84 25 69
Office: D152
