Your compute node has fewer CPUs than are configured
in slurm.conf. If you run /usr/sbin/slurmd -C
on the compute node, it reports exactly how many
sockets, cores and threads, and how much memory and
temporary disk space, it finds on that node. If
slurm.conf configures higher values than these, the
node is marked DOWN, as you see here with
Reason=Low CPUs. You do not need to configure
sockets, cores and threads; if you only care about
allocating CPUs and not about task topology, setting
the CPU count (Procs=) is enough.
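
As a rough sketch of the check and the fix (not a definitive
procedure): the cpus_ok helper below is hypothetical and only
illustrates the comparison described above, the node name is
taken from your scontrol output, and the Procs= value is a
placeholder you would fill in from what slurmd -C reports. The
final scontrol step assumes your site does not return DOWN
nodes to service automatically.

```shell
#!/bin/sh
# Hypothetical helper, for illustration only: succeeds when the CPU
# count configured in slurm.conf does not exceed the count detected
# on the node, i.e. when "Low CPUs" should not be triggered.
cpus_ok() {
    configured=$1
    detected=$2
    [ "$configured" -le "$detected" ]
}

# Steps to run by hand (left as comments because they need a Slurm
# installation and administrator rights):
#
# 1. On the compute node, print the hardware that slurmd detects:
#      /usr/sbin/slurmd -C
#
# 2. In slurm.conf, configure no more CPUs than were detected:
#      NodeName=cnlnx03 Procs=<count reported by slurmd -C>
#
# 3. After the daemons have picked up the new configuration,
#    return the node to service:
#      scontrol update NodeName=cnlnx03 State=RESUME
```

For example, "cpus_ok 2 1 || echo too many CPUs configured"
would flag a node configured with 2 CPUs when only 1 was
detected.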

Quoting Fred Liu <[email protected]>:

Hi,

What does "Low CPUs" mean?
How can I get my node out of the DOWN state?

scontrol show node cnlnx03
NodeName=cnlnx03 Arch=x86_64 CoresPerSocket=1
   CPUAlloc=0 CPUErr=0 CPUTot=2 Features=(null)
   Gres=(null)
   OS=Linux RealMemory=1 Sockets=2
   State=DOWN ThreadsPerCore=1 TmpDisk=0 Weight=1
   BootTime=2010-12-27T09:33:49 SlurmdStartTime=2011-03-15T13:54:59
   Reason=Low CPUs [slurm@2011-03-15T13:41:38]

Thanks.

Fred




