I'm setting up a new test cluster with SLURM 15.08.5 (we run Torque/Maui on our production cluster). We have a SLURM master server running CentOS 7.2 and two compute nodes on separate subnets (10.1.. and 10.2..). I'm writing a SLURM installation HowTo page as I go along:
https://wiki.fysik.dtu.dk/niflheim/SLURM

I'm now facing a problem running a trivial test:
# srun -N1 --constraint="opteron4" /bin/hostname
srun: error: Unable to allocate resources: Requested node configuration is not available

Question: What could be causing the available node with the feature "opteron4" to reject jobs?


The other partition works just fine:
# srun -N1 --constraint="xeon8" /bin/hostname
a012.dcsc.fysik.dtu.dk

FYI, the node status is:

# scontrol show nodes
NodeName=a012 Arch=x86_64 CoresPerSocket=4
   CPUAlloc=8 CPUErr=0 CPUTot=8 CPULoad=0.01 Features=xeon5570,hp5412e,ethernet,xeon8
   Gres=(null)
   NodeAddr=a012 NodeHostName=a012 Version=15.08
   OS=Linux RealMemory=23900 AllocMem=0 FreeMem=22859 Sockets=2 Boards=1
   State=IDLE+COMPLETING ThreadsPerCore=1 TmpDisk=32752 Weight=1 Owner=N/A
   BootTime=2015-09-08T16:25:29 SlurmdStartTime=2015-12-16T15:29:32
   CapWatts=n/a
   CurrentWatts=0 LowestJoules=0 ConsumedJoules=0
   ExtSensorsJoules=n/s ExtSensorsWatts=0 ExtSensorsTemp=n/s


NodeName=q007 Arch=x86_64 CoresPerSocket=2
   CPUAlloc=0 CPUErr=0 CPUTot=4 CPULoad=0.01 Features=opteron2218,hp5412b,ethernet,opteron4
   Gres=(null)
   NodeAddr=q007 NodeHostName=q007 Version=15.08
   OS=Linux RealMemory=7820 AllocMem=0 FreeMem=7584 Sockets=2 Boards=1
   State=IDLE ThreadsPerCore=1 TmpDisk=32752 Weight=1 Owner=N/A
   BootTime=2015-12-17T08:40:49 SlurmdStartTime=2015-12-17T08:41:03
   CapWatts=n/a
   CurrentWatts=0 LowestJoules=0 ConsumedJoules=0
   ExtSensorsJoules=n/s ExtSensorsWatts=0 ExtSensorsTemp=n/s

I believe that the nodes are configured identically, except for their hardware differences.
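
To cross-check which nodes actually advertise a given feature, and in what state, the node listing above can be filtered with a small helper. This is only a sketch: the function name nodes_with_feature is made up for illustration, and it assumes the key=value layout shown in the scontrol output above.

```shell
# Hypothetical helper (not part of SLURM): reads `scontrol show nodes`
# style text on stdin and prints "<node> <state>" for every node whose
# Features= list contains the feature given as the first argument.
nodes_with_feature() {
    awk -v feat="$1" '
        # Print the node collected so far if it carries the feature.
        function emit() { if (node != "" && has) print node, state }

        # A new node record starts: report the previous one and reset.
        /^NodeName=/ {
            emit()
            split($1, kv, "=")
            node = kv[2]; has = 0; state = ""
        }

        # Scan every key=value field on the line.
        {
            for (i = 1; i <= NF; i++) {
                if (index($i, "Features=") == 1) {
                    n = split(substr($i, 10), f, ",")
                    for (j = 1; j <= n; j++)
                        if (f[j] == feat) has = 1
                } else if (index($i, "State=") == 1) {
                    state = substr($i, 7)
                }
            }
        }

        END { emit() }
    '
}
```

On the master it would be used as: scontrol show nodes | nodes_with_feature opteron4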

--
Ole Holm Nielsen
Department of Physics, Technical University of Denmark
