2012/11/6 Moe Jette <je...@schedmd.com>

> It works for me with slurm v2.3.5 and I don't recall any changes in this
> area
>
> $ srun -V
> slurm 2.3.5
> $ srun -C Intel -N1 -n6 hostname
> tux1
> tux1
> tux1
> tux1
> tux1
> tux1
> $ srun -C Intel -N1 -n8 hostname
> srun: error: Unable to allocate resources: Requested node configuration is
> not available

OK, I checked this a week ago and had forgotten a nuance.

If you try:
srun --pty -n 14 -C IntelX5660  bash -i

the job looks like this:
[root@sqot sbin]# scontrol show job 139948
JobId=139948 Name=bash
   UserId=root(0) GroupId=root(0)
   Priority=100 Account=root QOS=normal
   JobState=PENDING Reason=Resources Dependency=(null)
   Requeue=1 Restarts=0 BatchFlag=0 ExitCode=0:0
   RunTime=00:00:00 TimeLimit=3-00:00:00 TimeMin=N/A
   SubmitTime=2012-11-06T20:20:29 EligibleTime=2012-11-06T20:20:29
   StartTime=Unknown EndTime=Unknown
   PreemptTime=None SuspendTime=None SecsPreSuspend=0
   Partition=plgrid AllocNode:Sid=sqot:630
   ReqNodeList=(null) ExcNodeList=(null)
   NodeList=(null)
   NumNodes=1 NumCPUs=14 CPUs/Task=1 ReqS:C:T=*:*:*
   MinCPUsNode=1 MinMemoryCPU=1800M MinTmpDiskNode=0
   Features=IntelX5660 Gres=(null) Reservation=(null)
   Shared=OK Contiguous=0 Licenses=(null) Network=(null)
   Command=/bin/bash
   WorkDir=/root


As you can see, it is looking for one node with 14 CPUs, but no node with
the IntelX5660 feature can satisfy that.
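
A rough way to confirm that no single node can satisfy such a request is to
filter the node list by feature and CPU count. This is only a sketch:
nodes_matching is a hypothetical helper name, and it assumes its input has one
"node cpus features" line per node, which is what sinfo -N -h -o "%N %c %f"
should print.

```shell
# nodes_matching FEATURE MIN_CPUS
# Hypothetical helper (not part of Slurm): reads "node cpus features" lines
# on stdin and prints the nodes that carry FEATURE and have at least
# MIN_CPUS CPUs. Features are assumed to be comma-separated.
nodes_matching() {
    awk -v feat="$1" -v need="$2" '
        {
            n = split($3, f, ",")
            for (i = 1; i <= n; i++) {
                if (f[i] == feat && $2 + 0 >= need + 0) {
                    print $1
                    break
                }
            }
        }'
}

# Usage against a live cluster (left commented out so the snippet runs
# without Slurm installed):
# sinfo -N -h -o "%N %c %f" | nodes_matching IntelX5660 14
```

If this prints nothing, no single node with that feature has enough CPUs, so
a one-node request for that many tasks cannot start.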

However, when I run the same command one more time, it allocates
3 nodes and the job starts running immediately.

I can't fully understand what's going on. Is it possible that NumNodes is
changed during scheduling?

cheers
marcin


>
>
> Quoting Marcin Stolarek <stolarek.mar...@gmail.com>:
>
>> Hi,
>>
>> I'm using slurm-2.3.3 and I've noticed a problem.
>>
>> If a user specifies, for example, srun -N 1 -n 64 -C Intel hostname, the
>> job is accepted into the queue even though no node with 64 cores and the
>> Intel feature is available. I checked the source and it looks like the
>> allocation is created without checking constraints, so in my case the job
>> can be queued (we have 64-core nodes) but can never run, because the
>> 64-core nodes don't have the Intel feature.
>>
>> regards,
>> marcin
>>
>>
>
>
>
