Hi Moe,

Thank you for the patch. I tried it yesterday and now it works fine.
Regards,
Carles Fenoy

On Wed, Jul 6, 2011 at 11:21 PM, <[email protected]> wrote:
> Carles,
>
> The logic to support managing generic resource topology (associating
> specific generic resources with specific CPUs on a node) was incomplete.
> The attached patch should fix the problem you have reported and will be
> included in SLURM version 2.3.0-pre7.
>
> Moe Jette
> SchedMD LLC
>
>
> On 07/01/2011 06:08 PM, Carles Fenoy wrote:
>>
>>> ---------- Forwarded message ----------
>>> From: <[email protected]>
>>> Date: 01/07/2011 17:24
>>> Subject: Fwd: Re: [slurm-dev] GRES Overallocating resources
>>> To: "Carles Fenoy" <[email protected]>
>>>
>>> Hi Carles,
>>>
>>> I have been able to reproduce this problem. It occurs if I include the
>>> "CPUs" field in the gres.conf file, and does not occur without it.
>>> What are the chances of getting a SLURM support contract to fix this
>>> for you?
>>>
>>> Moe Jette
>>> SchedMD LLC
>>>
>>>
>>> ----- Forwarded message from [email protected] -----
>>>
>>> Date: Fri, 1 Jul 2011 08:37:21 +0200
>>> From: Carles Fenoy <[email protected]>
>>> Reply-To: Carles Fenoy <[email protected]>
>>> Subject: Re: [slurm-dev] GRES Overallocating resources
>>> To: [email protected]
>>> Cc: [email protected]
>>>
>>> Hi Moe,
>>>
>>> Thanks for your quick reply.
>>> I've modified the configuration parameters and it still behaves the
>>> same way.
>>> I send the output of squeue, sinfo, scontrol show nodes, and
>>> scontrol show jobs.
>>>
>>> sinfo:
>>> PARTITION AVAIL  TIMELIMIT  NODES  STATE NODELIST
>>> projects*    up   infinite      1  alloc bscop134
>>>
>>> squeue:
>>>   JOBID PARTITION     NAME     USER  ST       TIME  NODES NODELIST(REASON)
>>>   20214  projects   sbatch   cfenoy  PD       0:00      1 (Resources)
>>>   20210  projects   sbatch   cfenoy   R       4:22      1 bscop134
>>>   20211  projects   sbatch   cfenoy   R       4:21      1 bscop134
>>>   20212  projects   sbatch   cfenoy   R       4:21      1 bscop134
>>>   20213  projects   sbatch   cfenoy   R       4:20      1 bscop134
>>>
>>> scontrol show nodes:
>>> NodeName=bscop134 Arch=x86_64 CoresPerSocket=1
>>>    CPUAlloc=4 CPUErr=0 CPUTot=8 Features=(null)
>>>    Gres=gpu:2
>>>    NodeAddr=bscop134 NodeHostName=bscop134
>>>    OS=Linux RealMemory=12036 Sockets=8
>>>    State=MIXED ThreadsPerCore=1 TmpDisk=20157 Weight=1
>>>    BootTime=2011-06-17T11:15:47 SlurmdStartTime=2011-07-01T08:37:16
>>>    Reason=(null)
>>>
>>> scontrol show jobs (only 3 jobs):
>>> JobId=20212 Name=sbatch
>>>    UserId=cfenoy(1001) GroupId=users(100)
>>>    Priority=4294901757 Account=(null) QOS=(null) WCKey=*
>>>    JobState=RUNNING Reason=None Dependency=(null)
>>>    Requeue=1 Restarts=0 BatchFlag=1 ExitCode=0:0
>>>    RunTime=00:04:40 TimeLimit=UNLIMITED TimeMin=N/A
>>>    SubmitTime=2011-07-01T08:38:32 EligibleTime=2011-07-01T08:38:32
>>>    StartTime=2011-07-01T08:38:32 EndTime=Unknown
>>>    PreemptTime=NO_VAL SuspendTime=None SecsPreSuspend=0
>>>    Partition=projects AllocNode:Sid=bscop134:13583
>>>    ReqNodeList=(null) ExcNodeList=(null)
>>>    NodeList=bscop134
>>>    BatchHost=bscop134
>>>    NumNodes=1 NumCPUs=1 CPUs/Task=1 ReqS:C:T=*:*:*
>>>    MinCPUsNode=1 MinMemoryNode=0 MinTmpDiskNode=0
>>>    Features=(null) Gres=gpu:2 Reservation=(null)
>>>    Shared=OK Contiguous=0 Licenses=(null) Network=(null)
>>>    Command=(null)
>>>    WorkDir=/home/cfenoy
>>>
>>> JobId=20213 Name=sbatch
>>>    UserId=cfenoy(1001) GroupId=users(100)
>>>    Priority=4294901756 Account=(null) QOS=(null) WCKey=*
>>>    JobState=RUNNING Reason=None Dependency=(null)
>>>    Requeue=1 Restarts=0 BatchFlag=1 ExitCode=0:0
>>>    RunTime=00:04:39 TimeLimit=UNLIMITED TimeMin=N/A
>>>    SubmitTime=2011-07-01T08:38:33 EligibleTime=2011-07-01T08:38:33
>>>    StartTime=2011-07-01T08:38:33 EndTime=Unknown
>>>    PreemptTime=NO_VAL SuspendTime=None SecsPreSuspend=0
>>>    Partition=projects AllocNode:Sid=bscop134:13583
>>>    ReqNodeList=(null) ExcNodeList=(null)
>>>    NodeList=bscop134
>>>    BatchHost=bscop134
>>>    NumNodes=1 NumCPUs=1 CPUs/Task=1 ReqS:C:T=*:*:*
>>>    MinCPUsNode=1 MinMemoryNode=0 MinTmpDiskNode=0
>>>    Features=(null) Gres=gpu:2 Reservation=(null)
>>>    Shared=OK Contiguous=0 Licenses=(null) Network=(null)
>>>    Command=(null)
>>>    WorkDir=/home/cfenoy
>>>
>>> JobId=20214 Name=sbatch
>>>    UserId=cfenoy(1001) GroupId=users(100)
>>>    Priority=4294901755 Account=(null) QOS=(null) WCKey=*
>>>    JobState=PENDING Reason=Resources Dependency=(null)
>>>    Requeue=1 Restarts=0 BatchFlag=1 ExitCode=0:0
>>>    RunTime=00:00:00 TimeLimit=UNLIMITED TimeMin=N/A
>>>    SubmitTime=2011-07-01T08:38:33 EligibleTime=2011-07-01T08:38:33
>>>    StartTime=Unknown EndTime=Unknown
>>>    PreemptTime=NO_VAL SuspendTime=None SecsPreSuspend=0
>>>    Partition=projects AllocNode:Sid=bscop134:13583
>>>    ReqNodeList=(null) ExcNodeList=(null)
>>>    NodeList=(null)
>>>    NumNodes=1 NumCPUs=1 CPUs/Task=1 ReqS:C:T=*:*:*
>>>    MinCPUsNode=1 MinMemoryNode=0 MinTmpDiskNode=0
>>>    Features=(null) Gres=gpu:2 Reservation=(null)
>>>    Shared=OK Contiguous=0 Licenses=(null) Network=(null)
>>>    Command=(null)
>>>    WorkDir=/home/cfenoy
>>>
>>>
>>> On Thu, Jun 30, 2011 at 7:08 PM, <[email protected]> wrote:
>>>
>>>    It looks like there is a configuration problem. You have a gres
>>>    defined in some places as "gpu" and in other places as "gpus",
>>>    which will result in two separate sets of data structures. In
>>>    slurm v2.3 I see that configuration log a bunch of errors.
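
A consistent pair of files, assuming the resource is meant to be named "gpu" throughout (to match the GresTypes=gpu and Name=gpu lines quoted below), would use the same name in the node definitions as well. This is a sketch based only on the excerpts in this thread, not a tested configuration:

```
# slurm.conf -- gres name must match GresTypes and the Name= entries in gres.conf
GresTypes=gpu
NodeName=DEFAULT RealMemory=12000 Procs=8 TmpDisk=20000 Gres=gpu:2
NodeName=bscop134 NodeAddr=bscop134 Gres=gpu:2

# gres.conf -- unchanged from the original report
Name=gpu File=/dev/nvidia0 CPUs=0-3
Name=gpu File=/dev/nvidia1 CPUs=4-7
```

With mismatched names ("gpus" in the node lines, "gpu" elsewhere), slurmctld tracks two unrelated resource pools, which is consistent with the overallocation described below.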
>>>
>>>    Quoting Carles Fenoy <[email protected]>:
>>>
>>>       Hi all,
>>>
>>>       I've been testing slurm with gres for the last few days for our
>>>       future nvidia machine, and I'm facing some problems with gres
>>>       overallocating resources. I've seen the following error every
>>>       time the controller starts a job:
>>>
>>>       [2011-06-28T14:50:55] error: gres/gpu: job 20206 node bscop134
>>>       overallocated resources by 2
>>>
>>>       The configuration consists of 1 node with 2 gpus. At the end of
>>>       the email you can find the relevant configuration parameters.
>>>
>>>       Is this the expected behavior of scheduling with gres? Is this
>>>       a bug, or is there no way to avoid over-allocating resources?
>>>
>>>       Best regards,
>>>       Carles Fenoy
>>>
>>>       slurm.conf:
>>>       SelectType=select/cons_res
>>>       SelectTypeParameters=CR_CPU
>>>       SchedulerType=sched/backfill
>>>       GresTypes=gpu
>>>       NodeName=DEFAULT RealMemory=12000 Procs=8 TmpDisk=20000 Gres=gpus:2
>>>       NodeName=bscop134 NodeAddr=bscop134 Gres=gpus:2
>>>       PartitionName=projects AllowGroups=ALL Hidden=NO RootOnly=NO
>>>       MaxNodes=UNLIMITED MinNodes=1 MaxTime=UNLIMITED Shared=NO
>>>       State=UP Default=YES Nodes=bscop134
>>>
>>>       gres.conf:
>>>       Name=gpu File=/dev/nvidia0 CPUs=0-3
>>>       Name=gpu File=/dev/nvidia1 CPUs=4-7
>>>
>>>       --
>>>       Carles Fenoy
>>>
>>>    Moe Jette
>>>    SchedMD LLC
>>>
>>> --
>>> Carles Fenoy
>>>
>>> ----- End forwarded message -----
>>>
>>> Moe Jette
>>> SchedMD LLC

--
Carles Fenoy
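
The arithmetic behind the overallocation reported above can be illustrated with a small sketch (plain Python, not SLURM code; the helper name is made up). It tallies the Gres fields shown in the thread's scontrol output: four running jobs each carrying Gres=gpu:2 against a node that offers gpu:2.

```python
def parse_gres(field):
    """Parse a SLURM-style Gres string such as 'gpu:2' or '(null)' into a dict."""
    counts = {}
    if field in ("(null)", ""):
        return counts
    for part in field.split(","):
        name, _, num = part.partition(":")
        counts[name] = int(num) if num else 1
    return counts

# Figures taken from the thread: the node offers gpu:2, yet four jobs,
# each showing Gres=gpu:2, were running on it at once.
node_gres = parse_gres("gpu:2")
running_jobs = [parse_gres("gpu:2")] * 4

allocated = sum(job.get("gpu", 0) for job in running_jobs)
print(allocated, node_gres["gpu"])  # 8 GPUs allocated vs 2 available
```

This matches the "overallocated resources by 2" error per job: each new gpu:2 allocation exceeded the node's total once the first two GPUs were taken.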
