Hi,

I have the same problem: I can't ask for one GPU:

    srun -p gpunpart --gres=gpu:1 --pty bash -i
    srun: error: Unable to allocate resources: Requested node configuration is not available

I configured the GPU nodes in slurm.conf like this:

    ...
    NodeName=nodgpu[01-05] Procs=24 CoresPerSocket=12 RealMemory=128000 Sockets=2 ThreadsPerCore=1 TmpDisk=703488 Gres=gpu:4 Feature=Haswell,Tesla,k40m
    ...
    GresTypes=Haswell,Tesla,Westmere,gpu,k40m

and

    SelectType=select/cons_res
    SelectTypeParameters=CR_Socket_Memory
    ...

The gres.conf file on the five nodes:

    Name=gpu File=/dev/nvidia0 CPUs=0,2,4,6,8,10,12,14,16,18,20,22
    Name=gpu File=/dev/nvidia1 CPUs=1,3,5,7,9,11,13,15,17,19,21,23
    Name=gpu File=/dev/nvidia2 CPUs=0,2,4,6,8,10,12,14,16,18,20,22
    Name=gpu File=/dev/nvidia3 CPUs=1,3,5,7,9,11,13,15,17,19,21,23
    Name=mic Count=0

The cgroup.conf on each node:

    CgroupMountpoint="/sys/fs/cgroup"
    CgroupAutomount=yes
    CgroupReleaseAgentDir="/etc/slurm/cgroup"
    ConstrainRAMSpace=yes
    AllowedRAMSpace=100
    ConstrainCores=yes
    TaskAffinity=no

We use slurm/14.11.11.

I don't know what the problem is. Any ideas?

Thank you in advance,
Red

2016-03-03 23:14 GMT+01:00 Lachele Foley <[email protected]>:
>
> We aren't doing that. I agree we probably should. If you work out a
> config, and don't mind doing so, please share it.
>
> On Thu, Mar 3, 2016 at 3:11 AM, Daniel Letai <[email protected]> wrote:
> > Correct me if I'm wrong, but I don't see any NUMA based reservation of the
> > CPUs - Do you ensure that each reserved cpu is from a different socket, and
> > GPU jobs affinity is to correct NUMA node?
> >
> > On 03/02/2016 12:30 AM, Lachele Foley wrote:
> >
> > We do exactly that. We use the CPUs as the consumable resource rather
> > than the GPUs for that reason. We also limit memory use as needed.
> > You might want to see the configuration issues we ran into and solved
> > as recorded in the thread at the link below.
> >
> > https://groups.google.com/forum/#!topic/slurm-devel/x6VaKfrdH5Y
> >
> > On Tue, Mar 1, 2016 at 1:27 PM, John Desantis <[email protected]> wrote:
> >
> > Felix,
> >
> > Although I haven't run into a use-case like yours (yet), my initial
> > thought was to use the flag "MaxCPUsPerNode" in your configuration:
> >
> > 'Maximum number of CPUs on any node available to all jobs from this
> > partition. This can be especially useful to schedule GPUs. For
> > example a node can be associated with two Slurm partitions (e.g.
> > "cpu" and "gpu") and the partition/queue "cpu" could be limited to
> > only a subset of the node’s CPUs, insuring that one or more CPUs would
> > be available to jobs in the "gpu" partition/queue.'
> >
> > HTH,
> > John DeSantis
> >
> > 2016-03-01 9:05 GMT-05:00 Felix Willenborg
> > <[email protected]>:
> >
> > Hey folks,
> >
> > I'm kind of new to SLURM and we're setting it up in our work group with our
> > nodes. Our cluster contains per node 2 GPUs and 12 CPU cores.
> >
> > The GPUs are configured with gres like this:
> > Name=gpu_mem Count=6143
> > Name=gpu File=/dev/nvidia0
> > Name=gpu File=/dev/nvidia1
> > #Name=bandwidth count=4G
> > (Somehow the bandwidth plugin isn't available in the repository slurm and I'm
> > getting error messages with that. That's why it's commented out. Is it even
> > necessary?)
> >
> > The nodes are defined like this in the slurm.conf:
> > [...]
> > NodeName=node01 NodeAddr=<...> CPUs=12 RealMemory=128740 Sockets=2
> > CoresPerSocket=6 ThreadsPerCore=1 State=UNKNOWN
> > Gres=gpu:3,gpu_mem:12287#,bandwidth:4G
> >
> > We'd like to have a situation where one CPU is always available for one GPU
> > and can only be allocated with one GPU, because we often had the situation that
> > reservations were made where all CPUs were allocated and we couldn't use the
> > GPUs anymore. I searched on the internet and didn't find any similar cases
> > which could help me.
> > The only thing I found was adding "CPUS=0,1" at the end
> > of every Name=gpu ... in gres.conf. Would this already do it? And if not,
> > what can I do? I've got the feeling that I could solve my problem with SLURM
> > in many ways. We're using SLURM version 14.11.8.
> >
> > Looking forward to some answers!
> >
> > Best wishes,
> > Felix Willenborg
> >
>
> --
> :-) Lachele
> Lachele Foley
> CCRC/UGA
> Athens, GA USA
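
One quick sanity check for configurations like the two quoted in this thread is to compare the GPU count declared in the Gres= field of the slurm.conf NodeName line with the number of GPU devices listed in gres.conf. The following is a minimal Python sketch of that comparison, not a Slurm tool: the function names declared_gres and listed_devices are hypothetical, and the parsing is an assumption that covers only the simple forms appearing in this thread (Name=, File=, Count=, and a trailing [a-b] range in File=). It only compares counts, so it cannot by itself explain the "Requested node configuration is not available" error.

```python
import re


def _strip_comment(line):
    # Slurm config files treat everything after '#' as a comment.
    return line.split("#", 1)[0]


def declared_gres(node_line, name="gpu"):
    """Count of `name` declared in the Gres= field of a NodeName line."""
    match = re.search(r"\bGres=(\S+)", _strip_comment(node_line),
                      flags=re.IGNORECASE)
    if not match:
        return 0
    total = 0
    for item in match.group(1).split(","):
        parts = item.split(":")
        if parts[0] != name:
            continue
        # "gpu" alone counts as 1; otherwise the count is the last field.
        total += int(parts[-1]) if parts[-1].isdigit() else 1
    return total


def listed_devices(gres_conf_text, name="gpu"):
    """Number of `name` devices listed in gres.conf text."""
    total = 0
    for raw in gres_conf_text.splitlines():
        fields = {}
        for token in _strip_comment(raw).split():
            if "=" in token:
                key, value = token.split("=", 1)
                fields[key.lower()] = value  # keys are case-insensitive
        if fields.get("name") != name:
            continue
        if "file" in fields:
            # Expand a trailing range such as /dev/nvidia[0-3].
            rng = re.search(r"\[(\d+)-(\d+)\]$", fields["file"])
            total += (int(rng.group(2)) - int(rng.group(1)) + 1) if rng else 1
        elif fields.get("count", "").isdigit():
            total += int(fields["count"])
    return total


# Configuration text copied from the messages in this thread.
red_node = ("NodeName=nodgpu[01-05] Procs=24 CoresPerSocket=12 "
            "RealMemory=128000 Sockets=2 ThreadsPerCore=1 TmpDisk=703488 "
            "Gres=gpu:4 Feature=Haswell,Tesla,k40m")
red_gres_conf = """\
Name=gpu File=/dev/nvidia0 CPUs=0,2,4,6,8,10,12,14,16,18,20,22
Name=gpu File=/dev/nvidia1 CPUs=1,3,5,7,9,11,13,15,17,19,21,23
Name=gpu File=/dev/nvidia2 CPUs=0,2,4,6,8,10,12,14,16,18,20,22
Name=gpu File=/dev/nvidia3 CPUs=1,3,5,7,9,11,13,15,17,19,21,23
Name=mic Count=0
"""
felix_node = ("NodeName=node01 NodeAddr=<...> CPUs=12 RealMemory=128740 "
              "Sockets=2 CoresPerSocket=6 ThreadsPerCore=1 State=UNKNOWN "
              "Gres=gpu:3,gpu_mem:12287#,bandwidth:4G")
felix_gres_conf = """\
Name=gpu_mem Count=6143
Name=gpu File=/dev/nvidia0
Name=gpu File=/dev/nvidia1
#Name=bandwidth count=4G
"""

for label, node, conf in [("Red", red_node, red_gres_conf),
                          ("Felix", felix_node, felix_gres_conf)]:
    print(label, "declared:", declared_gres(node),
          "listed:", listed_devices(conf))
```

On the text quoted here, the sketch reports 4 declared and 4 listed GPUs for Red's nodes, so the counts agree there. For Felix's node it reports 3 declared (Gres=gpu:3) against 2 listed device files; the thread does not say whether that difference is intended.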
