For example, you can try specifying CPUs in gres.conf with:
NodeName=node001,node002 Name=gpu File=/dev/nvidia0 CPUs=0-3
NodeName=node001,node002 Name=gpu File=/dev/nvidia1 CPUs=4-7
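If you want to produce such lines for the 12-core, 2-GPU nodes you describe, a short script could do it. This is only a rough sketch: the helper name gres_gpu_lines is made up for illustration, and it assumes the cores are numbered consecutively from 0 and split evenly between the GPUs, which may not match the actual socket layout on your nodes.

```python
def gres_gpu_lines(node_names, num_cores, gpu_devices):
    """Build gres.conf lines that split cores evenly among GPU device files.

    Hypothetical helper for illustration; assumes consecutive core numbering.
    """
    if num_cores % len(gpu_devices) != 0:
        raise ValueError("cores do not divide evenly among GPUs")
    per_gpu = num_cores // len(gpu_devices)
    lines = []
    for i, dev in enumerate(gpu_devices):
        first = i * per_gpu
        last = first + per_gpu - 1
        # Write a single core as "N", a range as "N-M"
        cpus = str(first) if first == last else f"{first}-{last}"
        lines.append(f"NodeName={node_names} Name=gpu File={dev} CPUs={cpus}")
    return lines


lines = gres_gpu_lines("node001,node002", 12, ["/dev/nvidia0", "/dev/nvidia1"])
print("\n".join(lines))
```

The printed lines would still need to be checked against the real core-to-socket mapping before going into gres.conf.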
Cheers,
Barbara
On 03/01/2016 03:05 PM, Felix Willenborg wrote:
Hey folks,
I'm kind of new to SLURM and we're setting it up in our work group
with our nodes. Each node in our cluster has 2 GPUs and 12 CPU cores.
The GPUs are configured with gres like this:
Name=gpu_mem Count=6143
Name=gpu File=/dev/nvidia0
Name=gpu File=/dev/nvidia1
#Name=bandwidth count=4G
(Somehow the bandwidth plugin isn't available in the repository version
of SLURM and I'm getting error messages with that. That's why it's
commented out. Is it even necessary?)
The nodes are defined like that in the slurm.conf:
[...]
NodeName=node01 NodeAddr=<...> CPUs=12 RealMemory=128740 Sockets=2
CoresPerSocket=6 ThreadsPerCore=1 State=UNKNOWN
Gres=gpu:3,gpu_mem:12287#,bandwidth:4G
We'd like one CPU to always be kept available for each GPU, so that it
can only be allocated together with that GPU. We often had the
situation that reservations were made where all CPUs were allocated
and we couldn't use the GPUs anymore. I searched on the internet and
didn't find any similar cases which could help me. The only thing I
found was adding "CPUS=0,1" at the end of every Name=gpu ... line in
gres.conf. Would this already do it? And if not, what can I do? I've
got the feeling that I could solve my problem with SLURM in many ways.
We're using SLURM version 14.11.8.
Looking forward to some answers!
Best wishes,
Felix Willenborg