Felix,

Although I haven't run into a use case like yours (yet), my first thought is
the partition parameter "MaxCPUsPerNode" in slurm.conf:

'Maximum number of CPUs on any node available to all jobs from this
partition. This can be especially useful to schedule GPUs. For
example a node can be associated with two Slurm partitions (e.g.
"cpu" and "gpu") and the partition/queue "cpu" could be limited to
only a subset of the node's CPUs, insuring that one or more CPUs would
be available to jobs in the "gpu" partition/queue.'
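
As a rough, untested sketch for your 12-core, 2-GPU nodes, the partition
lines in slurm.conf might look something like this. The partition names
"cpu" and "gpu" and the cap of 10 are only my assumptions for illustration;
the rest of each line would follow your existing setup:

```
# slurm.conf (illustrative fragment; names and values are assumptions)
# CPU-only jobs may use at most 10 of the 12 cores on a node ...
PartitionName=cpu Nodes=node01 MaxCPUsPerNode=10 Default=YES State=UP
# ... so 2 cores stay free for jobs submitted to the gpu partition.
PartitionName=gpu Nodes=node01 State=UP
```

GPU jobs would then be submitted to the "gpu" partition (e.g. with
--gres=gpu:1) and everything else to "cpu".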

HTH,
John DeSantis



2016-03-01 9:05 GMT-05:00 Felix Willenborg <[email protected]>:
> Hey folks,
>
> I'm kind of new to SLURM and we're setting it up in our work group with our
> nodes. Each node in our cluster has 2 GPUs and 12 CPU cores.
>
> The GPUs are configured with gres like this :
> Name=gpu_mem Count=6143
> Name=gpu File=/dev/nvidia0
> Name=gpu File=/dev/nvidia1
> #Name=bandwidth count=4G
> (Somehow the bandwidth plugin isn't available in the repository version of
> slurm and I'm getting error messages with that. That's why it's commented
> out. Is it even necessary?)
>
> The nodes are defined like this in slurm.conf:
> [...]
> NodeName=node01 NodeAddr=<...> CPUs=12 RealMemory=128740 Sockets=2
> CoresPerSocket=6 ThreadsPerCore=1 State=UNKNOWN
> Gres=gpu:3,gpu_mem:12287#,bandwidth:4G
>
>
> We'd like to have a situation where one CPU is always available for one GPU
> and can only be allocated with one GPU, because we often had the situation that
> reservations were made where all CPUs were allocated and we couldn't use the
> GPUs anymore. I searched on the internet and didn't find any similar cases
> which could help me. The only thing I found was adding "CPUS=0,1" at the end
> of every Name=gpu ... in gres.conf. Would this already do it? And if not,
> what can I do? I've got the feeling that I could solve my problem with SLURM
> in many ways. We're using SLURM version 14.11.8.
>
> Looking forward to some answers!
>
> Best wishes,
> Felix Willenborg
