We do exactly that: for that reason we treat the CPUs, rather than the
GPUs, as the consumable resource, and we also limit memory use as needed.
The thread linked below records the configuration issues we ran into
and how we solved them.

https://groups.google.com/forum/#!topic/slurm-devel/x6VaKfrdH5Y
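
For illustration only, here is a minimal sketch of how CPUs and memory
can be made the consumable resources, combined with the MaxCPUsPerNode
partition limit that John quotes below. It is not copied from our
cluster. The partition names "cpu" and "gpu" come from the documentation
excerpt, the node values come from Felix's message (with Gres=gpu:2
assumed for the two GPUs), and the limit of 10 is an assumption (12
cores minus one per GPU). Please check the option names against the
slurm.conf man page for your Slurm version.

```
# slurm.conf fragment: illustrative sketch, values are assumptions
# Schedule individual CPUs and memory instead of whole nodes.
SelectType=select/cons_res
SelectTypeParameters=CR_CPU_Memory

# Generic resources must be enabled for the GPUs.
GresTypes=gpu

# Node as described in Felix's message: 12 cores, 2 GPUs (assumed gpu:2).
NodeName=node01 CPUs=12 Sockets=2 CoresPerSocket=6 ThreadsPerCore=1 RealMemory=128740 Gres=gpu:2 State=UNKNOWN

# Two partitions over the same node. Jobs in "cpu" may use at most
# 10 of the 12 CPUs, so 2 CPUs stay free for jobs in "gpu".
PartitionName=cpu Nodes=node01 MaxCPUsPerNode=10 Default=YES
PartitionName=gpu Nodes=node01
```

With something like this, jobs submitted to "cpu" could occupy at most
10 CPUs on node01, so two CPUs would remain for jobs in "gpu" that
request a GPU, for example with --gres=gpu:1.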


On Tue, Mar 1, 2016 at 1:27 PM, John Desantis <[email protected]> wrote:
>
> Felix,
>
> Although I haven't run into a use-case like yours (yet), my initial
> thought was to use the flag "MaxCPUsPerNode" in your configuration:
>
> 'Maximum number of CPUs on any node available to all jobs from this
> partition.  This can be especially useful to schedule GPUs. For
> example a node can be associated with two Slurm partitions (e.g.
> "cpu" and "gpu") and the partition/queue "cpu" could be limited to
> only a subset of the node’s CPUs, insuring that one or more CPUs would
> be available to jobs in the "gpu" partition/queue.'
>
> HTH,
> John DeSantis
>
>
>
> 2016-03-01 9:05 GMT-05:00 Felix Willenborg <[email protected]>:
>> Hey folks,
>>
>> I'm kind of new to SLURM and we're setting it up in our work group with our
>> nodes. Each node in our cluster has 2 GPUs and 12 CPU cores.
>>
>> The GPUs are configured in gres.conf like this:
>> Name=gpu_mem Count=6143
>> Name=gpu File=/dev/nvidia0
>> Name=gpu File=/dev/nvidia1
>> #Name=bandwidth count=4G
>> (Somehow the bandwidth plugin isn't available in the repository version of
>> Slurm, and I'm getting error messages with it. That's why it's commented
>> out. Is it even necessary?)
>>
>> The nodes are defined like this in slurm.conf:
>> [...]
>> NodeName=node01 NodeAddr=<...> CPUs=12 RealMemory=128740 Sockets=2
>> CoresPerSocket=6 ThreadsPerCore=1 State=UNKNOWN
>> Gres=gpu:3,gpu_mem:12287#,bandwidth:4G
>>
>>
>> We'd like one CPU to always be available for each GPU, and to be allocatable
>> only together with a GPU, because we often had the situation that
>> reservations were made where all CPUs were allocated and we couldn't use the
>> GPUs anymore. I searched the internet and didn't find any similar cases
>> that could help me. The only thing I found was adding "CPUS=0,1" at the end
>> of every Name=gpu ... in gres.conf. Would this already do it? And if not,
>> what can I do? I've got the feeling that I could solve my problem with SLURM
>> in many ways. We're using SLURM version 14.11.8.
>>
>> Looking forward to some answers!
>>
>> Best wishes,
>> Felix Willenborg



-- 
:-) Lachele
Lachele Foley
CCRC/UGA
Athens, GA USA
