Hey everyone,

Sorry for digging up this old post, but I searched intensively and did not
find a solution. My suggestion would therefore be to add a partition flag
such as "MinCPUsPerGPUs". Would that be useful? Or does someone have a good
idea for solving my problem another way?

Best,
Felix Willenborg

On 07.03.2016 at 12:55, Felix Willenborg wrote:
> Dear Lachele,
>
> Your suggestion is great! It would work if we had a completely
> homogeneous cluster, which is unfortunately not the case :(. All nodes
> have at least two graphics cards; one has 3, another has 4. One node
> also has a CPU with 16 cores. A single MaxCPUsPerNode value for a
> partition in slurm.conf would leave some of that hardware permanently
> unused. That would be very sad.
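>
> To illustrate the problem, here is a rough sketch with made-up numbers. The
> node names, the assignment of GPUs and cores to nodes, and the limit of 10
> are assumptions for illustration only, not our real configuration:
>
> ```
> # Hypothetical layout: node01-02 have 12 cores / 2 GPUs,
> # node03 has 12 cores / 4 GPUs, node04 has 16 cores / 2 GPUs.
> # One MaxCPUsPerNode value has to fit every node in the partition:
> #   MaxCPUsPerNode=10 -> node03 keeps only 2 cores free for 4 GPUs,
> #                        node04 keeps 6 cores free for 2 GPUs
> #                        (4 more than one core per GPU would need).
> PartitionName=cpu Nodes=node[01-04] MaxCPUsPerNode=10
> PartitionName=gpu Nodes=node[01-04]
> ```
>
> Whatever single value is chosen, it is either too high for the node with
> the most GPUs or too low for the node with the most cores.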
>
> Best,
> Felix
>
> On 01.03.2016 23:29, Lachele Foley wrote:
>> We do exactly that.  We use the CPUs as the consumable resource rather
>> than the GPUs for that reason.  We also limit memory use as needed.
>> You might want to see the configuration issues we ran into and solved
>> as recorded in the thread at the link below.
>>
>> https://groups.google.com/forum/#!topic/slurm-devel/x6VaKfrdH5Y
>>
>>
>> On Tue, Mar 1, 2016 at 1:27 PM, John Desantis <[email protected]> wrote:
>>> Felix,
>>>
>>> Although I haven't run into a use-case like yours (yet), my initial
>>> thought was to use the flag "MaxCPUsPerNode" in your configuration:
>>>
>>> 'Maximum number of CPUs on any node available to all jobs from this
>>> partition.  This can be especially useful to schedule GPUs. For
>>> example a node can be associated with two Slurm partitions (e.g.
>>> "cpu" and "gpu") and the partition/queue "cpu" could be limited to
>>> only a subset of the node’s CPUs, insuring that one or more CPUs would
>>> be available to jobs in the "gpu" partition/queue.'
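>>>
>>> As a minimal sketch of what that description could look like in
>>> slurm.conf for nodes with 12 cores and 2 GPUs (the node range, the
>>> partition names and the limit of 10 are assumptions for illustration,
>>> not a tested configuration):
>>>
>>> ```
>>> # Both partitions span the same nodes; names and node range are hypothetical.
>>> # Jobs in "cpu" may use at most 10 of the 12 cores on each node, so 2 cores
>>> # per node remain available to jobs submitted to "gpu".
>>> PartitionName=cpu Nodes=node[01-04] MaxCPUsPerNode=10 Default=YES
>>> PartitionName=gpu Nodes=node[01-04]
>>> ```
>>>
>>> CPU-only jobs would then be submitted to "cpu" and jobs that need a GPU
>>> to "gpu". Note that nothing in this sketch forces jobs in "gpu" to
>>> actually request a GPU.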
>>>
>>> HTH,
>>> John DeSantis
>>>
>>>
>>>
>>> 2016-03-01 9:05 GMT-05:00 Felix Willenborg <[email protected]>:
>>>> Hey folks,
>>>>
>>>> I'm fairly new to SLURM, and we're setting it up on the nodes in our work
>>>> group. Each node in our cluster has 2 GPUs and 12 CPU cores.
>>>>
>>>> The GPUs are configured in gres.conf like this:
>>>> Name=gpu_mem Count=6143
>>>> Name=gpu File=/dev/nvidia0
>>>> Name=gpu File=/dev/nvidia1
>>>> #Name=bandwidth count=4G
>>>> (Somehow the bandwidth plugin isn't available in the slurm package from
>>>> the repository, and I get error messages with it. That's why it's
>>>> commented out. Is it even necessary?)
>>>>
>>>> The nodes are defined like this in slurm.conf:
>>>> [...]
>>>> NodeName=node01 NodeAddr=<...> CPUs=12 RealMemory=128740 Sockets=2
>>>> CoresPerSocket=6 ThreadsPerCore=1 State=UNKNOWN
>>>> Gres=gpu:3,gpu_mem:12287#,bandwidth:4G
>>>>
>>>>
>>>> We'd like one CPU to always be available for each GPU, and to be
>>>> allocatable only together with that GPU, because we often had
>>>> reservations in which all CPUs were allocated and we couldn't use the
>>>> GPUs anymore. I searched the internet and didn't find any similar cases
>>>> that could help me. The only thing I found was adding "CPUS=0,1" at the
>>>> end of every Name=gpu ... line in gres.conf. Would this already do it?
>>>> And if not, what can I do? I have the feeling that I could solve my
>>>> problem with SLURM in many ways. We're using SLURM version 14.11.8.
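>>>>
>>>> For reference, the gres.conf variant mentioned above might look like the
>>>> sketch below. The core ranges are an assumption (one GPU attached to each
>>>> 6-core socket) and would have to match the real topology. As far as the
>>>> gres.conf documentation describes it, CPUs= only states which cores may
>>>> be allocated together with each GPU; it does not by itself keep those
>>>> cores free from CPU-only jobs.
>>>>
>>>> ```
>>>> # Hypothetical core-to-GPU mapping for a 2-socket, 12-core node.
>>>> Name=gpu File=/dev/nvidia0 CPUs=0-5
>>>> Name=gpu File=/dev/nvidia1 CPUs=6-11
>>>> ```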
>>>>
>>>> Looking forward to some answers!
>>>>
>>>> Best wishes,
>>>> Felix Willenborg
>>
