We aren't doing that.  I agree we probably should.  If you work out a
config, and don't mind doing so, please share it.


On Thu, Mar 3, 2016 at 3:11 AM, Daniel Letai <[email protected]> wrote:
> Correct me if I'm wrong, but I don't see any NUMA-based reservation of the
> CPUs. Do you ensure that each reserved CPU is from a different socket, and
> that GPU jobs have affinity to the correct NUMA node?
>
>
> On 03/02/2016 12:30 AM, Lachele Foley wrote:
>
> We do exactly that.  We use the CPUs as the consumable resource rather
> than the GPUs for that reason.  We also limit memory use as needed.
> You might want to see the configuration issues we ran into and solved
> as recorded in the thread at the link below.
>
> https://groups.google.com/forum/#!topic/slurm-devel/x6VaKfrdH5Y
>
>
> On Tue, Mar 1, 2016 at 1:27 PM, John Desantis <[email protected]> wrote:
>
> Felix,
>
> Although I haven't run into a use-case like yours (yet), my initial
> thought was to use the flag "MaxCPUsPerNode" in your configuration:
>
> 'Maximum number of CPUs on any node available to all jobs from this
> partition.  This can be especially useful to schedule GPUs. For
> example  a  node can  be  associated  with  two Slurm partitions (e.g.
> "cpu" and "gpu") and the partition/queue "cpu" could be limited to
> only a subset of the node’s CPUs, insuring that one or more CPUs would
> be available to jobs in the "gpu" partition/queue.'
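>
> As a rough, untested sketch of that idea, assuming a 12-core node named
> node01 and the partition names "cpu" and "gpu" from the excerpt above
> (the value 10 is only an illustration that would leave 2 cores free), the
> slurm.conf entries might look something like:
>
> # hypothetical partition layout, adjust names and counts to your site
> PartitionName=cpu Nodes=node01 MaxCPUsPerNode=10 Default=YES State=UP
> PartitionName=gpu Nodes=node01 State=UP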
>
> HTH,
> John DeSantis
>
>
>
> 2016-03-01 9:05 GMT-05:00 Felix Willenborg
> <[email protected]>:
>
> Hey folks,
>
> I'm kind of new to SLURM and we're setting it up in our work group with our
> nodes. Each node in our cluster has 2 GPUs and 12 CPU cores.
>
> The GPUs are configured with gres like this :
> Name=gpu_mem Count=6143
> Name=gpu File=/dev/nvidia0
> Name=gpu File=/dev/nvidia1
> #Name=bandwidth count=4G
> (Somehow the bandwidth plugin isn't available in the repository version of
> Slurm, and I'm getting error messages with it. That's why it's commented
> out. Is it even necessary?)
>
> The nodes are defined like this in slurm.conf:
> [...]
> NodeName=node01 NodeAddr=<...> CPUs=12 RealMemory=128740 Sockets=2
> CoresPerSocket=6 ThreadsPerCore=1 State=UNKNOWN
> Gres=gpu:3,gpu_mem:12287#,bandwidth:4G
>
>
> We'd like to have a situation where one CPU is always available for each GPU
> and can only be allocated together with a GPU, because we often had the
> situation that reservations were made where all CPUs were allocated and we
> couldn't use the GPUs anymore. I searched on the internet and didn't find any
> similar cases which could help me. The only thing I found was adding
> "CPUS=0,1" at the end of every Name=gpu ... in gres.conf. Would this already
> do it? And if not, what can I do? I've got the feeling that I could solve my
> problem with SLURM in many ways. We're using SLURM version 14.11.8.
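>
> For illustration, such gres.conf lines might look roughly like the sketch
> below. The core ranges are an assumption (cores 0-5 on the socket nearest
> /dev/nvidia0, cores 6-11 nearest /dev/nvidia1) and would need checking
> against the real topology:
>
> # hypothetical core binding, verify the GPU-to-socket mapping first
> Name=gpu File=/dev/nvidia0 CPUs=0-5
> Name=gpu File=/dev/nvidia1 CPUs=6-11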
>
> Looking forward to some answers!
>
> Best wishes,
> Felix Willenborg
>
>
>



-- 
:-) Lachele
Lachele Foley
CCRC/UGA
Athens, GA USA
