Basically, set up socket-based scheduling (allocating sockets instead of
cores) and write a gres configuration for the GPUs with two lines: one
with CPUs=0-9, the other with CPUs=10-19.
See http://slurm.schedmd.com/gres.html
and http://slurm.schedmd.com/slurm.conf.html (search for CR_Socket).
I'm not sure whether, with CR_Socket, the CPUs in gres.conf are still
cores/threads (as the documentation implies) or sockets (as common
sense would suggest).
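A minimal sketch of what the two files could look like, under stated
assumptions: the node name range node[01-10], ThreadsPerCore=1 and the
device paths /dev/nvidia0 and /dev/nvidia1 are placeholders not taken
from the original question, and the CPUs= ranges follow the
documentation's reading (core indices), which is exactly the point I am
unsure about above.

```
# slurm.conf (fragment)
# Allocate whole sockets rather than individual cores.
SelectType=select/cons_res
SelectTypeParameters=CR_Socket
GresTypes=gpu
# node[01-10] is a hypothetical name range; ThreadsPerCore=1 is an assumption.
NodeName=node[01-10] Sockets=2 CoresPerSocket=10 ThreadsPerCore=1 Gres=gpu:2

# gres.conf (fragment, on each node)
# /dev/nvidia0 and /dev/nvidia1 are assumed device paths.
# One line per GPU, each tied to the cores of one socket.
Name=gpu File=/dev/nvidia0 CPUs=0-9
Name=gpu File=/dev/nvidia1 CPUs=10-19
```

This also assumes that cores 0-9 sit on the first socket and that the
first GPU is attached to that socket; check the real topology (for
example with lstopo from hwloc) before copying the ranges.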
On 09/14/2015 07:38 PM, David McGiven wrote:
Dividing a 2cpu+2gpu machine into two independent blocks of 1cpu+1gpu
each.
Dear SLURM users,
We recently bought some machines with 2 Intel Xeon processors (10 cores
each) and 2 GPUs each.
For 90% of our cluster use, our jobs run well on up to 10 cores + 1
GPU, and for optimal performance all the cores requested must be
"pinned" to the same physical cpu.
Currently we are using a TORQUE+MAUI combination but I’m not sure it
has the features we need.
I would like to know if one could deploy a setup in SLURM like this:
Basically, I would like to divide each machine into two blocks of 1 cpu
(10 cores) and 1 GPU. So the user can ask SLURM for 1 or 2 blocks,
each block consisting of 10 cores and 1 GPU. If the user requests only
1 block, under no circumstances can the job's threads be spread across
the two physical cpus or gpus.
For simplicity, there's no need for spreading jobs across nodes with
MPI or the like. All the jobs run locally on each server.
So a cluster of 10 of these machines will have 20 usable "blocks",
therefore 20 jobs maximum running simultaneously in the whole cluster.
When issuing a job, users would request up to 2 "blocks" and up to 10
cores and 1 gpu for each block.
I don't know if I'm overcomplicating this but this should be the ideal
scenario, or at least something very similar. I would prefer not to
use cgroups for this since it can complicate the setup. Ideally it
would be done only with SLURM.
Three examples of what the user would ask the SLURM server:
- I need 1 block, and inside this block, 8 cores and 1 gpu.
The 2nd block of the node will remain free and totally independent
from the 1st one. SLURM would report 1 block free with 10 cores and 1
gpu (although there are 12 cores free in the machine).
Practically, it isn't important if it reports 12 cores free as long as
the user can effectively run only on the 10 cores of the 2nd cpu since
there’s only 1 block free.
- I need 1 block, and inside this block, 10 cores and no gpu
Same as before, the 2nd block will remain free and totally independent
from the 1st one, and new jobs could use only the 2nd cpu (10 cores)
and only the 2nd GPU.
- I need 2 blocks, and inside these blocks, 14 cores and 2 gpu.
The jobs will have access to the 2 cpus+2gpus. In this case the
machine won't accept new jobs because both blocks are in use. It may or
may not list the 6 remaining cores as free; this is irrelevant, since
with no free "blocks" the slurm node won't accept more jobs.
Any suggestions or advice would be really appreciated.
Best regards,
D