[slurm-dev] Re: GPU node allocation policy

Ryan Cox Tue, 07 Apr 2015 07:40:32 -0700

You can do something like this: JobSubmitPlugins=all_partitions,lua. Have a special empty partition, as you suggest. Use the submit plugin to detect if the empty partition is in there. If it is in the job's list of partitions, you know that the user didn't specify a particular partition. If it is not in the list, you know that the user requested a particular partition (or set of partitions). You can then do all sorts of fun logic.

Does all the GPU code in question need only one CPU core? Some of our users have code that can use multiple CPUs and multiple GPUs simultaneously (LAMMPS? NAMD? I'd have to check...). It might be limiting to restrict users to a certain number of cores. If you're scheduling memory, it's also important to make sure that there is some memory available for the GPU jobs.

What we do is use QOSs to control access to our GPU partition with AllowQos. We use a job submit plugin to place jobs with the appropriate GRES into the gpu QOS, which is allowed into that partition. We also allow jobs in a preemptable QOS into the partition, with the gpu QOS able to preempt jobs in the preemptable QOS. We could also do a shorter walltime QOS or something with a lower priority but haven't done so; GPU jobs could get on there quickly even if all-CPU jobs are on there. They could also have the job submit plugin add the gpu partition into their list of partitions if the job meets certain criteria even if not requesting GPUs (short walltime or something else). Just some thoughts.
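
A minimal slurm.conf sketch of the QOS-gated GPU partition described above. The QOS names (gpu, preemptable), the node range, and the PreemptMode value are assumptions for illustration, not the configuration actually used at BYU.

```
# Assumed names: QOSs "gpu" and "preemptable", nodes gpu[01-30].
# Preemption is decided by QOS rather than by partition priority.
PreemptType=preempt/qos
# REQUEUE is one possible mode; CANCEL is another.
PreemptMode=REQUEUE
# Only jobs running under one of the listed QOSs may use this partition.
PartitionName=gpu Nodes=gpu[01-30] AllowQos=gpu,preemptable
```

Which QOS may preempt which is not set in slurm.conf; it is stored in the accounting database as the Preempt attribute of the gpu QOS. The job submit plugin is what moves GRES-requesting jobs into that QOS.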


Ryan

On 04/07/2015 07:47 AM, Aaron Knister wrote:

Ah, I was wondering about that. You could try this:

Rename standard partition to cpu1
Create a partition called standard with no nodes
Use the lua submit plugin to rewrite the partition list from standard to 
cpu1,cpufromgpunode

I *think* that will work. I'm not sure about the empty partition piece and 
whether that will deny your submission before the submit filter kicks in, but 
my gut says no.
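
The partition layout behind those three steps might look roughly like the slurm.conf fragment below. The node ranges and the MaxCPUsPerNode value are assumptions; the partition names are the ones used in this thread. The blank node list follows the slurm.conf man page's description of a partition that exists but has no resources ("Nodes= "); whether a submission to it survives until the plugin runs is the open question raised above. The job_submit.lua script that rewrites the partition list is not shown.

```
# Hypothetical node ranges; partition names are taken from the thread.
JobSubmitPlugins=lua
# Formerly "standard": the CPU-only nodes.
PartitionName=cpu1 Nodes=cpu[001-170]
# CPU-only access to the GPU nodes, capped (assumed value) to leave cores for GPU jobs.
PartitionName=cpufromgpunode Nodes=gpu[01-30] MaxCPUsPerNode=20
PartitionName=gpu Nodes=gpu[01-30]
# Placeholder with no nodes; the plugin would rewrite it to cpu1,cpufromgpunode.
PartitionName=standard Default=YES Nodes=
```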

Sent from my iPhone

On Apr 7, 2015, at 9:18 AM, Schmidtmann, Carl <[email protected]> 
wrote:

That only works if ALL the nodes have GPUs. We have 200+ nodes and 30 of them 
have GPUs. So we have to create three partitions - standard, gpu and 
cpufromgpunode. People in the standard partition can’t use the CPUs on the GPU 
nodes. People that submit to cpufromgpunode can’t use the CPUs in the 
standard partition. We would like to see a way to specify 
MaxCPUsPerJobOnThisNode so the standard partition can use 24 cores on nodes 
without a GPU and fewer on nodes with a GPU. Or a way to specify 
ReserveCPUForGPU on the node or some such thing. I assume this is difficult 
because people have asked for it but it hasn’t been implemented.

Carl

Carl Schmidtmann
Center for Integrated Research Computing
University of Rochester

On Apr 7, 2015, at 4:51 AM, Aaron Knister <[email protected]> wrote:

Would MaxCPUsPerNode set at the partition level help?

Here's the snippet from the man page:

MaxCPUsPerNode
Maximum number of CPUs on any node available to all jobs from this partition. This can be especially useful to schedule 
GPUs. For example a node can be associated with two Slurm partitions (e.g. "cpu" and "gpu") and the 
partition/queue "cpu" could be limited to only a subset of the node's CPUs, insuring that one or more CPUs 
would be available to jobs in the "gpu" partition/queue.
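
As a rough illustration of the man page example, the fragment below puts the same GPU nodes in two partitions and caps the CPU-only one. The node names and range are hypothetical; the core and GPU counts follow the original question in this thread (24 cores, 4 GPUs per node). A matching gres.conf would also be needed and is not shown.

```
# Assumed layout: 24-core nodes with 4 GPUs each.
GresTypes=gpu
NodeName=gpu[01-30] CPUs=24 Gres=gpu:tesla:4
# CPU-only jobs may use at most 20 cores on each node of this partition.
PartitionName=cpu Nodes=gpu[01-30] MaxCPUsPerNode=20
# GPU jobs see the full node, so at least 4 cores stay available to them.
PartitionName=gpu Nodes=gpu[01-30]
```

The limit applies to every node in the partition, so this fits best when all of that partition's nodes are GPU nodes.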

Sent from my iPhone

On Apr 6, 2015, at 11:25 PM, Novosielski, Ryan <[email protected]> wrote:

I imagine part of the reason is to keep people from running CPU jobs that 
would take more than 20 cores on the GPU machine, as others do not have GPUs. 
I'd be interested in knowing strategies here too.

____ *Note: UMDNJ is now Rutgers-Biomedical and Health Sciences*
|| \\UTGERS      |---------------------*O*---------------------
||_// Biomedical | Ryan Novosielski - Senior Technologist
|| \\ and Health | [email protected] 973/972.0922 (2x0922)
||  \\  Sciences | OIRT/High Perf & Res Comp - MSB C630, Newark
    `'

On Apr 6, 2015, at 20:17, Ryan Cox <[email protected]> wrote:


Chris,

Just have GPU users request the numbers of CPU cores that they need and
don't lie to Slurm about the number of cores.  If a GPU user needs 4
cores and 4 GPUs, have them request that.  That leaves 20 cores for
others to use.

Ryan

On 04/06/2015 03:43 PM, Christopher B Coffey wrote:
Hello,

I’m curious how you handle the allocation of GPUs and cores on GPU
systems in your cluster.  My new GPU system is 24-core, with 2 Tesla K80s
(4 GPUs total).  We allocate cores/mem by:

SelectType=select/cons_res
SelectTypeParameters=CR_Core_Memory


What I’m thinking of doing is lying to Slurm about the true cores, and
specifying CPUs=20, along with Gres=gpu:tesla:4.  Is this a reasonable
solution in order to ensure there is a core reserved for each gpu in the
system?  My thought is to allocate the 20 cores on the system to non-GPU
type work instead of leaving them idle.

Thanks!

Chris


--
Ryan Cox
Operations Director
Fulton Supercomputing Lab
Brigham Young University
