You are correct that the fair-share calculation is currently based only on CPU allocations. We plan to put in place a framework that will permit charging for additional resources, probably available late in 2014 with the next release. Charges would be based upon configurable weight factors applied to multiple resources: CPUs, memory, power consumption, licenses, and generic resources (GRES, e.g. GPUs).
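Until something like that lands, the general shape of a weighted multi-resource charge can be sketched as below. The weight values and resource names here are purely illustrative assumptions of mine, not actual Slurm parameters:

```python
# Hypothetical sketch of a weighted multi-resource charge.
# The weights and resource names are illustrative only, not Slurm config.
WEIGHTS = {"cpu": 1.0, "mem_gb": 0.25, "gres_gpu": 2.0}

def job_charge(usage, weights=WEIGHTS):
    """Charge = sum over resources of weight * (allocation * wallclock hours).

    `usage` maps a resource name to its allocation-hours for the job.
    Resources with no configured weight contribute nothing.
    """
    return sum(weights.get(r, 0.0) * amount for r, amount in usage.items())

# A 4-CPU, 16 GB, 1-GPU job running for 10 hours:
usage = {"cpu": 4 * 10, "mem_gb": 16 * 10, "gres_gpu": 1 * 10}
print(job_charge(usage))  # 4*10*1.0 + 16*10*0.25 + 1*10*2.0 = 100.0
```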

Moe Jette
SchedMD

Quoting Bill Wichser <[email protected]>:

Moe helped offline in understanding some of the ways this could work.

By defining DefMemPerCPU= in slurm.conf, users who submit jobs would get this value assigned by default. By also adding MaxMemPerCPU=, when a user requests more memory than a core's associated amount, the allocated core count would be increased accordingly to cover the requested memory.
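As a sketch of that core-bumping arithmetic (my own illustration of the behavior described above, not Slurm source code):

```python
import math

def effective_cores(requested_cores, requested_mem_mb, max_mem_per_cpu_mb):
    """Cores actually allocated when the memory request exceeds the
    per-core cap, mirroring the MaxMemPerCPU behavior described above."""
    cores_for_mem = math.ceil(requested_mem_mb / max_mem_per_cpu_mb)
    return max(requested_cores, cores_for_mem)

# 1 core requested with 8000 MB on a 4000 MB/core node -> 2 cores allocated
print(effective_cores(1, 8000, 4000))  # 2
```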

Internal discussions came to the conclusion that this may not be the correct approach. In one example, suppose a user requested a single core and half the memory. Since we share nodes, other jobs requesting much less memory but requiring cores could also share this node. For a 12-core node, that would leave 11 cores still available, provided their memory use fits under the remaining 50%.

So this brings up the question, again, of exactly what constitutes a node. I can count four dimensions: cores, memory, local disk, and IB bandwidth. Memory and core counts are easily determined; the other two are not. So let's take the easy way out and consider only cores and memory.

On a machine with 20 cores, to make the math easier, I can take the above example:
job1 - 1 core, half the memory
This job uses 50% of the memory and 5% of the cores

job2 uses 19 cores and the other 50% of the memory
This job uses 50% of the memory and 95% of the cores.

Normalizing over the two values for cores and memory, I could calculate that
job1=27.5% of the machine ((50+5)/2) and
job2=72.5%
so I believe we have a solution based on two parameters which will satisfy at least a query of node utilization, which is always the question asked by the management above. I have no idea whether anything like this gets figured into the fairshare components of Slurm, but I suspect it just uses CPU time as the reference.
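The normalization above can be written as a small sketch (my own illustration, averaging the core fraction and memory fraction):

```python
def node_fraction(cores_used, total_cores, mem_used, total_mem):
    """Average of the core fraction and the memory fraction of a node,
    as in the job1/job2 example above."""
    return ((cores_used / total_cores) + (mem_used / total_mem)) / 2

# 20-core node: job1 = 1 core + half the memory, job2 = 19 cores + the rest
print(node_fraction(1, 20, 0.5, 1.0))   # (0.05 + 0.50) / 2 = 0.275
print(node_fraction(19, 20, 0.5, 1.0))  # (0.95 + 0.50) / 2 = 0.725
```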

Bill

On 03/19/2014 02:37 PM, Bill Wichser wrote:

I have a cluster with three kinds of memory per node: 4G/core, 8G/core
and 6.5G/core.

In the slurm.conf file, the DefMemPerCPU=4000 to account for the worst
case.  I define RealMemory to be the actual memory in my NodeName
definitions.

Also defined in slurm.conf are:

SelectType=select/cons_res
SelectTypeParameters=CR_Core_Memory,CR_CORE_DEFAULT_DIST_BLOCK

so we can allocate according to job script requirements.

So far, so good.

Let's suppose that a user requests a single core with a --mem=8000
requirement.  There are lots of options about which node this might be
scheduled on so my question is how do you account for this? Should I
even bother?

In the past, using Torque, we would require users to request enough
cores to cover the memory usage plus add an attribute like :mem48 to
distinguish which nodes to choose from the pool.  Naturally they would
either get this wrong, not allocate enough, or not care!  But this was
important when it came to doing system accounting as we calculated this
value strictly from core usage.

With Slurm, the consumable resources seem to work just as expected.
Using cgroups limits users to exactly what they requested, which is a
wonderful feature.  But this changes the way we will need to do
accounting, and I am looking for advice or guidance on how others
are doing it.

Thanks,
Bill
