I have a cluster with three kinds of memory per node: 4G/core, 8G/core and 6.5G/core.

In slurm.conf, I set DefMemPerCPU=4000 to account for the worst case, and in my NodeName definitions I set RealMemory to each node's actual memory.

Also defined in slurm.conf are:

SelectType=select/cons_res
SelectTypeParameters=CR_Core_Memory,CR_CORE_DEFAULT_DIST_BLOCK

so we can allocate according to job script requirements.
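
For reference, a minimal sketch of how those pieces fit together in slurm.conf (node names, core counts, and memory figures below are illustrative, not our real config):

DefMemPerCPU=4000              # worst-case default: 4000 MB per allocated core
SelectType=select/cons_res
SelectTypeParameters=CR_Core_Memory,CR_CORE_DEFAULT_DIST_BLOCK
NodeName=n[001-010] CPUs=16 RealMemory=64000    # ~4G/core nodes
NodeName=n[011-020] CPUs=16 RealMemory=128000   # ~8G/core nodes
NodeName=n[021-030] CPUs=16 RealMemory=104000   # ~6.5G/core nodes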

So far, so good.

Let's suppose that a user requests a single core with a --mem=8000 requirement. There are lots of options for which node this might be scheduled on, so my question is: how do you account for this? Should I even bother?
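
For concreteness, the kind of request I have in mind (job script name is made up):

sbatch --ntasks=1 --cpus-per-task=1 --mem=8000 myjob.sh

My understanding is that with CR_Core_Memory this allocates one core but 8000 MB of the node's memory, so on a 4G/core node it effectively ties up a second core's worth of memory that no other job can use, even though only one core shows as allocated.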

In the past, using Torque, we required users to request enough cores to cover their memory usage, plus an attribute like :mem48 to distinguish which nodes to choose from the pool. Naturally they would either get this wrong, not request enough, or not care! But this was important for system accounting, since we calculated usage strictly from core counts.

With Slurm, consumable resources seem to work just as expected. Using cgroups to limit users to exactly what they requested is a wonderful feature. But it changes the way we will need to do accounting, and I am just looking for advice or guidance on how others are doing it.
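
One option I have been looking at, in case anyone has experience with it: weight memory into the billing TRES so that a memory-heavy job is charged as if it had used the equivalent number of cores. Something like the following partition line in slurm.conf, where a weight of 0.25 per GB matches our 4G/core worst case (partition and node names are illustrative):

PartitionName=compute Nodes=n[001-030] TRESBillingWeights="CPU=1.0,Mem=0.25G"

If I understand it correctly, a single-core --mem=8000 job would then bill at roughly the same rate as a two-core job, and the billed value per job should be visible afterwards with something like (job ID made up):

sacct -j 12345 --format=JobID,AllocTRES%50,Elapsed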

Thanks,
Bill
