Hi,
Is it possible to configure Slurm/cgroups so that jobs using more memory than they requested are not killed as long as free memory is still available on the compute node? When free memory gets low, these jobs could be killed as usual.

Today, a job that exceeds its limit is killed immediately. Since our applications only require their maximum memory for a short period of time, we often cannot run as many concurrent jobs as we would like.

Maybe I can rephrase the question a bit: how can you configure memory limits for a job that needs its maximum memory only for a short time? Example: Job1 needs 80G of RAM, but only during 15% of its execution time; during the remaining 85% it needs only 30G.
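For illustration, this is roughly what we submit today (a hypothetical batch script; the core count and binary name are placeholders, not our real values). The point is that the peak 80G has to be reserved for the whole run:

    #!/bin/bash
    # The peak memory must be requested for the entire job lifetime,
    # even though ~85% of the runtime only needs ~30G.
    #SBATCH --ntasks=1
    #SBATCH --cpus-per-task=4
    #SBATCH --mem=80G
    srun ./our_application   # placeholder for the real binary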

I guess the obvious thing would be to use "CR_Core" instead of the "CR_Core_Memory" we use today. But we have to constrain memory in some way, because the nodes also run daemons for the distributed file system, and those must not be affected by running jobs.
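To make that concrete, here is roughly the relevant part of our current setup (hypothetical fragments; the node names, CPU counts, and memory sizes are made up, not our exact values):

    # slurm.conf (current: memory is a hard, scheduled resource)
    SelectType=select/cons_res
    SelectTypeParameters=CR_Core_Memory
    TaskPlugin=task/cgroup
    # RealMemory is set below the physical RAM so the file system
    # daemons keep some headroom
    NodeName=node[01-10] CPUs=32 RealMemory=180000

    # cgroup.conf
    ConstrainRAMSpace=yes
    ConstrainSwapSpace=yes

What I would like is closer to a soft limit: let a job's cgroup grow past its request while the node has free RAM, and only enforce the limit when node memory gets tight.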

Any ideas?

Regards,
Alexander

