Hi all,
The last paragraph of http://slurm.schedmd.com/cons_res_share.html states that
enforcement of memory allocation limits needs to be done by setting
appropriate system limits (I assume by using "ulimit").
I am unsure how this can be implemented. If I call "ulimit -d
$((SLURM_MEM_PER_CPU * SLURM_NTASKS_PER_NODE * 1024))" in the
PrologSlurmctld script, will that limit still be active when the user's job
executes? Will the limit be reset after the job has ended?
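To make the arithmetic concrete, here is a sketch of the value I would pass to
ulimit. The numbers are made up for illustration; I am assuming that
SLURM_MEM_PER_CPU is given in megabytes and that bash's "ulimit -d" expects
kilobytes, hence the factor of 1024.

```shell
#!/bin/bash
# Hypothetical example values; in a real job Slurm sets these itself.
SLURM_MEM_PER_CPU=2048      # assumed to be MB per allocated CPU
SLURM_NTASKS_PER_NODE=4     # tasks on this node

# Convert MB to KB, the unit bash's "ulimit -d" works in.
limit_kb=$((SLURM_MEM_PER_CPU * SLURM_NTASKS_PER_NODE * 1024))

# Only print the command here; actually applying it is the open question.
echo "ulimit -d ${limit_kb}"
```

With these values the limit would be 8 GB for the data segment of each process
that inherits it.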
I also don't understand how TaskPlugin=task/cgroup would limit the amount of
RAM used *per job*. The parameters in cgroup.conf seem to be global limits
(AllowedRAMSpace, ConstrainRAMSpace, MaxRAMPercent).
This is all in a setup with shared resources (Shared=YES, ExclusiveUser=YES).
Could someone help me understand how I can enforce memory allocation limits
at the job level on a shared node?
Regards,
Uwe