Gareth,
Maybe I'm missing something or your configuration is different, but
/dev/shm is also controlled by cgroups. If you are using cgroups and
requiring users to request memory, the GRES setup shouldn't be necessary
since /dev/shm is accounted for like normal memory (though it's in a
different memory.stat field from rss).
For example:
$ srun -n 1 --mem=100M dd if=/dev/zero of=/dev/shm/DELETEME bs=1M count=500
slurmstepd: Exceeded step memory limit at some point. oom-killer likely
killed a process.
srun: error: m6-5-7: task 0: Killed
srun: Force Terminated job step 5401199.0
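The same accounting can be seen outside Slurm: a file written to /dev/shm is ordinary tmpfs-backed memory, and under a memory cgroup it is charged to the writer's cgroup (in the "shmem" field of memory.stat rather than "rss"). A minimal sketch on any Linux box; the filename shm_demo is hypothetical:

```shell
# Write 5 MiB into /dev/shm (tmpfs) and confirm the file size.
# Under a memory cgroup, this allocation is charged to the writing
# process's cgroup, in the "shmem" field of memory.stat, not "rss".
dd if=/dev/zero of=/dev/shm/shm_demo bs=1M count=5 2>/dev/null
size=$(stat -c %s /dev/shm/shm_demo)   # 5 MiB = 5242880 bytes
echo "$size"
rm -f /dev/shm/shm_demo
```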
Ryan
On 01/05/2015 08:38 PM, [email protected] wrote:
Hi,
We are busy configuring a Gres for counting /dev/shm space (calling it 'memdir';
we're not too worried about enforcement, just about separating jobs that request
it and need separation), and got caught out by a typo on
http://slurm.schedmd.com/gres.html where the example has GresType=gpu,bandwith
rather than GresTypes=...
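For reference, a sketch of what the corrected configuration presumably looks like, using the plural `GresTypes` keyword; the node names and counts are hypothetical, with a 'memdir' entry like the one described above:

```
# slurm.conf (sketch; names and counts are illustrative, not from the docs)
GresTypes=gpu,memdir
NodeName=node[01-04] Gres=memdir:16 ...
```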
Could you please fix the doc!
BTW, Slurm was quite ungracious about having that bad entry in slurm.conf.
Regards,
Gareth
--
Ryan Cox
Operations Director
Fulton Supercomputing Lab
Brigham Young University