Gareth,

Maybe I'm missing something or your configuration is different, but /dev/shm is also controlled by cgroups. If you are using cgroups and requiring users to request memory, the GRES setup shouldn't be necessary since /dev/shm is accounted for like normal memory (though it's in a different memory.stat field from rss).

For example:
$ srun -n 1 --mem=100M dd if=/dev/zero of=/dev/shm/DELETEME bs=1M count=500
slurmstepd: Exceeded step memory limit at some point. oom-killer likely killed a process.
srun: error: m6-5-7: task 0: Killed
srun: Force Terminated job step 5401199.0
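The reason this works: /dev/shm is a tmpfs, so file data written there lives in RAM and is charged to the job's memory cgroup (under the cache/shmem fields of memory.stat rather than rss). A minimal sketch outside of Slurm, just to illustrate the tmpfs behavior:

```shell
#!/bin/sh
# /dev/shm is tmpfs: file contents are held in RAM, so they count
# against the memory cgroup like anonymous memory (but as cache/shmem,
# not rss).
dd if=/dev/zero of=/dev/shm/DELETEME bs=1M count=50 2>/dev/null

# The file occupies 50 MiB of RAM while it exists:
stat -c %s /dev/shm/DELETEME

# Removing the file releases the memory back to the system:
rm /dev/shm/DELETEME
```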

Ryan

On 01/05/2015 08:38 PM, [email protected] wrote:
Hi,

We are busy configuring a Gres for counting /dev/shm space (calling it 'memdir'
and not being too worried about enforcement, just separating jobs that request
it and need separation) and got caught out by a typo on
http://slurm.schedmd.com/gres.html, where the example has GresType=gpu,bandwith
rather than GresTypes=...
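For reference, the correct key is the plural GresTypes. A minimal slurm.conf sketch (the memdir count and node name below are made-up illustrations, not taken from the doc):

```
# Correct: plural 'GresTypes' lists every generic resource type in use.
GresTypes=gpu,memdir
# Hypothetical node line advertising the 'memdir' resource:
NodeName=node[01-04] Gres=memdir:1
```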

Could you please fix the doc!

BTW, Slurm was quite ungracious about having that bad entry in slurm.conf.

Regards,

Gareth

--
Ryan Cox
Operations Director
Fulton Supercomputing Lab
Brigham Young University
