On 01/12/2020 19:07, Christopher Black wrote:

> We tune vm-related sysctl values on our gpfs clients.
> These are values we use for 256GB+ mem hpc nodes:
> vm.min_free_kbytes=2097152
> vm.dirty_bytes = 3435973836
> vm.dirty_background_bytes = 1717986918
> The vm.dirty parameters are to prevent NFS from buffering huge
> amounts of writes and then pushing them over the network all at once
> flooding out gpfs traffic.
> I'd also recommend checking client gpfs parameters pagepool and/or
> pagepoolMaxPhysMemPct to ensure you have a reasonable and understood
> limit for how much memory mmfsd will use.
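
For anyone going that route, a minimal sketch of how the sysctl values above might be persisted and how the client pagepool can be checked and capped (the file name, node class and the 16G figure are just placeholders, not recommendations):

    # /etc/sysctl.d/90-gpfs-client.conf  (placeholder name)
    vm.min_free_kbytes = 2097152
    vm.dirty_bytes = 3435973836
    vm.dirty_background_bytes = 1717986918

    # apply without a reboot
    sysctl --system

    # see what mmfsd is currently allowed to use
    mmlsconfig | grep -i pagepool

    # example only: cap the pagepool at 16G on a node class, effective immediately
    mmchconfig pagepool=16G -N computeNodes -i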


We take a different approach and tackle it from the other end. Basically we use Slurm to limit user processes to 4GB per core, which we find is more than enough for 99% of jobs. For people needing more there are some dedicated large memory nodes with 3TB of RAM; we have seen well over 1TB of RAM being used by a single user on occasion (usually generating large meshes). I don't think there is any limit on RAM on those nodes.
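
Purely as a sketch (illustrative Slurm settings, not our actual config), the per-core cap is just memory-as-a-consumable-resource scheduling plus cgroup enforcement:

    # slurm.conf (excerpt)
    SelectType=select/cons_tres        # cons_res on older Slurm releases
    SelectTypeParameters=CR_Core_Memory
    DefMemPerCPU=4096                  # 4GB per allocated core by default
    MaxMemPerCPU=4096                  # and no more than that per core
    TaskPlugin=task/cgroup

    # cgroup.conf
    ConstrainRAMSpace=yes
    ConstrainSwapSpace=yes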

The compute nodes are dual Xeon 6138 with 192GB of RAM, which works out at 4.8GB of RAM per core (2 x 20 cores = 40 cores; 192GB / 40 = 4.8GB). Basically the headroom stops the machines running out of RAM for *any* administrative tasks, not just GPFS.

We did originally try running it closer to the wire, but anecdotally cgroups enforcement is not perfect and it is possible for users to get a bit over their limits, so we lowered it back down to 4GB per core. That is what the tender for the machine specified anyway, but due to the number of DIMM slots and cores in the CPU we ended up with a bit more RAM per core.

We have had no memory starvation issues in the ~2 years since we went down to 4GB per core for jobs.


JAB.

--
Jonathan A. Buzzard                         Tel: +44141-5483420
HPC System Administrator, ARCHIE-WeSt.
University of Strathclyde, John Anderson Building, Glasgow. G4 0NG
