Frank Filz wrote on Thu, Oct 29, 2015 at 12:00:31PM -0700:
> > In another situation the Linux OOM killer might have already killed
> > other important processes while trying to free memory for the NFS
> > server. You wouldn't want to recover the NFS process here, since you
> > don't know whether the system as a whole is in a usable state any
> > more. You really want to reboot the entire server.
>
> Yea, if we don't constrain the maximum process memory, then rebooting
> the entire server may be best.
>
> Ideally we don't want the OOM killer to raise its head (and probably
> on an enterprise server, instead of the OOM killer, we just want the
> system to reboot).

We used to adjust ganesha's OOM score so that it'd be killed first, but
that's not exactly good practice.
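For reference, the knob itself is just a write to procfs. A minimal
sketch; the function name and the value here are illustrative, not what
ganesha actually used:

/* oom_score_adj ranges from -1000 (never kill this process) to 1000
 * (kill it first); raising it makes us the OOM killer's preferred
 * victim. */
#include <stdio.h>

int oom_prefer_me(void)
{
        FILE *f = fopen("/proc/self/oom_score_adj", "w");

        if (f == NULL)
                return -1;
        fprintf(f, "1000\n");
        return fclose(f) == EOF ? -1 : 0;
}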
> > What a system administrator wants is an NFS-Ganesha configuration
> > option to limit the amount of memory (RAM) that the process
> > (including all threads) can use at maximum.
>
> This is easy to do with ulimit -d, and I definitely agree we need to
> set such a hard limit.
>
> Jemalloc may also have a mechanism that might be worth using.

[...]

> The gsh_alloc function certainly could track memory usage. I'm not
> sure that is worthwhile. For one thing, it would need to use an atomic
> variable to track memory utilization.

We can't just rely on gsh_alloc/free accounting, because we're not the
only source of code that does allocations: we're using libs that
allocate, and we sometimes keep structures they allocated around, etc.
I really don't believe it's going to be reliable even if we try to
account for these.

ulimit, or nowadays cgroup limits, are the way to go if you're happy
with just stopping ganesha from killing the system, especially since
cgroups can actually account for the filesystem cache (e.g. VFS) and
similar.

For just "doing our best and purging stuff we don't really need right
now or could recompute" when we hit a limit, tracking the total handed
out through gsh_alloc/free isn't enough... BUT if we can track by call
site, e.g. "this function allocated this much" (rough sketch below),
then we can start turning the highest memory consumers into pools, and
those can have watermarks or whatnot to trigger different behaviors.

Either way, memory management is far from a trivial problem (on the
good side, I'm happy with straight crashes, so as long as people don't
bring too much complexity for this, I should be happy anyway).
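To make the "track by call site" idea concrete, here is a rough sketch
assuming C11 atomics; mem_tag, mem_track_alloc and friends are made-up
names for illustration, not the existing gsh_alloc API:

#include <stdatomic.h>
#include <stdlib.h>

/* One tag per allocation site; its counter says how many bytes that
 * site currently holds, which is what you'd watch to pick pool
 * candidates and set watermarks. */
struct mem_tag {
        const char *site;       /* e.g. __func__ at the call site */
        atomic_size_t bytes;    /* bytes currently allocated here */
};

/* Hidden header so the free side knows the size and the owning tag. */
struct mem_hdr {
        struct mem_tag *tag;
        size_t size;
};

void *mem_track_alloc(struct mem_tag *tag, size_t size)
{
        struct mem_hdr *hdr = malloc(sizeof(*hdr) + size);

        if (hdr == NULL)
                return NULL;
        hdr->tag = tag;
        hdr->size = size;
        atomic_fetch_add(&tag->bytes, size);
        return hdr + 1;
}

void mem_track_free(void *p)
{
        if (p != NULL) {
                struct mem_hdr *hdr = (struct mem_hdr *)p - 1;

                atomic_fetch_sub(&hdr->tag->bytes, hdr->size);
                free(hdr);
        }
}

/* A call site just keeps a static tag: */
void *example_consumer(size_t len)
{
        static struct mem_tag tag = { .site = "example_consumer" };

        return mem_track_alloc(&tag, len);
}

Of course this only sees our own allocations, which is exactly the
reliability problem above; it's for the "purge what we can" side, not a
substitute for ulimit/cgroups.

--
Dominique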