From the lucene side, it only uses file mappings for reads and doesn't allocate any anonymous memory. The way lucene uses the OS cache for reads won't cause your OOM (http://www.linuxatemyram.com/play.html).
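
To make that distinction concrete, here is a minimal sketch of the two kinds of memory in question. This is not Lucene's actual code, just an illustration of a file-backed mapping (what MMapDirectory does for reads) vs. an anonymous direct buffer:

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

public class MappingKinds {
    public static void main(String[] args) throws IOException {
        // File-backed mapping, like Lucene's MMapDirectory uses for reads.
        // These pages live in the OS page cache, count as RssFile/shared,
        // and the kernel can simply drop them under memory pressure.
        try (FileChannel ch = FileChannel.open(Paths.get(args[0]), StandardOpenOption.READ)) {
            MappedByteBuffer mapped = ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size());
            System.out.println("file-backed mapping: " + mapped.capacity() + " bytes");
        }

        // Anonymous direct memory: counted against -XX:MaxDirectMemorySize,
        // reported as RssAnon, and not reclaimable by the kernel. This is
        // the kind of allocation that actually feeds OOM pressure.
        ByteBuffer direct = ByteBuffer.allocateDirect(1 << 20);
        System.out.println("anonymous direct buffer: " + direct.capacity() + " bytes");
    }
}
```

Roughly speaking, only the second kind shows up as unreclaimable resident memory; the first inflates RSS and top's SHR column but is really just page cache wearing your process's name.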
At the end of the day you are running out of memory on the system either way, and your process might just look like a large target to the oom-killer, but that doesn't mean it's necessarily your problem at all. I advise sticking with basic operating system tools like /proc and free -m: reproduce the OOM-kill situation, just like in the example link above, and try to track down the real problem.

On Wed, Aug 30, 2017 at 11:43 PM, Erik Stephens <mreriksteph...@gmail.com> wrote:
> Yeah, apologies for that long issue - the netty comments aren't related. My
> two comments near the end might be more interesting here:
>
> https://github.com/elastic/elasticsearch/issues/26269#issuecomment-326060213
>
> To try to summarize, I looked at `grep indices /proc/$pid/smaps` to
> quantify what I think is mostly lucene usage. Is that an accurate way to
> quantify it? It shows 51G with `-XX:MaxDirectMemorySize=15G`. The heap is
> 30G and resident memory is reported as 82.5G. That makes a bit of sense:
> 30G + 51G + miscellaneous.
>
> `top` reports roughly 51G as shared, which is suspiciously close to what
> I'm seeing in /proc/$pid/smaps. Is it correct to think that if a process
> requests memory and there is not enough "free", the kernel will purge from
> its cache in order to satisfy the request? I'm struggling to see how the
> kernel thinks there isn't enough free memory when so much is in its cache,
> but that concern is secondary at this point. My primary concern is trying
> to regulate the overall footprint (shared with the file system cache or
> not) so that the OOM killer isn't even part of the conversation in the
> first place.
>
> # grep Vm /proc/$pid/status
> VmPeak:  982739416 kB
> VmSize:  975784980 kB
> VmLck:           0 kB
> VmPin:           0 kB
> VmHWM:    86555044 kB
> VmRSS:    86526616 kB
> VmData:   42644832 kB
> VmStk:         136 kB
> VmExe:           4 kB
> VmLib:       18028 kB
> VmPTE:      275292 kB
> VmPMD:        3720 kB
> VmSwap:          0 kB
>
> # free -g
>               total        used        free      shared  buff/cache   available
> Mem:            125          54           1           1          69          69
> Swap:             0           0           0
>
> Thanks for the reply! Apologies if not apropos to this forum - just working
> my way down the rabbit hole :)
>
> --
> Erik
>
>> On Aug 30, 2017, at 8:04 PM, Robert Muir <rcm...@gmail.com> wrote:
>>
>> Hello,
>>
>> From the thread linked there, it's not clear to me that the problem
>> relates to lucene (vs. being e.g. a bug in netty, or too many threads,
>> or potentially many other problems).
>>
>> Can you first try to break down your problematic "RSS" as reported by
>> the operating system? Maybe this helps determine whether your issue is
>> with an anonymous mapping (ByteBuffer.allocateDirect) or a file mapping
>> (FileChannel.map).
>>
>> With recent kernels you can break down RSS with /proc/$pid/status
>> (RssAnon vs RssFile vs RssShmem):
>>
>> http://man7.org/linux/man-pages/man5/proc.5.html
>>
>> If your kernel is old you may have to go through more trouble (summing
>> up stuff from smaps or whatever).
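
As a starting point for the "summing up stuff from smaps" route on older kernels, here is a rough sketch that totals resident memory per mapping type. It assumes the usual /proc/$pid/smaps layout; the file-backed vs. anonymous split is approximate, and the class name is just for illustration:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;

// Sums resident (Rss) kilobytes from /proc/<pid>/smaps, split into
// file-backed mappings (reclaimable page cache) and anonymous mappings
// (the kind that actually puts pressure on the OOM killer).
public class SmapsRss {
    public static void main(String[] args) throws IOException {
        List<String> lines = Files.readAllLines(Paths.get("/proc/" + args[0] + "/smaps"));
        long fileKb = 0, anonKb = 0;
        boolean fileBacked = false;
        for (String line : lines) {
            // Mapping headers look like "7f3a...-7f3b... r--p ... /path/to/file";
            // a trailing path marks the mapping as file-backed.
            if (line.matches("^[0-9a-f]+-[0-9a-f]+ .*")) {
                String[] parts = line.split("\\s+", 6);
                fileBacked = parts.length == 6 && parts[5].startsWith("/");
            } else if (line.startsWith("Rss:")) {
                long kb = Long.parseLong(line.replaceAll("\\D+", ""));
                if (fileBacked) fileKb += kb; else anonKb += kb;
            }
        }
        System.out.println("file-backed Rss: " + fileKb + " kB");
        System.out.println("anonymous Rss:   " + anonKb + " kB");
    }
}
```

Run as `java SmapsRss $pid`. If the anonymous total stays near heap plus MaxDirectMemorySize while the file-backed total accounts for the ~51G of `indices` mappings, those pages are reclaimable cache, which supports the reading that the process is just a large target rather than the real problem.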