Don't have a bunch of time to look at this, but mlockall can be dangerous and may have an impact you don't understand. Specifically, it can cause the OOM killer to jump in.
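As a rough illustration of the risk, using the numbers from the setup quoted below (4 GB RAM, ES_HEAP_SIZE=2908m): once mlockall succeeds, the heap pages are pinned and unevictable, so the kernel, page cache, and every other process have to live in the remainder — and when that runs out, the kernel can't swap the heap, so it reaches for the OOM killer instead.

```shell
# Rough arithmetic for the master nodes described below (illustrative only).
ram_mb=4096    # 4 GB RAM
heap_mb=2908   # ES_HEAP_SIZE=2908m, pinned by mlockall and unevictable
echo "locked heap: ${heap_mb}m"
echo "left for kernel, page cache, everything else: $((ram_mb - heap_mb))m"
```

That leaves under 1.2 GB for everything else on the box, which is why memory pressure shows up so readily on the masters.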
We've mitigated these by using noswap kernels, but on more modern machines we've been using numactl and binding it to a specific core... then running one daemon per core. This isn't ES specific, but look at this: http://www.percona.com/doc/percona-server/5.5/performance/innodb_numa_support.html

You may be running out of memory on one NUMA node, and then kswapd spends all its time trying to figure out a way to solve it. You can use interleave, but this means that memory is going to be used on other cores, which isn't super fast. It can be easier to set up, though.

Kevin

On Thursday, October 9, 2014 10:23:59 AM UTC-7, Michael deMan (ES) wrote:
>
> Hi All,
>
> This is a bit off topic, but we only see this on some of our elasticsearch
> hosts, and it is also the only place where we enable mlockall for java,
> which, as we understand it, is a strongly recommended best practice.
>
> Basically, we from time to time see kswapd run away at 100% on a single
> core.
>
> It seems to hit our master nodes more frequently, and they also have the
> least amount of memory.
> Masters are:
> CentOS 6.4
> 4GB RAM
> 4GB swap
> ES_HEAP_SIZE=2908m
>
> Does anybody know much about this and how to prevent it?
> We have hunted Google Groups, but have not really found the magic bullet.
>
> We have considered turning off swap and seeing what happens in the lab,
> but prefer not to do that unless it is well known as the correct solution.
>
> Thanks,
> - Mike
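A sketch of the numactl approaches mentioned above. Node numbers and the elasticsearch path are illustrative, not from the original post — check your actual topology first:

```shell
# Show the NUMA topology: node count, CPUs per node, free memory per node.
numactl --hardware

# Option 1: bind CPU and memory to one node, one daemon per node, so each
# daemon never touches remote memory and can't exhaust a node it isn't on:
numactl --cpunodebind=0 --membind=0 bin/elasticsearch
numactl --cpunodebind=1 --membind=1 bin/elasticsearch

# Option 2: interleave allocations round-robin across all nodes -- easier
# to set up, but some accesses land on remote (slower) memory:
numactl --interleave=all bin/elasticsearch
```

The trade-off is as described above: per-node binding keeps all accesses local but means running and managing multiple daemons, while interleaving is a one-line change that spreads the heap (and the slowdown) evenly.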
