On 12/15/2017 10:53 AM, Bill Oconnor wrote:
> The recovering server has a much larger swap usage than the other servers in
> the cluster. We think this is related to the mmap files used for indexes.
> The server eventually recovers, but it triggers alerts for devops which are
> annoying.
>
> I found a previous mailing list question (which Shawn responded to)
> describing an almost identical problem from 2014, but there is no
> suggested remedy. (
> http://lucene.472066.n3.nabble.com/Solr-4-3-1-memory-swapping-td4126641.html)

Solr itself cannot influence swap usage.  That is handled by the
operating system.  I doubt that Java can influence swap usage either,
but only a developer on the JDK team could say for sure -- and even if
it can, that is still outside our control.

Assuming we're dealing with a Linux machine, my recommendation would be
to set vm.swappiness to 0 or 1, so that the OS does not aggressively
decide to swap data out.  The default for vm.swappiness on Linux is 60,
which is quite aggressive.
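As a sketch of how that change is usually made on Linux (the exact
config file location can vary by distribution):

```shell
# Check the current swappiness value (60 is the usual default)
cat /proc/sys/vm/swappiness

# Lower it on the running system (requires root):
#   sysctl -w vm.swappiness=1

# To make it persistent across reboots, add this line to
# /etc/sysctl.conf (or a file under /etc/sysctl.d/):
#   vm.swappiness = 1
```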

> Questions :
>
> Is there progress regarding this?

As mentioned previously, there's nothing Solr or Lucene can do to help,
because swapping decisions are made by software completely outside our
ability to influence.  The operating system makes those decisions.

If your software tries to use more memory than the machine has, then
swap is going to get used no matter how the OS is configured, and when
that happens, performance will suffer greatly.  In the case of
SolrCloud, it would make basic operation go so slowly that timeouts
would get exceeded, and Solr would initiate recovery.
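To see whether the machine is actually overcommitted, something like
the following can show memory and swap usage, and whether swapping is
happening right now (assuming a Linux host with the usual procps
tools installed):

```shell
# Show total, used, and available memory, plus current swap usage
free -h

# Report memory/swap statistics once per second, three times.
# Nonzero numbers in the "si" (swap in) and "so" (swap out)
# columns mean the system is actively swapping.
vmstat 1 3
```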

If the OS is Linux, I would like to see a screenshot from the "top"
program (not htop or anything else -- it needs to be top).  Run the
program, press shift-M to sort the list by memory usage, and grab a
screenshot.
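If a screenshot is awkward to capture, a plain-text capture of the same
information can work; this is a sketch using top's batch mode (the
-o %MEM sort option assumes a reasonably recent procps-ng top):

```shell
# Take one non-interactive snapshot of top, sorted by memory usage,
# and save the first 30 lines to a file you can attach to a reply
top -b -n 1 -o %MEM | head -n 30 > top-by-memory.txt
```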

Thanks,
Shawn
