Hi

Thanks for this. I will investigate further after reading a number of your 
points in more detail. I do have a feeling they've set up far too many entries 
in the filter cache (thousands), so I will revisit that.

Just a note on the numbers: those were valid when I made the post, but they 
obviously change as the week progresses before a regular clean-up of content. 
For information (if it's at all relevant), the current figures from the index 
admin view on one of the 2 nodes are:

Last Modified:  18 minutes ago
Num Docs:       24590368
Max Doc:        29139255
Deleted Docs:   4548887
Version:        1297982
Segment Count:  28

                Version         Gen      Size
Master:         1412798583558   402364   52.98 GB

Top:
 PID  USER      PR  NI  VIRT  RES   SHR  S  %CPU  %MEM      TIME+  COMMAND
2996  tomcat6   20   0  189g  73g  1.5g  S    15  58.7   58034:04  java

And the only GC option I can see that is currently set is "-XX:+UseConcMarkSweepGC".

Regarding the XY problem, you are very likely correct. Unfortunately I wasn't 
involved in the config, and I very much suspect that when it was done many of 
the defaults were used, and then if something didn't work, or there was say an 
out-of-memory error, they just upped the heap to treat the symptom without 
investigating the cause. The luxury of having more than enough RAM, I guess!

I'm going to get some late-night downtime soon, at which point I'm hoping to 
change the heap size and GC settings and add the JMX options. It's not exposed 
to the internet, so running JMX without authentication is fine.
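
For reference, this is roughly what I'm planning to add to the startup options 
(just a sketch for now; I'll take the exact GC flags from your wiki page, and 
the heap size and JMX port below are placeholders I've picked rather than 
confirmed values):

    # Rough sketch of planned JVM options, e.g. in Tomcat's setenv.sh
    JAVA_OPTS="$JAVA_OPTS -Xms8g -Xmx8g"            # fixed 8GB heap instead of the huge one
    JAVA_OPTS="$JAVA_OPTS -XX:+UseConcMarkSweepGC"  # keep CMS, plus the tuning flags from the wiki
    JAVA_OPTS="$JAVA_OPTS -Dcom.sun.management.jmxremote"
    JAVA_OPTS="$JAVA_OPTS -Dcom.sun.management.jmxremote.port=18983"
    JAVA_OPTS="$JAVA_OPTS -Dcom.sun.management.jmxremote.authenticate=false"
    JAVA_OPTS="$JAVA_OPTS -Dcom.sun.management.jmxremote.ssl=false"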

Right, off to do some reading!

Cheers

Si

-----Original Message-----
From: Shawn Heisey [mailto:apa...@elyograg.org] 
Sent: 08 October 2014 21:09
To: solr-user@lucene.apache.org
Subject: Re: Solr configuration, memory usage and MMapDirectory

On 10/8/2014 4:02 AM, Simon Fairey wrote:
> I'm currently setting up jconsole, but as I have to monitor remotely (no GUI 
> capability on the server) I have to wait before I can restart Solr with a JMX 
> port set up. In the meantime I looked at top. Given the calculations you 
> described based on your top output, here is the top of my java process from 
> the node that handles the querying (the indexing node has a similar memory 
> profile):
> 
> https://www.dropbox.com/s/pz85dm4e7qpepco/SolrTop.png?dl=0
> 
> It would seem I need a monstrously large heap in the 60GB region?
> 
> We do use a lot of navigators/filters, so I have set the caches to be quite 
> large for these; are those what are using up the memory?

With a VIRT size of 189GB and a RES size of 73GB, I believe you probably have 
more than 45GB of index data.  This might be a combination of old indexes and 
the active index.  Only the indexes (cores) that are being actively used need 
to be considered when trying to calculate the total RAM needed.  Other indexes 
will not affect performance, even though they increase your virtual memory size.

With MMap, part of the virtual memory size is the size of the index data that 
has been opened on the disk.  This is not memory that's actually allocated.  
There's a very good reason that mmap has been the default in Lucene and Solr 
for more than two years.

http://blog.thetaphi.de/2012/07/use-lucenes-mmapdirectory-on-64bit.html

You stated originally that you have 25 million documents and 45GB of index data 
on each node.  With those numbers and a conservative configuration, I would 
expect that you need about 4GB of heap, maybe as much as 8GB.  I cannot think 
of any reason that you would NEED a heap 60GB or larger.

Each field that you sort on, each field that you facet on with the default 
facet.method of fc, and each filter that you cache will use a large block of 
memory.  The size of that block of memory is almost exclusively determined by 
the number of documents in the index.

With 25 million documents, each filterCache entry will be approximately 3MB -- 
one bit for every document.  I do not know how big each FieldCache entry is for 
a sort field and a facet field, but assume that they are probably larger than 
the 3MB entries on the filterCache.
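
As a rough worked example with the numbers above (approximate, ignoring any 
per-entry overhead):

    25,000,000 docs x 1 bit = 25,000,000 / 8 bytes  ~= 3 MB per filterCache entry
    1,000 cached filters x ~3 MB each               ~= 3 GB of heap for the filterCache alone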

I've got a filterCache sized at 64, with an autowarmCount of 4.  With larger 
autowarmCount values, I was seeing commits take 30 seconds or more, because 
each of those filters can take a few seconds to execute.
Cache sizes in the thousands are rarely necessary, and just chew up a lot of 
memory with no benefit.  Large autowarmCount values are also rarely necessary.  
Every time a new searcher is opened by a commit, add up all your autowarmCount 
values and realize that the searcher likely needs to execute that many queries 
before it is available.
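
For comparison, the relevant entry in my solrconfig.xml looks roughly like this 
(a sketch -- the class and initialSize shown are just typical values, the parts 
that matter here are size and autowarmCount):

    <filterCache class="solr.FastLRUCache"
                 size="64"
                 initialSize="64"
                 autowarmCount="4"/>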

If you need to set up remote JMX so you can remotely connect jconsole, I have 
done this in the redhat init script I've built -- see JMX_OPTS here:

http://wiki.apache.org/solr/ShawnHeisey#Init_script

It's never a good idea to expose Solr directly to the internet, but if you use 
that JMX config, *definitely* don't expose it to the Internet.
It doesn't use any authentication.

We might need to back up a little bit and start with the problem that you are 
trying to figure out, not the numbers that are being reported.

http://people.apache.org/~hossman/#xyproblem

Your original note said that you're sanity checking.  Toward that end, the only 
insane thing that jumps out at me is that your max heap is
*VERY* large, and you probably don't have the proper GC tuning.

My recommendations for initial action are to use -Xmx8g on the servlet 
container startup and include the GC settings you can find on the wiki pages 
I've given you.  It would be a very good idea to set up remote JMX so you can 
use jconsole or jvisualvm remotely.

Thanks,
Shawn

