Hi everyone,

My Solr JVM runs out of heap space quite frequently. I'm trying to understand Solr/Lucene's memory usage so I can address the problem correctly. Otherwise, I feel I'm taking random shots in the dark.

I've tried previous troubleshooting suggestions. Here's what I've done:

1) Increased Tomcat's JVM heap space, e.g.:

   JAVA_OPTS='-Xmx1244m -Xms1244m -server'; # frequent heap space problems
   JAVA_OPTS='-XX:+AggressiveHeap -server'; # runs out of heap space at 2.0g
   JAVA_OPTS='-Xmx3072m -Xms3072m -server'; # jvm quickly hits 2.9g on 'top'

   Solr is the only webapp deployed on this Tomcat instance.

2) I use Solr collection/distribution to separate indexing and searching. The indexer is stable now and memory problems only occur when searching on the Solr slave.

3) In solrconfig.xml, I reduced mergeFactor and maxBufferedDocs by 50%:

   <mergeFactor>5</mergeFactor>
   <maxBufferedDocs>500</maxBufferedDocs>

   This helped the indexing server but not the Solr slave.

4) In solrconfig.xml, I set filterCache, queryResultCache, and documentCache to 0.

Now for my index details:

- To facilitate highlighting, I currently store doc contents in the index, so the index consumes 24GB on disk.
- numDocs : 4,953,736
  maxDoc : 4,953,736 (just optimized)
- Term files:

  logs # du -ksh ../solr/data/index/*.t??
  5.9M ../solr/data/index/_1kjb.tii
  429M ../solr/data/index/_1kjb.tis

- I have 22 fields and yes, they currently have norms.
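For reference, here is a back-of-the-envelope sketch of what the norms might cost in heap. It assumes Lucene keeps one byte per document for every field that has norms; the function name norms_bytes and the choice to compare 21 and 22 fields are only for illustration, not anything taken from Solr itself.

```python
# Rough estimate of heap used by norms.
# Assumption: one byte per document for each field that has norms.

MAX_DOC = 4_953_736  # maxDoc reported for the optimized index


def norms_bytes(max_doc, fields_with_norms, bytes_per_norm=1):
    """Return the estimated number of bytes held by norms."""
    return max_doc * fields_with_norms * bytes_per_norm


if __name__ == "__main__":
    # 22 fields currently have norms; 21 is the number that could drop them.
    for n_fields in (21, 22):
        size = norms_bytes(MAX_DOC, n_fields)
        print(f"{n_fields} fields: {size:,} bytes = {size / 2**20:.1f} MiB")
```

Under that assumption, norms on 21 fields come to roughly 99 MiB, which is small next to a heap of about 3 GB.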
Other info that may be helpful:

- My Solr is from 2006-11-15. We have a few mods, including one extra fieldCache that stores ~40 bytes/doc.
- Thread counts from solr/admin/threaddump.jsp:

  Java HotSpot(TM) 64-Bit Server VM 1.5.0_08-b03
  Thread Count: current=37 deamon=34 peak=37

My machine runs Gentoo Linux and has 4GB of RAM. 'top' indicates the JVM reaches 2.9g RAM (3472m virtual memory) after 10-20 searches and ~20 mins of use. It seems just a matter of time before more searches or a snapinstaller 'commit' will make it run out of heap space again.

I have flexibility in the changes we can make. For example, I can omit norms for most fields, or I can stop storing the doc contents in the index. But before embarking on a new strategy, I need some assurance that the strategy will work (crazy, I know). For instance, it doesn't seem that removing norms would save a great deal: I calculate that saving 1 byte per doc per field on 21 fields comes to ~99MB.

So... how do I deduce what's taking up so much memory? Any suggestions would be very helpful to me (and hopefully to others, too).

many thanks,
-Graham