On 6/3/2015 12:20 AM, Clemens Wyss DEV wrote:
> Context: Lucene 5.1, Java 8 on debian. 24G of RAM whereof 16G available for 
> Solr.
> 
> I am seeing the following OOMs:
> ERROR - 2015-06-03 05:17:13.317; [   customer-1-de_CH_1] 
> org.apache.solr.common.SolrException; null:java.lang.RuntimeException: 
> java.lang.OutOfMemoryError: Java heap space

<snip>

> Caused by: java.lang.OutOfMemoryError: Java heap space
> WARN  - 2015-06-03 05:17:13.319; [   customer-1-de_CH_1] 
> org.eclipse.jetty.servlet.ServletHandler; Error for 
> /solr/customer-1-de_CH_1/suggest_phrase
> java.lang.OutOfMemoryError: Java heap space
> 
> The full commandline is
> /usr/local/java/bin/java -server -Xss256k -Xms16G
> -Xmx16G -XX:NewRatio=3 -XX:SurvivorRatio=4 -XX:TargetSurvivorRatio=90 
> -XX:MaxTenuringThreshold=8 -XX:+UseConcMarkSweepGC -XX:+UseParNewGC 
> -XX:ConcGCThreads=4 -XX:ParallelGCThreads=4 -XX:+CMSScavengeBeforeRemark 
> -XX:PretenureSizeThreshold=64m -XX:+UseCMSInitiatingOccupancyOnly 
> -XX:CMSInitiatingOccupancyFraction=50 -XX:CMSMaxAbortablePrecleanTime=6000 
> -XX:+CMSParallelRemarkEnabled -XX:+ParallelRefProcEnabled -verbose:gc 
> -XX:+PrintHeapAtGC -XX:+PrintGCDetails -XX:+PrintGCDateStamps 
> -XX:+PrintGCTimeStamps -XX:+PrintTenuringDistribution 
> -XX:+PrintGCApplicationStoppedTime -Xloggc:/opt/solr/logs/solr_gc.log 
> -Djetty.port=8983 -DSTOP.PORT=7983 -DSTOP.KEY=solrrocks -Duser.timezone=UTC 
> -Dsolr.solr.home=/opt/solr/data -Dsolr.install.dir=/usr/local/solr 
> -Dlog4j.configuration=file:/opt/solr/log4j.properties
> -jar start.jar -XX:OnOutOfMemoryError=/usr/local/solr/bin/oom_solr.sh 8983 
> /opt/solr/logs OPTIONS=default,rewrite
> 
> So I'd expect /usr/local/solr/bin/oom_solr.sh to be triggered. But this does 
> not seem to happen. What am I missing? Is it ok to pull a heapdump from Solr 
> before killing/rebooting in oom_solr.sh?
> 
> Also I would like to know what query parameters were sent to 
> /solr/customer-1-de_CH_1/suggest_phrase (which may be the reason for the OOM 
> ...

The oom script just kills Solr with the KILL signal (-9) and logs the
kill.  That's it.  It does not attempt to make a heap dump.  If you
*want* to dump the heap on OOM, you can, with some additional options:

http://stackoverflow.com/questions/542979/using-heapdumponoutofmemoryerror-parameter-for-heap-dump-for-jboss/20496376#20496376

I don't know if a heap dump on OOM is compatible with the OOM script.
If Java chooses to run the OOM script before the heap dump is done, the
process will be killed before the heap finishes dumping.
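The extra options from that stackoverflow answer boil down to two standard
HotSpot flags.  A minimal sketch of adding them to a launch command (the
dump path here is illustrative, pick a partition with enough free space to
hold a file roughly the size of your heap):

```shell
# Write a heap dump (.hprof) when the JVM throws OutOfMemoryError.
# With a 16GB heap, the dump file can approach 16GB, and writing it
# can take minutes -- the JVM is unresponsive while it dumps.
/usr/local/java/bin/java -server \
  -XX:+HeapDumpOnOutOfMemoryError \
  -XX:HeapDumpPath=/opt/solr/logs/oom.hprof \
  ...rest of the normal Solr options...
```

If both -XX:+HeapDumpOnOutOfMemoryError and -XX:OnOutOfMemoryError are
set, you are relying on the JVM finishing the dump before the kill script
runs, which as noted above is not something I'd count on.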

FYI, the stacktrace on the OOM error, especially in a multi-threaded app
like Solr, will frequently be completely useless in tracking down the
problem.  The thread that makes the triggering memory allocation may be
completely unrelated.  This error happened on a suggest handler ... but
the large memory allocations may be happening in a completely different
part of the code.

We have not had any recent indications of a memory leak in Solr.  Memory
leaks in Solr *do* happen, but they are usually caught by the tests,
which run in a minimal memory space.  The project has continuous
integration servers set up that run all the tests many times per day.

If you are running out of heap with 16GB allocated, then either your
Solr installation is enormous or you've got a configuration that's not
tuned properly.  With a very large Solr installation, you may need to
simply allocate more memory to the heap ... which may mean that you'll
need to install more memory in the server.  The alternative would be
figuring out where you can change your configuration to reduce memory
requirements.
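If more heap turns out to be the answer, the change is just the -Xms/-Xmx
pair in the launch command you quoted.  Purely as an illustration (20G is
not a recommendation, and leave room for the OS disk cache):

```shell
# Raise the heap from 16G to 20G; keep -Xms equal to -Xmx so the
# heap is allocated up front and never resized.
/usr/local/java/bin/java -server -Xss256k -Xms20G -Xmx20G \
  ...rest of the normal Solr options...
```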

Here's some incomplete info on settings and situations that can require
a very large heap:

https://wiki.apache.org/solr/SolrPerformanceProblems#Java_Heap
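Before changing anything, it helps to watch what the heap is actually
doing.  A rough sketch using the standard JDK tools (assumes a single
Solr process on the host, found via pgrep):

```shell
# Find the Solr JVM pid (assumes only one process launched via start.jar)
SOLR_PID=$(pgrep -f start.jar)

# Sample GC and heap occupancy every 5 seconds; watch the O (old gen)
# column -- if it stays near 100% between full GCs, the heap really is full
jstat -gcutil "$SOLR_PID" 5000

# Class histogram of live objects, largest consumers first
# (note: -histo:live forces a full GC)
jmap -histo:live "$SOLR_PID" | head -n 25
```

The jmap histogram is often enough to tell whether the memory is going to
caches, field data, or something unexpected, without a full heap dump.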

To provide much help, we'll need lots of details about your system ...
number of documents in all cores, total index size on disk, your config,
possibly your schema, and maybe a few other things I haven't thought of yet.

Thanks,
Shawn
