Looking at the code in SolrDispatchFilter, is this intentional or not?
I think I remember Mark Miller mentioning that in an OOM case, the best
course of action is basically to kill the process, since there is very little
Solr can do once it has run out of memory.  Yet it seems that Solr catches
the OOM itself and just logs it as an error, rather than letting it propagate
back up to the JVM.

We have also seen OOMs in IndexWriter, which has specific code to handle
OOM cases and seems to fall back to the transaction log (but fails to
commit anything).  I understand the logic of that, but in practice I've
seen the tlog get corrupted in this case, so we still need to
monitor the system and forcibly kill the process.
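
To make the distinction concrete, here is a minimal sketch of the two behaviours, not Solr's actual code. The class OomGuard and the methods swallow and failFast are hypothetical names made up for illustration; the onFatal hook stands in for whatever fail-fast action is chosen (for example the System.exit(1) Tim mentions below).

```java
import java.util.concurrent.atomic.AtomicBoolean;

class OomGuard {
    /** Catch-all style: the Error is wrapped and only reported, never rethrown. */
    static String swallow(Runnable task) {
        try {
            task.run();
            return "ok";
        } catch (Throwable t) {
            // The OutOfMemoryError stops here; callers only ever see a message.
            return "logged: " + new RuntimeException(t).getMessage();
        }
    }

    /** Fail-fast style: run a hook (hypothetical), then let the Error propagate. */
    static void failFast(Runnable task, Runnable onFatal) {
        try {
            task.run();
        } catch (OutOfMemoryError e) {
            onFatal.run(); // in a real server this might be System.exit(1)
            throw e;       // do not hide the Error from the caller
        }
    }

    public static void main(String[] args) {
        // Simulated OOM; no memory is actually exhausted here.
        Runnable oom = () -> { throw new OutOfMemoryError("Java heap space"); };

        System.out.println(swallow(oom));

        AtomicBoolean hookCalled = new AtomicBoolean(false);
        try {
            failFast(oom, () -> hookCalled.set(true));
        } catch (OutOfMemoryError e) {
            System.out.println("propagated, hook called: " + hookCalled.get());
        }
    }
}
```

In the first form the process keeps running after the error, which matches the "logged as an error" behaviour described above; in the second, something outside the request handler gets a chance to react.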



On 27 June 2013 00:03, Timothy Potter <thelabd...@gmail.com> wrote:

> Thanks for the feedback Daniel ... For now, I've opted to just kill
> the JVM with System.exit(1) in the SolrDispatchFilter code and will
> restart it with a Linux supervisor. Not elegant but the alternative of
> having a zombie Solr instance walking around my cluster is much worse
> ;-) Will try to dig into the code that is trapping this error but for
> now I've lost too many hours on this problem.
>
> Cheers,
> Tim
>
> On Wed, Jun 26, 2013 at 2:43 PM, Daniel Collins <danwcoll...@gmail.com>
> wrote:
> > Ooh, I guess Jetty is trapping that java.lang.OutOfMemoryError, and
> > throwing it/packaging it as a java.lang.RuntimeException.  The -XX option
> > assumes that the application doesn't handle the Errors and so they would
> > reach the JVM and thus invoke the handler.
> > Since Jetty has an exception handler that is dealing with anything
> > (including Errors), they never reach the JVM, hence no handler.
> >
> > Not much we can do short of not using Jetty?
> >
> > That's a pain, I'd just written a nice OOM handler too!
> >
> >
> > On 26 June 2013 20:37, Timothy Potter <thelabd...@gmail.com> wrote:
> >
> >> A little more to this ...
> >>
> >> On the off chance this was a weird Jetty issue or something, I tried with
> >> the latest 9.... and the problem still occurs :-(
> >>
> >> This is on Java 7 on debian:
> >>
> >> java version "1.7.0_21"
> >> Java(TM) SE Runtime Environment (build 1.7.0_21-b11)
> >> Java HotSpot(TM) 64-Bit Server VM (build 23.21-b01, mixed mode)
> >>
> >> Here is an example stack trace from the log
> >>
> >> 2013-06-26 19:31:33,801 [qtp632640515-62] ERROR
> >> solr.servlet.SolrDispatchFilter Q:22 -
> >> null:java.lang.RuntimeException: java.lang.OutOfMemoryError: Java heap
> >> space
> >> at org.apache.solr.servlet.SolrDispatchFilter.sendError(SolrDispatchFilter.java:670)
> >> at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:380)
> >> at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:155)
> >> at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1423)
> >> at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:450)
> >> at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:138)
> >> at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:564)
> >> at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:213)
> >> at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1083)
> >> at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:379)
> >> at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:175)
> >> at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1017)
> >> at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:136)
> >> at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:258)
> >> at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:109)
> >> at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:97)
> >> at org.eclipse.jetty.server.Server.handle(Server.java:445)
> >> at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:260)
> >> at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:225)
> >> at org.eclipse.jetty.io.AbstractConnection$ReadCallback.run(AbstractConnection.java:358)
> >> at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:596)
> >> at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:527)
> >> at java.lang.Thread.run(Thread.java:722)
> >> Caused by: java.lang.OutOfMemoryError: Java heap space
> >>
> >> On Wed, Jun 26, 2013 at 12:27 PM, Timothy Potter <thelabd...@gmail.com>
> >> wrote:
> >> > Recently upgraded to 4.3.1 but this problem has persisted for a while
> >> > now ...
> >> >
> >> > I'm using the following configuration when starting Jetty:
> >> >
> >> > -XX:OnOutOfMemoryError="/home/solr/oom_killer.sh 83 %p"
> >> >
> >> > If an OOM is triggered during Solr web app initialization (such as by
> >> > me lowering -Xmx to a value that is too low to initialize Solr with),
> >> > then the script gets called and does what I expect!
> >> >
> >> > However, once the Solr webapp has initialized and Solr is happily
> >> > responding to updates and queries, an OOM in this situation does not
> >> > result in the script actually being invoked! All I see is
> >> > the following in the stdout/stderr log of my process:
> >> >
> >> > #
> >> > # java.lang.OutOfMemoryError: Java heap space
> >> > # -XX:OnOutOfMemoryError="/home/solr/oom_killer.sh 83 %p"
> >> > #   Executing /bin/sh -c "/home/solr/oom_killer.sh 83 21358"...
> >> >
> >> > The oom_killer.sh script doesn't actually get called!
> >> >
> >> > So to recap, it works if an OOM occurs during initialization but once
> >> > Solr is running, the OOM killer doesn't fire correctly. This leads me
> >> > to believe my script is fine and there's something else going wrong.
> >> > Here's the oom_killer.sh script (pretty basic):
> >> >
> >> > #!/bin/bash
> >> > SOLR_PORT=$1
> >> > SOLR_PID=$2
> >> > NOW=$(date +"%Y%m%d_%H%M")
> >> > (
> >> > echo "Running OOM killer script for process $SOLR_PID for Solr on port 89$SOLR_PORT"
> >> > kill -9 $SOLR_PID
> >> > echo "Killed process $SOLR_PID"
> >> > exec /home/solr/solr-dg/dg-solr.sh recover $SOLR_PORT &
> >> > echo "Restarted Solr on 89$SOLR_PORT after OOM"
> >> > ) | tee oom_killer-89$SOLR_PORT-$NOW.log
> >> >
> >> > Anyone see anything like this before? Suggestions on where to begin
> >> > tracking down this issue?
> >> >
> >> > Cheers,
> >> > Tim
> >>
>
