Just like I thought. This isn't currently fixable without external termination of the forked process. The forked JVM ran out of memory and deadlocked.
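
For illustration only, "external termination" from the parent side could look roughly like the sketch below. This is not what junit4 does: the class ForkedJvmWatchdog, the method awaitOrKill, and the Unix `sleep` command standing in for a hung forked JVM are all hypothetical names and assumptions. Only Process.waitFor(long, TimeUnit) and Process.destroyForcibly() are standard JDK calls.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not part of junit4 or the Lucene build.
class ForkedJvmWatchdog {

    /**
     * Waits for the child process up to the given timeout. If it is still
     * running after that, kills it forcibly and waits for it to be reaped.
     *
     * @return true if the child exited on its own, false if it was killed
     */
    static boolean awaitOrKill(Process child, long timeout, TimeUnit unit)
            throws InterruptedException {
        if (child.waitFor(timeout, unit)) {
            return true;              // child finished within the timeout
        }
        child.destroyForcibly();      // the child cannot be trusted to halt itself
        child.waitFor();              // block until the OS reports it dead
        return false;
    }

    public static void main(String[] args) throws Exception {
        // Assumption: a Unix-like system where "sleep" exists; it stands in
        // for a forked JVM that never terminates.
        Process child = new ProcessBuilder("sleep", "60").start();
        boolean exited = awaitOrKill(child, 1, TimeUnit.SECONDS);
        System.out.println("exited on its own: " + exited);
    }
}
```

The point of doing this in the parent is that it does not depend on the child having any usable heap left.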

[junit4] WARN: Unhandled exception in event serialization. -> java.lang.OutOfMemoryError: Java heap space
[junit4] WARN: Unreachable code. Complete panic. -> java.lang.OutOfMemoryError: Java heap space

What's interesting about it, sadly, is that the JVM should have quit, but didn't. How the OOM is re-thrown from the finally block in this code is a mystery to me, but like I said, low-memory conditions are uncharted territory and there's very little one can "assume" about them from the Java level.

[...]
      if (reason != null) {
        try {
          SlaveMain.warn("Unhandled exception in event serialization.", reason);
        } finally {
          Runtime.getRuntime().halt(0);
        }
      }
    }
  } catch (Throwable t) {
    SlaveMain.warn("Unreachable code. Complete panic.", t);
  }

Dawid

On Wed, Oct 26, 2016 at 7:45 PM, Dawid Weiss <dawid.we...@gmail.com> wrote:
> Uwe has collected stack traces, I will analyze later. Thanks for the
> ping, Kevin.
>
> Dawid
>
> On Wed, Oct 26, 2016 at 7:32 PM, Dawid Weiss <dawid.we...@gmail.com> wrote:
>> Uwe, can you try to take a jps (or send a signal) to the forked JVM
>> (and the master)? Thanks!
>>
>> Dawid
>>
>> On Wed, Oct 26, 2016 at 5:06 PM, Kevin Risden <compuwizard...@gmail.com>
>> wrote:
>>> Looks like nightly 6.x is stalled in the same way.
>>> https://builds.apache.org/job/Lucene-Solr-NightlyTests-6.x/185
>>>
>>> Typically this takes ~4-6 hours and it is on 23+ and counting.
>>>
>>> Kevin Risden
>>>
>>> On Mon, Oct 10, 2016 at 12:08 PM, Dawid Weiss <dawid.we...@gmail.com> wrote:
>>>>
>>>> Thanks Uwe, this helps a lot!
>>>>
>>>> There is a resource deadlock here (an interplay of loggers, sysouts
>>>> and junit4 stream redirectors and uncaught exception handlers...).
>>>> It's really complex, but I'll try to get to the bottom of it.
>>>>
>>>> This completely aside, over 40 THOUSAND threads are hanging inside
>>>> jetty's http handlers... there should be a more reasonable limit to
>>>> this I guess?!
>>>>
>>>> "qtp1445698227-45502" #45502 prio=5 os_prio=0 tid=0x00007f5f5447c000
>>>> nid=0x4ec1 waiting for monitor entry [0x00007f5f26327000]
>>>>    java.lang.Thread.State: BLOCKED (on object monitor)
>>>>   at org.apache.log4j.Category.callAppenders(Category.java:204)
>>>>   - waiting to lock <0x00000000e00a8348> (a org.apache.log4j.spi.RootLogger)
>>>>   at org.apache.log4j.Category.forcedLog(Category.java:391)
>>>>   at org.apache.log4j.Category.log(Category.java:856)
>>>>   at org.slf4j.impl.Log4jLoggerAdapter.error(Log4jLoggerAdapter.java:497)
>>>>   at org.apache.solr.common.SolrException.log(SolrException.java:159)
>>>>   at org.apache.solr.servlet.ResponseUtils.getErrorInfo(ResponseUtils.java:65)
>>>>
>>>> Dawid
>>>>
>>>> ---------------------------------------------------------------------
>>>> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
>>>> For additional commands, e-mail: dev-h...@lucene.apache.org
>>>>
>>>