Hi Uwe,

If you can attach gdb to it, then jstack -m and jstack -F should also work; that'll get you the Java stack trace. (But it probably doesn't matter in this case, because the hang is probably a bug in the VM.)
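A minimal sketch of those commands, assuming the stalled JVM's PID is known. The helper name `dump_hung_jvm` and the PID 12345 are placeholders, not anything from the report. The helper only prints the command lines so they can be checked before use.

```shell
#!/bin/sh
# Hypothetical helper (not from the original mail): print the diagnostic
# commands for a hung JVM so they can be reviewed before running them.
dump_hung_jvm() {
    pid="$1"   # PID of the stalled JVM (placeholder value used below)
    # Force a stack dump when a plain "jstack <pid>" gets no response.
    echo "jstack -F $pid"
    # Mixed mode: print Java frames and native frames together.
    echo "jstack -m $pid"
    # Native backtrace of every thread, then detach and exit.
    echo "gdb -p $pid -batch -ex 'thread apply all bt'"
}

dump_hung_jvm 12345
```

To actually run them, pipe the output to sh (`dump_hung_jvm <pid> | sh`). I am assuming here that the jstack binary comes from the same JDK build as the hung process.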
- Kris

On Wed, Mar 6, 2013 at 5:48 AM, Uwe Schindler <uschind...@apache.org> wrote:
> Hi,
>
> For a few months we have been extensively testing various preview builds of
> JDK 8 for compatibility with Apache Lucene and Solr, so we can find any bugs
> early and prevent the problems we had with the release of Java 7 two years
> ago. Currently we have a Linux (Ubuntu 64 bit) Jenkins machine that has
> various JDKs (JDK 6, JDK 7, JDK 8 snapshot, IBM J9, older JRockit) installed,
> choosing a different one with different HotSpot and garbage collector
> settings on every run of the test suite (which takes approx. 30-45 minutes).
>
> JDK 8 b79 has worked very well on Linux so far. We found some strange
> behavior in early versions (maybe compiler errors), but no longer at the
> moment. There is one configuration that constantly and reproducibly hangs in
> one module that is tested: the configuration uses JDK 8 b79 (same for b78),
> 32 bit, and G1GC (server or client does not matter). The JVM running the
> tests hangs and becomes unresponsive (jstack and kill -3 have no effect or
> cannot connect, standard kill does not stop it, only kill -9 actually kills
> it). It can be reproduced in this Lucene module 100% of the time (it always
> hangs).
>
> I was able to connect with GDB to the JVM and get a stack trace of all
> threads (see attachment, dump.txt). As you can see, all threads of G1GC seem
> to hang in a syscall (os:park(), a condition wait in the pthread library).
> Unfortunately that’s all I can give you. A Java stack trace is not possible
> because the JVM responds to neither kill -3 nor jstack. With all other
> garbage collectors it passes the test without hangs in a few seconds; with
> 32 bit G1GC it can stand still for hours. The 64 bit JVM passes with G1GC, so
> only the 32 bit variant is affected. Client or Server VM makes no difference.
>
> To reproduce:
> - Use a 32 bit JDK 8 b78 or b79 (tested on Linux 64 bit, but this should not
>   matter)
> - Download the Lucene source code (e.g. the snapshot version we were testing
>   with:
>   https://builds.apache.org/job/Lucene-Artifacts-trunk/2212/artifact/lucene/dist/)
> - Change to the directory lucene/analysis/uima and run:
>   ant -Dargs="-server -XX:+UseG1GC" -Dtests.multiplier=3 -Dtests.jvms=1 test
>
> After a while the test framework prints "stalled" messages (because the child
> VM actually running the test no longer responds). The PID is also printed.
> Try to get a stack trace or kill it: no response. Only kill -9 helps.
> Choosing another garbage collector in the above command line makes the test
> finish after a few seconds, e.g. -Dargs="-server -XX:+UseConcMarkSweepGC"
>
> I posted this bug report directly to the mailing list because with earlier
> bug reports there seems to be a problem with bugs.sun.com - there is no
> response from any reviewer after several weeks, and we were able to help
> find and fix javadoc and javac compiler bugs early. So I hope you can help
> with this bug, too.
>
> Uwe
>
> -----
> Uwe Schindler
> uschind...@apache.org
> Apache Lucene PMC Member / Committer
> Bremen, Germany
> http://lucene.apache.org/