Hi Uwe,

If you can attach gdb to it, then jstack -m and jstack -F should also
work; that'll get you the Java stack trace.
(But it probably doesn't matter in this case, because the hang is
probably a bug in the VM.)
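
For example, here is a minimal sketch of those two jstack invocations. The
PID 12345 is a placeholder; substitute the one the test framework prints for
the stalled child VM. The helper function is only illustrative and prints the
commands rather than running them. Detach gdb before running them, since only
one tracer can be attached to a process at a time.

```shell
# Print the jstack invocations for a given JVM process ID.
# -F forces a dump when the VM does not respond; -m prints mixed
# (Java plus native) frames.
jstack_cmds() {
  pid="$1"
  for flag in -m -F; do
    echo "jstack $flag $pid"
  done
}

jstack_cmds 12345   # 12345 is a hypothetical PID
```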

- Kris

On Wed, Mar 6, 2013 at 5:48 AM, Uwe Schindler <uschind...@apache.org> wrote:
> Hi,
>
> For a few months we have been extensively testing various preview builds of
> JDK 8 for compatibility with Apache Lucene and Solr, so we can find any bugs
> early and prevent the problems we had with the release of Java 7 two years
> ago. Currently we have a Linux (Ubuntu 64-bit) Jenkins machine that has
> various JDKs (JDK 6, JDK 7, JDK 8 snapshot, IBM J9, older JRockit) installed.
> Every run of the test suite (which takes approx. 30-45 minutes) chooses a
> different one, with different HotSpot and garbage collector settings.
>
> JDK 8 b79 has worked very well on Linux so far. We found some strange
> behavior in early versions (maybe compiler errors), but none at the moment.
> There is one configuration that constantly and reproducibly hangs in one
> module that is tested: JDK 8 b79 (same for b78), 32-bit, with G1GC (server or
> client does not matter). The JVM running the tests hangs and becomes
> unresponsive (jstack and kill -3 have no effect/cannot connect, standard kill
> does not stop it, only kill -9 actually kills it). It can be reproduced in
> this Lucene module 100% of the time (it always hangs).
>
> I was able to connect to the JVM with GDB and get a stack trace of all
> threads (see attachment, dump.txt). As you can see, all G1GC threads seem to
> hang in a syscall (os:park(), a condition wait in the pthread library).
> Unfortunately that’s all I can give you. A Java stack trace is not possible
> because the JVM responds to neither kill -3 nor jstack. With all other
> garbage collectors it passes the test without hangs in a few seconds; with
> 32-bit G1GC it can stand still for hours. The 64-bit JVM passes with G1GC, so
> only the 32-bit variant is affected. Client or Server VM makes no difference.
>
> To reproduce:
> - Use a 32-bit JDK 8 b78 or b79 (tested on 64-bit Linux, but this should not
>   matter)
> - Download the Lucene source code (e.g. the snapshot version we were testing
>   with:
>   https://builds.apache.org/job/Lucene-Artifacts-trunk/2212/artifact/lucene/dist/)
> - Change to the directory lucene/analysis/uima and run:
>         ant -Dargs="-server -XX:+UseG1GC" -Dtests.multiplier=3 -Dtests.jvms=1 test
> After a while the test framework prints "stalled" messages (because the child
> VM actually running the test no longer responds). The PID is also printed.
> If you try to get a stack trace or kill it, there is no response; only
> kill -9 helps. Choosing another garbage collector in the above command line
> makes the test finish after a few seconds, e.g.
> -Dargs="-server -XX:+UseConcMarkSweepGC"
>
> I am posting this bug report directly to the mailing list because, with
> earlier bug reports, there seems to be a problem with bugs.sun.com: there is
> no response from any reviewer after several weeks, and we were able to help
> find and fix javadoc and javac compiler bugs early. So I hope you can help
> with this bug, too.
>
> Uwe
>
> -----
> Uwe Schindler
> uschind...@apache.org
> Apache Lucene PMC Member / Committer
> Bremen, Germany
> http://lucene.apache.org/
>
>

