On 6/03/2013 5:55 PM, Dawid Weiss wrote:

Here you go:
http://pastebin.com/raw.php?i=b2PHLm1e

Thanks. I would have to say this seems to be the suspicious part:

Thread 22 (Thread 0xf20ffb40 (LWP 22939)):
#0  0xf7743430 in __kernel_vsyscall ()
#1  0xf771e96b in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/i386-linux-gnu/libpthread.so.0
#2  0xf6ec849c in os::PlatformEvent::park() ()
from /var/lib/jenkins/tools/java/32bit/jdk1.8.0-ea-b79/jre/lib/i386/server/libjvm.so
#3  0xf6e98b82 in Monitor::IWait(Thread*, long long) ()
from /var/lib/jenkins/tools/java/32bit/jdk1.8.0-ea-b79/jre/lib/i386/server/libjvm.so
#4  0xf6e99370 in Monitor::wait(bool, long, bool) ()
from /var/lib/jenkins/tools/java/32bit/jdk1.8.0-ea-b79/jre/lib/i386/server/libjvm.so
#5  0xf6b5fb16 in SuspendibleThreadSet::join() ()
from /var/lib/jenkins/tools/java/32bit/jdk1.8.0-ea-b79/jre/lib/i386/server/libjvm.so
#6  0xf6b5ea41 in ConcurrentG1RefineThread::run_young_rs_sampling() ()
from /var/lib/jenkins/tools/java/32bit/jdk1.8.0-ea-b79/jre/lib/i386/server/libjvm.so
#7  0xf6b5ef91 in ConcurrentG1RefineThread::run() ()

The suspendible thread set logic looks "tricky". Time for the G1 experts to take over. :)
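
For readers unfamiliar with this kind of protocol, here is a minimal Java sketch of a "suspendible set", assuming a simple monitor-based design. It is not HotSpot's SuspendibleThreadSet code: the class name, fields, and the timeout parameter are hypothetical and exist only for illustration. It shows how a worker calling join() ends up waiting on a monitor for as long as a suspension request stays in effect.

```java
// Hypothetical illustration only; NOT the HotSpot implementation.
final class SuspendibleSetSketch {
    private final Object lock = new Object();
    private boolean suspendRequested = false; // set by the coordinator
    private int active = 0;                   // workers currently in the set

    /**
     * Worker side: wait while a suspension is requested, then join the set.
     * Returns false if still suspended after timeoutMillis. The timeout is an
     * assumption added so the sketch can be tested; a join without it would
     * simply keep waiting.
     */
    boolean join(long timeoutMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        synchronized (lock) {
            try {
                while (suspendRequested) {
                    long remainingMs = (deadline - System.nanoTime()) / 1_000_000L;
                    if (remainingMs <= 0) {
                        return false; // nobody called resumeAll() in time
                    }
                    lock.wait(remainingMs); // the thread is parked here
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
            active++;
            return true;
        }
    }

    /** Worker side: leave the set and wake a coordinator waiting for it to drain. */
    void leave() {
        synchronized (lock) {
            active--;
            lock.notifyAll();
        }
    }

    /** Coordinator side: request suspension and wait until no worker is active. */
    void suspendAll() {
        synchronized (lock) {
            suspendRequested = true;
            try {
                while (active > 0) {
                    lock.wait();
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
    }

    /** Coordinator side: clear the request and wake every waiting worker. */
    void resumeAll() {
        synchronized (lock) {
            suspendRequested = false;
            lock.notifyAll();
        }
    }
}
```

In this sketch, a worker that calls join() after suspendAll() stays in wait until resumeAll() runs. If the resume step were skipped or its notification lost, the worker would remain parked indefinitely, which would look much like frames #1 to #5 above. Whether that is what happens in the VM here is for the G1 experts to confirm.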

David

Dawid

On Wed, Mar 6, 2013 at 8:52 AM, David Holmes <david.hol...@oracle.com> wrote:

    If the VM is completely unresponsive then it suggests we are at a
    safepoint.

    The GC threads are not "hung" in os::park, they are parked - waiting
    to be notified of something.

    The thing is to find out why they are not being woken up.

    Can the gdb log be posted somewhere? I don't know if the attachment
    made it to the original posting on hotspot-gc but it's no longer
    available on hotspot-dev.

    Thanks,
    David


    On 6/03/2013 4:07 PM, Krystal Mok wrote:

        Hi Uwe,

        If you can attach gdb to it, then jstack -m and jstack -F
        should also work; that'll get you the Java stack trace.
        (But it probably doesn't matter in this case, because the hang is
        probably a bug in the VM.)

        - Kris

        On Wed, Mar 6, 2013 at 5:48 AM, Uwe Schindler
        <uschind...@apache.org> wrote:

            Hi,

            For a few months we have been extensively testing various preview
            builds of JDK 8 for compatibility with Apache Lucene and
            Solr, so we can find any bugs early and prevent the problems
            we had with the release of Java 7 two years ago. Currently
            we have a Linux (Ubuntu 64bit) Jenkins machine that has
            various JDKs (JDK 6, JDK 7, JDK 8 snapshot, IBM J9, older
            JRockit) installed, choosing a different one with different
            hotspot and garbage collector settings on every run of the
            test suite (which takes approx. 30-45 minutes).

            JDK 8 b79 has worked very well on Linux so far. We found some
            strange behavior in earlier builds (maybe compiler errors),
            but none at the moment. One configuration, however,
            constantly and reproducibly hangs in one of the tested
            modules: JDK 8 b79 (same for b78), 32 bit, with G1GC (server
            or client does not matter). The JVM running the tests hangs
            and becomes unresponsive (jstack and kill -3 have no effect
            or cannot connect, a standard kill does not stop it, only
            kill -9 actually kills it). The hang can be reproduced in
            this Lucene module 100% of the time.

            I was able to connect with GDB to the JVM and get a stack
            trace of all threads (see attachment, dump.txt). As you can
            see, all G1GC threads seem to hang in a syscall
            (os::PlatformEvent::park(), a condition wait in the pthread
            library). Unfortunately that’s all I can give you. A Java
            stack trace is not possible because the JVM reacts to
            neither kill -3 nor jstack. With all other garbage
            collectors the test passes without hangs in a few seconds;
            with 32 bit G1GC it can stand still for hours. The 64 bit
            JVM passes with G1GC, so only the 32 bit variant is
            affected. Client or Server VM makes no difference.

            To reproduce:
            - Use a 32 bit JDK 8 b78 or b79 (tested on Linux 64 bit, but
            this should not matter)
            - Download Lucene Source code (e.g. the snapshot version we
            were testing with:
            https://builds.apache.org/job/Lucene-Artifacts-trunk/2212/artifact/lucene/dist/)
            - change to directory lucene/analysis/uima and run:
                      ant -Dargs="-server -XX:+UseG1GC" -Dtests.multiplier=3 -Dtests.jvms=1 test
            After a while the test framework prints "stalled" messages
            (because the child VM actually running the test no longer
            responds). The PID is also printed. Attempts to get a stack
            trace or to kill it get no response. Only kill -9 helps. Choosing
            another garbage collector in the above command line makes
            the test finish after a few seconds, e.g. -Dargs="-server
            -XX:+UseConcMarkSweepGC"

            I posted this bug report directly to the mailing list,
            because with earlier bug reports there seemed to be a problem
            with bugs.sun.com: there was no response from any reviewer
            after several weeks, and we were able to help find and fix
            javadoc and javac compiler bugs early. So I hope you can
            help with this bug, too.

            Uwe

            -----
            Uwe Schindler
            uschind...@apache.org
            Apache Lucene PMC Member / Committer
            Bremen, Germany
            http://lucene.apache.org/




