Hi Uwe,

Thanks. In the meantime I'm going to ask within Oracle if anyone has the magic formula for the proxy settings.

There might also be another test case I can try. IIRC we had another application that overflowed during reference processing (if I specified a mark stack size of 16K). I'm going to try that.

I'm fairly certain I have the fix - just want to verify it. I know you offered, but I don't think we can send out under-the-table binaries, though we can provide patches.

JohnC

On 3/6/2013 10:44 PM, Uwe Schindler wrote:

Hi John,

I only have time to work on a setup this evening German time, because I am on a business trip today. I will come back to you. Unfortunately I failed to quickly set up an easy classpath without Ivy downloading the JARs.

Uwe

-----

Uwe Schindler

uschind...@apache.org

Apache Lucene PMC Member / Committer

Bremen, Germany

http://lucene.apache.org/

*From:*John Cuthbertson [mailto:john.cuthbert...@oracle.com]
*Sent:* Thursday, March 07, 2013 12:49 AM
*To:* Uwe Schindler
*Cc:* 'Bengt Rutisson'; hotspot-gc-...@openjdk.java.net; dev@lucene.apache.org
*Subject:* Re: JVM hanging when using G1GC on JDK8 b78 or b79 (Linux 32 bit)

Hi Uwe,

An update:

I have downloaded ant and the Lucene source.

I attempted the ivy-bootstrap but it failed to download the ivy-2.3.0.jar file - even after setting:

ANT_OPTS=-Dhttp.proxyHost=<...> -Dhttp.proxyPort=<...>

So I manually downloaded and placed it into the ANT library and now get:


ivy-bootstrap1:
    [mkdir] Skipping /home/jcuthber/.ant/lib because it already exists.
     [echo] installing ivy 2.3.0 to /home/jcuthber/.ant/lib
      [get] Getting: http://repo1.maven.org/maven2/org/apache/ivy/ivy/2.3.0/ivy-2.3.0.jar
      [get] To: /home/jcuthber/.ant/lib/ivy-2.3.0.jar
      [get] Error getting http://repo1.maven.org/maven2/org/apache/ivy/ivy/2.3.0/ivy-2.3.0.jar to /home/jcuthber/.ant/lib/ivy-2.3.0.jar
[available] Found: /home/jcuthber/.ant/lib/ivy-2.3.0.jar

ivy-bootstrap2:
Skipped because property 'ivy.bootstrap1.success' set.

ivy-checksum:

ivy-bootstrap:

BUILD SUCCESSFUL
Total time: 3 minutes 46 seconds

Presumably I have to build the Lucene source before executing the tests. That seemed to go OK.

When I run the analysis/uima tests it seems to get hung up at the "resolve" target - even without specifying G1:


cairnapple{jcuthber}:408> cd analysis/uima/
cairnapple{jcuthber}:409> ls -l
total 29
-rw-r--r--   1 jcuthber staff       1473 Dec 10 10:39 build.xml
-rw-rw-r--   1 jcuthber staff       6895 Mar  6 15:20 hotspot.log
-rw-r--r--   1 jcuthber staff       1316 Mar 30  2012 ivy.xml
drwxr-xr-x   2 jcuthber staff          2 Mar  5 07:42 lib/
drwxr-xr-x   6 jcuthber staff          6 Mar  5 07:42 src/



ivy-configure:
[ivy:configure] Loading jar:file:/home/jcuthber/.ant/lib/ivy-2.3.0.jar!/org/apache/ivy/core/settings/ivy.properties
[ivy:configure] :: Apache Ivy 2.3.0 - 20130110142753 :: http://ant.apache.org/ivy/ ::
[ivy:configure] jakarta commons httpclient not found: using jdk url handling
[ivy:configure] :: loading settings :: file = /export/bugs/8009536/lucene-5.0-2013-03-05_15-37-06/ivy-settings.xml
[ivy:configure] no default ivy user dir defined: set to /home/jcuthber/.ivy2
[ivy:configure] including url: jar:file:/home/jcuthber/.ant/lib/ivy-2.3.0.jar!/org/apache/ivy/core/settings/ivysettings-public.xml
[ivy:configure] no default cache defined: set to /home/jcuthber/.ivy2/cache
[ivy:configure] including url: jar:file:/home/jcuthber/.ant/lib/ivy-2.3.0.jar!/org/apache/ivy/core/settings/ivysettings-shared.xml
[ivy:configure] including url: jar:file:/home/jcuthber/.ant/lib/ivy-2.3.0.jar!/org/apache/ivy/core/settings/ivysettings-local.xml
[ivy:configure] including url: jar:file:/home/jcuthber/.ant/lib/ivy-2.3.0.jar!/org/apache/ivy/core/settings/ivysettings-main-chain.xml
[ivy:configure] settings loaded (289ms)
[ivy:configure]         default cache: /home/jcuthber/.ivy2/cache
[ivy:configure]         default resolver: default
[ivy:configure]         -- 7 resolvers:
[ivy:configure]         working-chinese-mirror [ibiblio]
[ivy:configure]         main [chain] [shared, public]
[ivy:configure]         local [file]
[ivy:configure]         shared [file]
[ivy:configure]         sonatype-releases [ibiblio]
[ivy:configure]         public [ibiblio]
[ivy:configure]         default [chain] [local, main, sonatype-releases, working-chinese-mirror]

resolve:
[ivy:retrieve] no resolved descriptor found: launching default resolve
Overriding previous definition of property "ivy.version"
[ivy:retrieve] using ivy parser to parse file:/export/bugs/8009536/lucene-5.0-2013-03-05_15-37-06/analysis/uima/ivy.xml
[ivy:retrieve] :: resolving dependencies :: org.apache.lucene#analyzers-uima;working@cairnapple
[ivy:retrieve]  confs: [default]
[ivy:retrieve]  validate = true
[ivy:retrieve]  refresh = false
[ivy:retrieve] resolving dependencies for configuration 'default'
[ivy:retrieve] == resolving dependencies for org.apache.lucene#analyzers-uima;working@cairnapple [default]
[ivy:retrieve] == resolving dependencies org.apache.lucene#analyzers-uima;working@cairnapple->org.apache.uima#Tagger;2.3.1 [default->*]
[ivy:retrieve] default: Checking cache for: dependency: org.apache.uima#Tagger;2.3.1 {*=[*]}
[ivy:retrieve] don't use cache for org.apache.uima#Tagger;2.3.1: checkModified=true
[ivy:retrieve] tried /home/jcuthber/.ivy2/local/org.apache.uima/Tagger/2.3.1/ivys/ivy.xml
[ivy:retrieve] tried /home/jcuthber/.ivy2/local/org.apache.uima/Tagger/2.3.1/jars/Tagger.jar
[ivy:retrieve] local: no ivy file nor artifact found for org.apache.uima#Tagger;2.3.1
[ivy:retrieve] main: Checking cache for: dependency: org.apache.uima#Tagger;2.3.1 {*=[*]}
[ivy:retrieve] tried /home/jcuthber/.ivy2/shared/org.apache.uima/Tagger/2.3.1/ivys/ivy.xml
[ivy:retrieve] tried /home/jcuthber/.ivy2/shared/org.apache.uima/Tagger/2.3.1/jars/Tagger.jar
[ivy:retrieve] shared: no ivy file nor artifact found for org.apache.uima#Tagger;2.3.1
[ivy:retrieve] tried http://repo1.maven.org/maven2/org/apache/uima/Tagger/2.3.1/Tagger-2.3.1.pom

and there it hangs - presumably trying to access http://repo1.maven.org/maven2/org/apache/uima/Tagger/2.3.1/Tagger-2.3.1.pom

There must be something in our proxy settings that won't allow this.
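
For reference, a minimal sketch of passing proxy settings to the JVM that runs Ant. The host and port below are placeholders (assumptions, not values from this thread). http.proxyHost/http.proxyPort and their https.* counterparts are standard Java networking properties; whether they are sufficient for a particular corporate proxy is not something this thread establishes.

```shell
# Placeholder proxy values (hypothetical); replace with your site's settings.
PROXY_HOST="proxy.example.com"
PROXY_PORT="8080"

# Set the proxy for both http and https URLs, since resolvers may use either scheme.
ANT_OPTS="-Dhttp.proxyHost=${PROXY_HOST} -Dhttp.proxyPort=${PROXY_PORT}"
ANT_OPTS="${ANT_OPTS} -Dhttps.proxyHost=${PROXY_HOST} -Dhttps.proxyPort=${PROXY_PORT}"
export ANT_OPTS

# Show what Ant's JVM will receive; then run, for example: ant ivy-bootstrap
echo "ANT_OPTS=${ANT_OPTS}"
```

Ant also accepts an -autoproxy option that asks the JVM to use the operating system's proxy configuration, which may be worth trying if the explicit properties do not help.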

JohnC


On 03/06/13 11:15, Uwe Schindler wrote:

Hi,
That's unfortunately not so easy, because of project dependencies. To run the test you have to compile Lucene Core, then the specific module plus the test framework (which is special for Lucene), and download some JARs from Maven Central (JAR hell, as usual).
If you give me some time, I will collect all needed JAR files from my local checkout and provide you the correct command line plus a ZIP file, maybe with a shell script to start it up. It should be doable, but needs some work to collect all dependencies for the classpath.
If you want to do it quicker (should be quite fast to do):
- Download the ANT 1.8.2 binary archive (unfortunately ANT 1.8.4 has a bug that keeps it from working out of the box with Java 8): http://archive.apache.org/dist/ant/binaries/apache-ant-1.8.2-bin.tar.gz - I just wonder: isn't ANT needed to build the JDK class library itself? I remember that the FreeBSD OpenJDK build downloads ANT and does a large part of the compilation using ANT...
- Put the ANT bin/ dir into your PATH.
- Download the Apache Lucene source code from Jenkins: https://builds.apache.org/job/Lucene-Artifacts-trunk/2212/artifact/lucene/dist/lucene-5.0-2013-03-05_15-37-06-src.tgz
- Go to the extracted Lucene source dir and call "ant ivy-bootstrap" (this will download Apache Ivy, so all dependencies can be downloaded from Maven Central).
- Change to the module that fails: # cd analysis/uima
- Execute: # ant -Dargs="-server -XX:+UseG1GC" -Dtests.multiplier=3 -Dtests.jvms=1 test
- In a parallel console you might be able to attach to the process. The build in the main console runs inside ANT, and the test framework spawns separate worker JVM instances to execute the tests. This makes it hard to reproduce standalone (the command line passed to the child JVM is veeeeery long).
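
The steps above can be sketched as a small dry-run script. It assumes Ant is already on the PATH and the source archive has been extracted; the directory name is the one that appears in the build output elsewhere in this thread, and the run helper is a hypothetical convenience, not part of Lucene's build. Nothing is executed unless RUN=1 is set.

```shell
#!/bin/sh
# Dry-run sketch of the reproduction steps; commands are only printed unless RUN=1.
LUCENE_DIR="lucene-5.0-2013-03-05_15-37-06"   # assumed name of the extracted source dir
GC_ARGS="-server -XX:+UseG1GC"                # JVM arguments passed to the test JVMs

# Print a command, and execute it only when RUN=1.
run() {
    echo "+ $*"
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    fi
}

run cd "$LUCENE_DIR"
run ant ivy-bootstrap            # fetches Apache Ivy so dependencies can be resolved
run cd analysis/uima             # the module whose tests hang
run ant "-Dargs=${GC_ARGS}" -Dtests.multiplier=3 -Dtests.jvms=1 test
```

Changing GC_ARGS to "-server -XX:+UseConcMarkSweepGC" gives the comparison run mentioned in the original report.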
I will work on putting together a precompiled ZIP file with all needed JARs plus the command line. Just tell me if you managed with the above howto; then I don't need to do this.
Uwe
-----
Uwe Schindler
uschind...@apache.org
Apache Lucene PMC Member / Committer
Bremen, Germany
http://lucene.apache.org/
    -----Original Message-----

    From: John Cuthbertson [mailto:john.cuthbert...@oracle.com]

    Sent: Wednesday, March 06, 2013 7:51 PM

    To: Uwe Schindler

    Cc: 'Bengt Rutisson'; hotspot-gc-...@openjdk.java.net; dev@lucene.apache.org

    Subject: Re: JVM hanging when using G1GC on JDK8 b78 or b79 (Linux 32 bit)

    Hi Uwe,

    I've downloaded lucene-5.0-2013-03-05_15-37-06.zip from

    https://builds.apache.org/job/Lucene-Artifacts-trunk/2212/artifact/lucene/dist/

    I don't have ant on my workstation, so do you have a java command line to run the test(s) that generate the error?

    Thanks,

    JohnC

    On 3/6/2013 3:16 AM, Uwe Schindler wrote:

        Hi,

            I think this is a VM bug and the thread dumps that Uwe produced are enough to start tracking down the root cause.

        I hope it is enough! If I can help with more details, tell me what I should do to track this down. Unfortunately, we have no isolated test case (like a small Java class that triggers this bug) - you have to run the test cases of this Lucene module. It only happens there, not in any other Lucene test suite. It may be caused by a lot of GC activity in this "UIMA" module or a specific test.

            On 3/6/13 8:52 AM, David Holmes wrote:

                If the VM is completely unresponsive then it suggests we are at a safepoint.

            Yes, we are hanging during a stop-the-world GC, so we are at a safepoint.

                The GC threads are not "hung" in os::park(), they are parked - waiting to be notified of something.

            It looks like the reference processing thread is stuck in a loop where it does wait(). So, the VM is hanging even if that stack trace also ends up in os::park().

                The thing is to find out why they are not being woken up.

            Actually, in this case we should probably not even be calling wait...

                Can the gdb log be posted somewhere? I don't know if the attachment made it to the original posting on hotspot-gc but it's no longer available on hotspot-dev.

            I received the attachment with the original email. I've attached it to the bug report that I created: 8009536. You can find it there if you want to. But I think we have a fairly good idea of what change caused the hang.

        If it helps: Unfortunately, we had some problems with recent JDK builds, because the javac and javadoc tools were not working correctly and failed to build our source code. This was fixed in b78. Until then, we used build b65 (the last one that worked), and the G1GC hangs did not appear with that version. So the hang must have been introduced by a change after b65 and up to b78.

        Uwe

            Bengt

                Thanks,

                David

                On 6/03/2013 4:07 PM, Krystal Mok wrote:

                    Hi Uwe,

                    If you can attach gdb to it, then jstack -m and jstack -F should also work; that'll get you the Java stack trace.

                    (But it probably doesn't matter in this case, because the hang is probably a bug in the VM.)

                    - Kris
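
A sketch of collecting the stack traces discussed above from a hung JVM. The function name dump_stacks and the output file names are hypothetical. gdb's "thread apply all bt" and the -m / -F options of JDK 8's jstack are real; whether jstack can attach to this particular hung JVM is exactly what is in question in this thread, so each tool writes to its own file and a failure of one does not stop the others.

```shell
# Collect native and Java-level stack traces for a given PID (hypothetical helper).
# Usage: dump_stacks <pid> [output-dir]
dump_stacks() {
    pid="$1"
    out="${2:-.}"
    # Native stacks of all threads via gdb, in batch (non-interactive) mode.
    gdb -p "$pid" -batch -ex "thread apply all bt" > "$out/gdb-$pid.txt" 2>&1
    # Mixed-mode (Java and native frames) stack trace.
    jstack -m "$pid" > "$out/jstack-m-$pid.txt" 2>&1
    # Forced stack dump, for a process that does not respond to a normal request.
    jstack -F "$pid" > "$out/jstack-F-$pid.txt" 2>&1
}
```

The PID to pass is the one the test framework prints in its "stalled" messages.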

                    On Wed, Mar 6, 2013 at 5:48 AM, Uwe Schindler <uschind...@apache.org> wrote:

                        Hi,

                        For a few months we have been extensively testing various preview builds of JDK 8 for compatibility with Apache Lucene and Solr, so we can find any bugs early and prevent the problems we had with the release of Java 7 two years ago. Currently we have a Linux (Ubuntu 64bit) Jenkins machine that has various JDKs (JDK 6, JDK 7, JDK 8 snapshot, IBM J9, older JRockit) installed, choosing a different one with different HotSpot and garbage collector settings on every run of the test suite (which takes approx. 30-45 minutes).

                        JDK 8 b79 works very well so far on Linux. We found some strange behavior in early versions (maybe compiler errors), but no longer at the moment. There is one configuration that constantly and reproducibly hangs in one module that is tested: the configuration uses JDK 8 b79 (same for b78), 32 bit, and G1GC (server or client does not matter). The JVM running the tests hangs and becomes unresponsive (jstack or kill -3 have no effect/cannot connect, standard kill does not stop it, only kill -9 actually kills it). It can be reproduced in this Lucene module 100% (it always hangs).

                        I was able to connect with GDB to the JVM and get a stack trace on all threads (see attachment, dump.txt). As you see, all threads of G1GC seem to hang in a syscall (os::park(), a conditional wait in the pthread library). Unfortunately that’s all I can give you. A Java stack trace is not possible because the JVM reacts to neither kill -3 nor jstack. With all other garbage collectors it passes the test without hangs in a few seconds; with 32 bit G1GC it can stand still for hours. The 64 bit JVM passes with G1GC, so only the 32 bit variant is affected. Client or Server VM makes no difference.

                        To reproduce:

                        - Use a 32 bit JDK 8 b78 or b79 (tested on Linux 64 bit, but this should not matter)

                        - Download the Lucene source code (e.g. the snapshot version we were testing with: https://builds.apache.org/job/Lucene-Artifacts-trunk/2212/artifact/lucene/dist/)

                        - Change to directory lucene/analysis/uima and run:

                                   ant -Dargs="-server -XX:+UseG1GC" -Dtests.multiplier=3 -Dtests.jvms=1 test

                        After a while the test framework prints "stalled" messages (because the child VM actually running the test no longer responds). The PID is also printed. Try to get a stack trace or kill it: no response. Only kill -9 helps. Choosing another garbage collector in the above command line makes the test finish after a few seconds, e.g. -Dargs="-server -XX:+UseConcMarkSweepGC"

                        I posted this bug report directly to the mailing list, because with earlier bug reports there seemed to be a problem with bugs.sun.com - there was no response from any reviewer after several weeks - and through the list we were able to help find and fix javadoc and javac compiler bugs early. So I hope you can help with this bug, too.

                        Uwe

                        -----

                        Uwe Schindler

                        uschind...@apache.org

                        Apache Lucene PMC Member / Committer

                        Bremen, Germany

                        http://lucene.apache.org/

