OOMEInReferenceHandler.java fails intermittently

Peter Levart Sat, 21 Dec 2013 08:51:20 -0800

Hi David,

Is it possible to get the test output when it fails? It can fail in two different ways. I can't look at the bug (not authorized)...



On 12/20/2013 10:54 AM, Chris Hegarty wrote:

On 20 Dec 2013, at 04:33, Mandy Chung <mandy.ch...@oracle.com> wrote:

Hi Srikalyan,

Maybe you can add an uncaught exception handler to see if you can get
any information.

+1. With this, at least the next time we see this failure we should have a 
better idea where the OOM is coming from.

-Chris.

We can try, but I think the VM already prints the stack trace of the exception by default and, as far as I remember, an OOME thrown by the VM is preallocated and does not contain a stack trace. So I suspect we'll see nothing more with the suggested UEH.

Is it possible to include in the test a modified version of the Reference class that would be prepended to the boot classpath? For example, one containing the following ReferenceHandler:



    private static class ReferenceHandler extends Thread {

        ReferenceHandler(ThreadGroup g, String name) {
            super(g, name);
        }

        private volatile int state;

        @Override
        public String toString() {
            return super.toString() + "[state=" + state + "]";
        }

        public void run() {
            for (;;) {
                state = 1;
                Reference<Object> r;
                state = 2;
                synchronized (lock) {
                    state = 3;
                    if (pending != null) {
                        state = 4;
                        r = pending;
                        state = 5;
                        pending = r.discovered;
                        state = 6;
                        r.discovered = null;
                        state = 7;
                    } else {
                        state = 8;
                        // The waiting on the lock may cause an OOME because it may try to allocate
                        // exception objects, so also catch OOME here to avoid silent exit of the
                        // reference handler thread.
                        //
                        // Explicitly define the order of the two exceptions we catch here
                        // when waiting for the lock.
                        //
                        // We do not want to try to potentially load the InterruptedException class
                        // (which would be done if this was its first use, and InterruptedException
                        // were checked first) in this situation.
                        //
                        // This may lead to the VM not ever trying to load the InterruptedException
                        // class again.
                        try {
                            state = 9;
                            try {
                                state = 10;
                                lock.wait();
                                state = 11;
                            } catch (InterruptedException x) { state = 12; }
                            state = 13;
                        } catch (OutOfMemoryError x) { state = 14; }
                        state = 15;
                        continue;
                    }
                    state = 16;
                }
                state = 17;

                // Fast path for cleaners
                if (r instanceof Cleaner) {
                    state = 18;
                    ((Cleaner)r).clean();
                    state = 19;
                    continue;
                }
                state = 20;

                ReferenceQueue<Object> q = (ReferenceQueue) r.queue;
                state = 21;
                if (q != ReferenceQueue.NULL) q.enqueue(r);
                state = 22;
            }
        }
    }

...then just include the toString of the referenceHandlerThread instance as part of the exception message at the end of the test:


...
...
         // wait at most 10 seconds for success or failure
         for (int i = 0; i < 20; i++) {
             if (refQueue.poll() != null) {
                 // Reference Handler thread still working -> success
                 return;
             }
             System.gc();
             // wait a little to allow GC to do its work before allocating objects
             Thread.sleep(500L);
             if (!referenceHandlerThread.isAlive()) {
                 // Reference Handler thread died -> failure
                 throw new Exception("Reference Handler thread died. referenceHandlerThread: "
                         + referenceHandlerThread);
             }
         }

         // no sure answer after 10 seconds
         throw new IllegalStateException("Reference Handler thread stuck. weakRef.get(): "
                 + weakRef.get() + ", referenceHandlerThread: " + referenceHandlerThread);

This might be safer than using a UEH, since at the time UEH.uncaughtException() is called the heap might still be full, which would prevent printing the message. The test makes sure the allocated waste gets GCed before reporting the outcome...

I suspect that with the above, the failure message would print 8 <= state <= 14 ...

When I was trying out the OOMEInReferenceHandler test, I experimented with various arrangements of exception handlers in ReferenceHandler and encountered one that on one occasion allowed an OOME to sneak through or re-induced it, but I haven't been able to explain why and could not reproduce it afterwards. So it might still be that something interesting is happening between state 8 and 14.
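
For readers who want to try the state-tracking idea outside the JDK, here is a minimal standalone sketch. It is not the proposed Reference patch: the class names StateTrackingDemo and TrackedWorker are hypothetical, and the OutOfMemoryError is simulated with an explicit throw rather than produced by the VM.

```java
public class StateTrackingDemo {

    /** Hypothetical worker that records its progress in a volatile field. */
    static class TrackedWorker extends Thread {
        private volatile int state;

        TrackedWorker() {
            super("tracked-worker");
        }

        @Override
        public String toString() {
            // Append the last recorded state, as in the ReferenceHandler above.
            return super.toString() + "[state=" + state + "]";
        }

        @Override
        public void run() {
            state = 1;
            failingStep();
            state = 2; // not reached, because failingStep() always throws
        }

        private static void failingStep() {
            // Simulated failure; a real OOME would be raised by the VM instead.
            throw new OutOfMemoryError("simulated");
        }
    }

    /** Starts the worker, waits for it to die, and returns its description. */
    static String describeAfterDeath() {
        TrackedWorker worker = new TrackedWorker();
        // Silence the default stack-trace printing to keep the demo output short.
        worker.setUncaughtExceptionHandler((t, e) -> { });
        worker.start();
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return worker.toString();
    }

    public static void main(String[] args) {
        // The printed description ends with "[state=1]".
        System.out.println(describeAfterDeath());
    }
}
```

Because the state is read only after the worker has terminated, no allocation is needed inside the failing thread itself, which is the property the approach relies on.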


Regards, Peter

I ran it 1000 times but was not able to reproduce
the failure.  Did you run it with jtreg (I didn't)?

Below is the patch to install a thread's uncaught handler that
you can take and try.

diff --git a/test/java/lang/ref/OOMEInReferenceHandler.java b/test/java/lang/ref/OOMEInReferenceHandler.java
--- a/test/java/lang/ref/OOMEInReferenceHandler.java
+++ b/test/java/lang/ref/OOMEInReferenceHandler.java
@@ -51,6 +51,14 @@
          return first;
      }

+     static class UEH implements Thread.UncaughtExceptionHandler {
+         public void uncaughtException(Thread t, Throwable e) {
+             System.err.println("ERROR: " + t.getName() + " exception " +
+                 e.getMessage());
+             e.printStackTrace();
+         }
+     }
+
      public static void main(String[] args) throws Exception {
          // preinitialize the InterruptedException class so that the reference handler
          // does not die due to OOME when loading the class if it is the first use
@@ -77,6 +85,8 @@
              throw new IllegalStateException("Couldn't find Reference Handler thread.");
          }

+         referenceHandlerThread.setUncaughtExceptionHandler(new UEH());
+
          ReferenceQueue<Object> refQueue = new ReferenceQueue<>();
          Object referent = new Object();
          WeakReference<Object> weakRef = new WeakReference<>(referent, refQueue);
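
The effect of the patch can be tried in isolation with the sketch below. It is only an illustration under stated assumptions: UncaughtHandlerDemo, findThread and the "victim" thread are hypothetical names, the error is simulated with an explicit throw, and the thread lookup (walking up to the root thread group) is one plausible way to locate a thread by name, not necessarily how the test does it.

```java
import java.util.concurrent.atomic.AtomicReference;

public class UncaughtHandlerDemo {

    /** Searches the root thread group for a live thread with the given name. */
    static Thread findThread(String name) {
        ThreadGroup root = Thread.currentThread().getThreadGroup();
        while (root.getParent() != null) {
            root = root.getParent();
        }
        // activeCount() is only an estimate, so leave some slack in the array.
        Thread[] threads = new Thread[root.activeCount() + 16];
        int n = root.enumerate(threads, true);
        for (int i = 0; i < n; i++) {
            if (name.equals(threads[i].getName())) {
                return threads[i];
            }
        }
        return null;
    }

    /** Runs a thread that dies with a simulated OOME and returns what the handler saw. */
    static String handlerSeesError() {
        AtomicReference<String> seen = new AtomicReference<>();
        Thread victim = new Thread(() -> {
            throw new OutOfMemoryError("simulated");
        }, "victim");
        victim.setUncaughtExceptionHandler((t, e) -> seen.set(t.getName() + ": " + e));
        victim.start();
        try {
            victim.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return seen.get();
    }

    public static void main(String[] args) {
        // Whether the thread is found depends on the JVM; the name is taken from the test.
        System.out.println("Reference Handler found: " + (findThread("Reference Handler") != null));
        System.out.println(handlerSeesError());
    }
}
```

Note that the handler here builds a new String, which is exactly the kind of allocation that could fail if the heap were still full when a real OOME is dispatched.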

On 12/19/2013 6:57 PM, srikalyan chandrashekar wrote:

Hi David, thanks for your comments. The unguarded part (clean and enqueue) in
the Reference Handler thread does not seem to create any new objects, so it is
the application (the test in this case) which is adding objects to the heap and
causing the Reference Handler to die with OOME. I am still unsure about the
side effects of the code change and agree with your thoughts (on the
reliability of memory exhaustion tests).

PS: hotspot dev alias removed from CC.

--
Thanks
kalyan

On 12/19/13 5:08 PM, David Holmes wrote:

Hi Kalyan,

This is not a hotspot issue so I'm moving this to core-libs, please drop 
hotspot from any replies.

On 20/12/2013 6:26 AM, srikalyan wrote:

Hi all, I have been working on bug JDK-8022321
<https://bugs.openjdk.java.net/browse/JDK-8022321>. This is a sporadic
failure and the webrev is available here:
http://cr.openjdk.java.net/~srikchan/Regression/JDK-8022321_OOMEInReferenceHandler-webrev/

I'm really not sure what to make of this. We have a test that triggers an 
out-of-memory condition but the OOME can actually turn up in the 
ReferenceHandler thread causing it to terminate and the test to fail. We 
previously accounted for the non-obvious occurrences of OOME due to the 
Object.wait and the possible need to load the InterruptedException class - but 
still the OOME can appear where we don't want it. So finally you have just 
placed the whole for(;;) loop in a try/catch(OOME) that ignores the OOME. I'm 
certain that makes the test happy, but I'm not sure it is really what we want 
for the ReferenceHandler thread. If the OOME occurs while cleaning, or 
enqueuing then we will fail to clean and/or enqueue but there would be no 
indication that has occurred and I think that is a bigger problem than this 
test failing.

There may be no way to make this test 100% reliable. In fact I'd suggest that 
no memory exhaustion test can be 100% reliable.

David

*"Root Cause: Still not known"*
2 places where there is a possibility for OOME
1) Cleaner.clean()
2) ReferenceQueue.enqueue()

1)  The cleanup code in turn has 2 places where there is potential for
throwing OOME,
     a) the thunk Runnable which is run from the clean() method. This Runnable is
passed to Cleaner and appears in the following classes
         java/nio/DirectByteBuffer.java
         sun/misc/Perf.java
         sun/nio/fs/NativeBuffer.java
         sun/nio/ch/IOVecWrapper.java
         sun/misc/Cleaner/ExitOnThrow.java
However none of the above overridden implementations ever create an
object in the clean() code.
     b) new PrivilegedAction created in try catch Exception block of
clean() method but for this object to be created and to be held
responsible for OOME an Exception(other than OOME) has to be thrown.

2) No new heap objects are created in the enqueue method nor anywhere in
the deep call stack (VM.addFinalRefCount() etc) so this cannot be a
potential cause.

*Experimental change to java.lang.Reference.java* :
- Put one more guard (try/catch with OOME block) in the Reference
Handler Thread which may give the Reference Handler a chance to clean up.
This fixes the test failure (several 1000 runs with 0 failures)
- Without the above change the test fails at least 3-5 times in every
1000 runs.
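
The shape of such a guard, and the concern raised about it in the reply quoted above, can be sketched as follows. This is an illustration only, not the webrev change: GuardedLoopDemo and runGuarded are hypothetical names, the guard is assumed to wrap each iteration, and the OOME is simulated with an explicit throw.

```java
import java.util.ArrayList;
import java.util.List;

public class GuardedLoopDemo {

    /** Runs every task; an OutOfMemoryError from a task is swallowed so the loop survives. */
    static int runGuarded(List<Runnable> tasks) {
        int completed = 0;
        for (Runnable task : tasks) {
            try {
                task.run();
                completed++;
            } catch (OutOfMemoryError ignored) {
                // The loop keeps going, but this task's work is lost without any trace.
            }
        }
        return completed;
    }

    public static void main(String[] args) {
        List<Runnable> tasks = new ArrayList<>();
        tasks.add(() -> { });
        tasks.add(() -> { throw new OutOfMemoryError("simulated"); });
        tasks.add(() -> { });
        // prints "2 of 3 tasks completed"
        System.out.println(runGuarded(tasks) + " of " + tasks.size() + " tasks completed");
    }
}
```

The loop survives the error, but one task is skipped and nothing reports it, which mirrors the worry that a clean or enqueue could be lost silently.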

*PS*: The code change is to a very critical part of the JDK and I am not
fully aware of the consequences of the change, hence seeking expert help
here. I appreciate your time and input on this.

Re: Analysis on JDK-8022321 java/lang/ref/OOMEInReferenceHandler.java fails intermittently
