On 20 Dec 2013, at 04:33, Mandy Chung <mandy.ch...@oracle.com> wrote:
> Hi Srikalyan,
>
> Maybe you can add an uncaught handler to see if you can get
> any information.

+1. With this, at least the next time we see this failure we should have a better idea where the OOM is coming from.

-Chris.

> I ran it 1000 times but was not able to duplicate
> the failure. Did you run it with jtreg (I didn't)?
>
> Below is the patch to install a thread's uncaught handler that
> you can take and try.
>
> diff --git a/test/java/lang/ref/OOMEInReferenceHandler.java b/test/java/lang/ref/OOMEInReferenceHandler.java
> --- a/test/java/lang/ref/OOMEInReferenceHandler.java
> +++ b/test/java/lang/ref/OOMEInReferenceHandler.java
> @@ -51,6 +51,14 @@
>           return first;
>       }
>
> +    static class UEH implements Thread.UncaughtExceptionHandler {
> +        public void uncaughtException(Thread t, Throwable e) {
> +            System.err.println("ERROR: " + t.getName() + " exception " +
> +                e.getMessage());
> +            e.printStackTrace();
> +        }
> +    }
> +
>       public static void main(String[] args) throws Exception {
>           // preinitialize the InterruptedException class so that the reference handler
>           // does not die due to OOME when loading the class if it is the first use
> @@ -77,6 +85,8 @@
>               throw new IllegalStateException("Couldn't find Reference Handler thread.");
>           }
>
> +        referenceHandlerThread.setUncaughtExceptionHandler(new UEH());
> +
>           ReferenceQueue<Object> refQueue = new ReferenceQueue<>();
>           Object referent = new Object();
>           WeakReference<Object> weakRef = new WeakReference<>(referent, refQueue);
>
> On 12/19/2013 6:57 PM, srikalyan chandrashekar wrote:
>> Hi David,
>>
>> Thanks for your comments. The unguarded part (clean and enqueue) in
>> the Reference Handler thread does not seem to create any new objects, so it
>> is the application (the test in this case) which is adding objects to the heap
>> and causing the Reference Handler to die with OOME. I am still unsure about
>> the side effects of the code change and agree with your thoughts (on memory
>> exhaustion test's reliability).
>>
>> PS: hotspot dev alias removed from CC.
>>
>> --
>> Thanks
>> kalyan
>>
>> On 12/19/13 5:08 PM, David Holmes wrote:
>>> Hi Kalyan,
>>>
>>> This is not a hotspot issue so I'm moving this to core-libs, please drop
>>> hotspot from any replies.
>>>
>>> On 20/12/2013 6:26 AM, srikalyan wrote:
>>>> Hi all, I have been working on the bug JDK-8022321
>>>> <https://bugs.openjdk.java.net/browse/JDK-8022321>, this is a sporadic
>>>> failure and the webrev is available here
>>>> http://cr.openjdk.java.net/~srikchan/Regression/JDK-8022321_OOMEInReferenceHandler-webrev/
>>>>
>>>
>>> I'm really not sure what to make of this. We have a test that triggers an
>>> out-of-memory condition but the OOME can actually turn up in the
>>> ReferenceHandler thread causing it to terminate and the test to fail. We
>>> previously accounted for the non-obvious occurrences of OOME due to the
>>> Object.wait and the possible need to load the InterruptedException class -
>>> but still the OOME can appear where we don't want it. So finally you have
>>> just placed the whole for(;;) loop in a try/catch(OOME) that ignores the
>>> OOME. I'm certain that makes the test happy, but I'm not sure it is really
>>> what we want for the ReferenceHandler thread. If the OOME occurs while
>>> cleaning, or enqueuing then we will fail to clean and/or enqueue but there
>>> would be no indication that has occurred and I think that is a bigger
>>> problem than this test failing.
>>>
>>> There may be no way to make this test 100% reliable. In fact I'd suggest
>>> that no memory exhaustion test can be 100% reliable.
>>>
>>> David
>>>
>>>> *"Root Cause: Still not known"*
>>>> 2 places where there is a possibility for OOME
>>>> 1) Cleaner.clean()
>>>> 2) ReferenceQueue.enqueue()
>>>>
>>>> 1) The cleanup code in turn has 2 places where there is potential for
>>>> throwing OOME,
>>>> a) thunk Thread which is run from clean() method.
>>>> This Runnable is
>>>> passed to Cleaner and appears in the following classes
>>>> java/nio/DirectByteBuffer.java
>>>> sun/misc/Perf.java
>>>> sun/nio/fs/NativeBuffer.java
>>>> sun/nio/ch/IOVecWrapper.java
>>>> sun/misc/Cleaner/ExitOnThrow.java
>>>> However none of the above overridden implementations ever create an
>>>> object in the clean() code.
>>>> b) new PrivilegedAction created in the try/catch Exception block of the
>>>> clean() method, but for this object to be created and to be held
>>>> responsible for the OOME an Exception (other than OOME) has to be thrown.
>>>>
>>>> 2) No new heap objects are created in the enqueue method nor anywhere in
>>>> the deep call stack (VM.addFinalRefCount() etc) so this cannot be a
>>>> potential cause.
>>>>
>>>> *Experimental change to java.lang.Reference.java*:
>>>> - Put one more guard (try/catch with OOME block) in the Reference
>>>> Handler Thread which may give the Reference Handler a chance to clean up.
>>>> This fixes the test failure (several 1000 runs with 0 failures).
>>>> - Without the above change the test fails at least 3-5 times in every
>>>> 1000 runs.
>>>>
>>>> *PS*: The code change is to a very critical part of the JDK and I am not
>>>> fully aware of the consequences of the change, hence seeking expert help
>>>> here. Appreciate your time and inputs towards this.
>>>>
>>
>
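
For anyone who wants to try the uncaught-handler idea outside the test, here is a minimal standalone sketch of the technique from the patch quoted above. It is not part of the proposed change. The class name UncaughtHandlerSketch, the RecordingHandler class, the runAndCapture helper and the thread name are all hypothetical, and the OutOfMemoryError is simulated with a plain throw rather than by exhausting the heap. Unlike the patch, it prints the Throwable itself rather than e.getMessage(), so the exception class name is shown as well.

```java
import java.util.concurrent.atomic.AtomicReference;

public class UncaughtHandlerSketch {

    /** Hypothetical handler: remembers the last uncaught throwable and prints it. */
    static class RecordingHandler implements Thread.UncaughtExceptionHandler {
        final AtomicReference<Throwable> last = new AtomicReference<>();

        @Override
        public void uncaughtException(Thread t, Throwable e) {
            last.set(e);
            // Printing e (not only e.getMessage()) also shows the exception class.
            System.err.println("ERROR: " + t.getName() + " exception " + e);
            e.printStackTrace();
        }
    }

    /**
     * Starts a worker thread that dies with the given error and returns
     * whatever the handler recorded. The error is simulated; no real
     * memory exhaustion takes place.
     */
    static Throwable runAndCapture(String threadName, Error toThrow)
            throws InterruptedException {
        RecordingHandler handler = new RecordingHandler();
        Thread worker = new Thread(() -> { throw toThrow; }, threadName);
        worker.setUncaughtExceptionHandler(handler);
        worker.start();
        worker.join(); // the handler runs on the dying thread before join() returns
        return handler.last.get();
    }

    public static void main(String[] args) throws InterruptedException {
        Throwable seen = runAndCapture("hypothetical-worker",
                                       new OutOfMemoryError("simulated"));
        System.out.println("handler saw: " + seen);
    }
}
```

The handler only reports; it does not keep the thread alive. A thread that dies with an uncaught OOME still terminates, so this helps with diagnosing where the error surfaces and does not address the reliability concern raised above.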