Kal, can you give Peter access to the machine where you ran this test? Please send the details to him privately.
Thanks,
Sandeep

On Jan 8, 2014, at 12:08 PM, srikalyan chandrashekar <srikalyan.chandrashe...@oracle.com> wrote:

> Hi Peter, the jtreg test configuration is @run main/othervm -Xmx24M
> -XX:-UseTLAB OOMEInReferenceHandler. With this option you still have to run
> the test many times (around 1000 runs) to capture one or more failures.
> The platform may not have an effect; however, I used a 64-bit Ubuntu 12.04 LTS,
> 8GB, 2-core workstation and any JDK (7/8).
>
> ---
> Thanks
> kalyan
>
> On 01/08/2014 05:53 AM, Peter Levart wrote:
>> Hi Kalyan,
>>
>> What hardware/OS/JVM and what JVM options are you using to reproduce this
>> failure? I would really like to reproduce this myself, but all attempts on
>> my PC have so far been unsuccessful. I might be able to get access to a
>> machine that is similar to yours...
>>
>> Regards, Peter
>>
>> On 01/07/2014 09:55 PM, srikalyan chandrashekar wrote:
>>> Peter, attempts to get state info out (to console or otherwise) from within
>>> the Reference Handler's exception handlers have been unsuccessful. However,
>>> David's suggestion produced a useful trace with a fast debug build and
>>> gave some information; see the log here
>>> <http://cr.openjdk.java.net/%7Esrikchan/OOME_exception_trace.log> .
>>> ---
>>> Thanks
>>> kalyan
>>> On 01/07/2014 12:42 AM, Peter Levart wrote:
>>>> On 01/07/2014 03:15 AM, srikalyan chandrashekar wrote:
>>>>> Sure David, will give that a try; we have so far attempted to
>>>>> 1. Print state data (as per the test creator peter.levart's inputs),
>>>>
>>>> Hi Kalyan,
>>>>
>>>> Have you been able to reproduce the OOME in that set-up? What was the
>>>> result?
>>>>
>>>> Regards, Peter
>>>>
>>>>> 2. Use UEH (uncaught exception handler, per Mandy's inputs)
>>>>>
>>>>> --
>>>>> Thanks
>>>>> kalyan
>>>>>
>>>>> On 1/6/14 4:40 PM, David Holmes wrote:
>>>>>> Back from vacation ...
>>>>>>
>>>>>> On 20/12/2013 4:49 PM, David Holmes wrote:
>>>>>>> On 20/12/2013 12:57 PM, srikalyan chandrashekar wrote:
>>>>>>>> Hi David, thanks for your comments. The unguarded part (clean and
>>>>>>>> enqueue) in the Reference Handler thread does not seem to create any
>>>>>>>> new objects, so it is the application (the test in this case) which is
>>>>>>>> adding objects to the heap and causing the Reference Handler to die
>>>>>>>> with OOME.
>>>>>>>
>>>>>>> The ReferenceHandler thread can only get OOME if it allocates (directly
>>>>>>> or indirectly) - so there has to be something in the unguarded part that
>>>>>>> causes this. Again it may be an implicit action in the VM - similar to
>>>>>>> the class load issue for InterruptedException.
>>>>>>
>>>>>> Run a debug VM with -XX:+TraceExceptions to see where the OOME is
>>>>>> triggered.
>>>>>>
>>>>>> David
>>>>>> -----
>>>>>>
>>>>>>> David
>>>>>>>
>>>>>>>> I am still unsure about the side effects of the code change and agree
>>>>>>>> with your thoughts (on the reliability of memory exhaustion tests).
>>>>>>>>
>>>>>>>> PS: hotspot dev alias removed from CC.
>>>>>>>>
>>>>>>>> --
>>>>>>>> Thanks
>>>>>>>> kalyan
>>>>>>>>
>>>>>>>> On 12/19/13 5:08 PM, David Holmes wrote:
>>>>>>>>> Hi Kalyan,
>>>>>>>>>
>>>>>>>>> This is not a hotspot issue so I'm moving this to core-libs, please
>>>>>>>>> drop hotspot from any replies.
>>>>>>>>>
>>>>>>>>> On 20/12/2013 6:26 AM, srikalyan wrote:
>>>>>>>>>> Hi all, I have been working on the bug JDK-8022321
>>>>>>>>>> <https://bugs.openjdk.java.net/browse/JDK-8022321> , this is a
>>>>>>>>>> sporadic failure and the webrev is available here
>>>>>>>>>> http://cr.openjdk.java.net/~srikchan/Regression/JDK-8022321_OOMEInReferenceHandler-webrev/
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> I'm really not sure what to make of this.
>>>>>>>>> We have a test that triggers
>>>>>>>>> an out-of-memory condition but the OOME can actually turn up in the
>>>>>>>>> ReferenceHandler thread causing it to terminate and the test to fail.
>>>>>>>>> We previously accounted for the non-obvious occurrences of OOME due to
>>>>>>>>> the Object.wait and the possible need to load the InterruptedException
>>>>>>>>> class - but still the OOME can appear where we don't want it. So
>>>>>>>>> finally you have just placed the whole for(;;) loop in a
>>>>>>>>> try/catch(OOME) that ignores the OOME. I'm certain that makes the test
>>>>>>>>> happy, but I'm not sure it is really what we want for the
>>>>>>>>> ReferenceHandler thread. If the OOME occurs while cleaning, or
>>>>>>>>> enqueuing then we will fail to clean and/or enqueue but there would be
>>>>>>>>> no indication that has occurred and I think that is a bigger problem
>>>>>>>>> than this test failing.
>>>>>>>>>
>>>>>>>>> There may be no way to make this test 100% reliable. In fact I'd
>>>>>>>>> suggest that no memory exhaustion test can be 100% reliable.
>>>>>>>>>
>>>>>>>>> David
>>>>>>>>>
>>>>>>>>>> *"Root Cause: Still not known"*
>>>>>>>>>> There are 2 places where there is a possibility for OOME:
>>>>>>>>>> 1) Cleaner.clean()
>>>>>>>>>> 2) ReferenceQueue.enqueue()
>>>>>>>>>>
>>>>>>>>>> 1) The cleanup code in turn has 2 places where there is potential
>>>>>>>>>> for throwing OOME,
>>>>>>>>>> a) the thunk which is run from the clean() method. This Runnable
>>>>>>>>>> is passed to Cleaner and appears in the following classes
>>>>>>>>>> java/nio/DirectByteBuffer.java
>>>>>>>>>> sun/misc/Perf.java
>>>>>>>>>> sun/nio/fs/NativeBuffer.java
>>>>>>>>>> sun/nio/ch/IOVecWrapper.java
>>>>>>>>>> sun/misc/Cleaner/ExitOnThrow.java
>>>>>>>>>> However none of the above overridden implementations ever create an
>>>>>>>>>> object in the clean() code.
>>>>>>>>>> b) a new PrivilegedAction created in the try/catch Exception block of
>>>>>>>>>> the clean() method, but for this object to be created and to be held
>>>>>>>>>> responsible for OOME an Exception (other than OOME) has to be thrown.
>>>>>>>>>>
>>>>>>>>>> 2) No new heap objects are created in the enqueue method nor
>>>>>>>>>> anywhere in the deep call stack (VM.addFinalRefCount() etc) so this
>>>>>>>>>> cannot be a potential cause.
>>>>>>>>>>
>>>>>>>>>> *Experimental change to java.lang.Reference.java* :
>>>>>>>>>> - Put one more guard (try/catch with OOME block) in the Reference
>>>>>>>>>> Handler Thread which may give the Reference Handler a chance to
>>>>>>>>>> clean up. This is fixing the test failure (several 1000 runs with
>>>>>>>>>> 0 failures).
>>>>>>>>>> - Without the above change the test fails at least 3-5 times for every
>>>>>>>>>> 1000 runs.
>>>>>>>>>>
>>>>>>>>>> *PS*: The code change is to a very critical part of the JDK and I am
>>>>>>>>>> not fully aware of the consequences of the change, hence seeking
>>>>>>>>>> expert help here. Appreciate your time and inputs towards this.
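
For readers following the thread, the guard under discussion (a try/catch for OOME wrapped around the handler's for(;;) loop body) might look roughly like the sketch below. This is only an illustration and not the code in the webrev or in java.lang.ref.Reference: the class name GuardedHandlerSketch, the pending queue, the swallowedOomes counter and the survivesSimulatedOome() helper are all hypothetical, and the OOME is thrown by hand rather than produced by real heap exhaustion. The counter is one possible answer to the concern that a swallowed OOME leaves no indication that a clean or enqueue step was skipped.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Illustrative sketch only; this is NOT java.lang.ref.Reference or the webrev.
class GuardedHandlerSketch {
    // Hypothetical stand-in for the pending-reference list.
    static final BlockingQueue<Runnable> pending = new LinkedBlockingQueue<>();
    // Counts OOMEs that the guard swallowed, so they do not vanish silently.
    static final AtomicInteger swallowedOomes = new AtomicInteger();

    static Thread startHandler() {
        Thread t = new Thread(() -> {
            for (;;) {
                try {
                    // Stand-in for the clean/enqueue step.
                    pending.take().run();
                } catch (OutOfMemoryError oome) {
                    // Guard: keep the handler thread alive and record the event.
                    swallowedOomes.incrementAndGet();
                } catch (InterruptedException ie) {
                    return; // allows an orderly shutdown in this sketch
                }
            }
        }, "Guarded Handler Sketch");
        t.setDaemon(true);
        t.start();
        return t;
    }

    // Feeds one task that throws a simulated OOME, then one normal task,
    // and reports whether the handler survived to run the second task.
    static boolean survivesSimulatedOome() {
        Thread handler = startHandler();
        CountDownLatch secondTaskRan = new CountDownLatch(1);
        pending.add(() -> { throw new OutOfMemoryError("simulated"); });
        pending.add(secondTaskRan::countDown);
        try {
            return secondTaskRan.await(5, TimeUnit.SECONDS) && handler.isAlive();
        } catch (InterruptedException ie) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            handler.interrupt();
        }
    }

    public static void main(String[] args) {
        System.out.println("handler survived: " + survivesSimulatedOome());
        System.out.println("swallowed OOMEs: " + swallowedOomes.get());
    }
}
```

Without the catch for OutOfMemoryError the first task would terminate the handler thread and the second task would never run, which mirrors the test failure described above. Whether swallowing the error is acceptable for the real Reference Handler is exactly the open question in this thread; the sketch does not settle it.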