OOMEInReferenceHandler.java failing intermittently

David Holmes Mon, 08 Jul 2013 01:46:15 -0700

On 8/07/2013 5:49 PM, Peter Levart wrote:

On 07/08/2013 09:19 AM, David Holmes wrote:

In testing it though I simply exposed the remaining flaws in the
ReferenceHandler code. We can still kill the ReferenceHandler thread
with an OOME when it tries to load the Cleaner class (running with a
5M heap triggers this nicely if you use G1):


               // Fast path for cleaners
                if (r instanceof Cleaner) {
                    ((Cleaner)r).clean();
                    continue;
                }

and if that passes the clean() might throw OOME (it internally tries
to do a System.exit if an exception occurs but will likely encounter
another OOME trying to create the PrivilegedAction).

Even the:

               ReferenceQueue q = r.queue;
                if (q != ReferenceQueue.NULL) q.enqueue(r);

might throw OOME because enqueue() might have to load the
FinalReference class.

So really catching the OOME around the wait() only patches a small
hole. We can't simply put a try/catch in the for(;;) loop because that
doesn't address the problem that if the class loading throws OOME then
subsequent attempts to load that class will also fail. We would have
to preload all possible classes. Even then we might just send the
ReferenceHandler thread into a busy loop of throwing OOME catching it
and retrying!

So I can fix the test to deal with the Xcomp issue but we may still
see intermittent failures, and the ReferenceHandler thread may still
die from OOME.


Hi David,

I think the test is fine now.

Regarding the resilience of ReferenceHandler thread, I think we should
try to preload FinalReference, Cleaner and InterruptedException in the
initialization phase of ReferenceHandler thread. There should be no
danger of OOME in related cases then.


We would need more than that as we need to ensure there is zero
classloading needed at any point where OOME might be thrown. That
would include PrivilegedAction and any transitive closure therefrom.


Hi David,

I currently don't see any other possibility for another class to be
loaded in the ReferenceHandler's run() method, apart from those that you
indicated and what gets loaded in the the Cleaner.thunk's run() method.

Under "normal" conditions all the classes should already be loaded onceVM initialization is complete. It is only with very small heaps that wecan get the problem during VM initialization itself ... in which casemaybe we don't care? Or perhaps that too should be a fatal error? (If aCleaner has been registered then Cleaner must already be loaded.)Anything loaded via the run() method will be handled the enclosing codein Cleaner.

As far as Cleaner's exception
handler is concerned, I think it is not universally wise to just exit
the VM when any exception is thrown by the Cleaner's thunk.run() method.
What if that exception is OOME? That does not mean there's something
wrong with Cleaner.thunk's code. Not only will this kill
ReferenceHandler thread, but entire VM.


Yes "fatal errors" are not uncommon in the JDK, or at least not
completely rare. The general question is whether continuing might do
more harm than good given that some "critical" action has failed. Hard
to know in general so aborting is fairly common response (ala hotspot
running out of C heap memory).


Ok then. if we should stick with terminating the VM, what about the
following "allocation-conservative" Cleaner.clean() method:


Adding a try/finally in the PrivilegedAction.run() would seem to suffice.

David
-----


     private static final class TerminateAction implements
PrivilegedAction<Void> {
         final Error error = new Error("Cleaner terminated abnormally");
         @Override
         public Void run() {
             try {
                 if (System.err != null) error.printStackTrace();
             } finally {
                 System.exit(1);
             }
             return null;
         }
     }

     // pre-allocate single instance of TerminateAction
     private static final TerminateAction terminateAction = new
TerminateAction();

     /**
      * Runs this cleaner, if it has not been run before.
      */
     public void clean() {
         if (!remove(this))
             return;
         try {
             thunk.run();
         } catch (final Throwable x) {
             synchronized (terminateAction) {
                 if (terminateAction.error.getCause() == null)
                     terminateAction.error.initCause(x);
             }
             AccessController.doPrivileged(terminateAction);
         }
     }


Regards, Peter


Thanks,
David
-----

If the purpose of exiting VM was attracting attention to the possible
bug in Cleaner.thunk's code, then this absolutely works, but wouldn't
simple message to System.err be enough? Like for example:


     public void clean() {
         if (!remove(this))
             return;
         try {
             thunk.run();
         } catch (final Throwable x) {
             try {
                 new Error("Cleaner caught exception", x)
                     .printStackTrace();
             } catch (OutOfMemoryError oome) {
                 // can't do much
             }
         }
     }


Regards, Peter


David
-----


Regards, Peter

On 07/02/2013 06:38 AM, David Holmes wrote:

This recently added test was found to fail under some conditions -
namely client compiler with -Xcomp. It seems that the use of all
local
variables enabled the compiler to optimize things in a way that
stopped the weakref from being enqueued as expected. Simple fix
was to
make the weakref a field.

http://cr.openjdk.java.net/~dholmes/8016341/webrev/

Thanks,
David

Re: RFR: 8016341 java/lang/ref/OOMEInReferenceHandler.java failing intermittently

Reply via email to