Re: Analysis on JDK-8022321 java/lang/ref/OOMEInReferenceHandler.java fails intermittently




On 03/23/2016 09:40 PM, Kim Barrett wrote:

On Mar 23, 2016, at 3:33 PM, Peter Levart  wrote:

Hi Kim,

On 03/23/2016 07:55 PM, Kim Barrett wrote:

On Mar 23, 2016, at 10:02 AM, Peter Levart wrote:
...so I checked what would be needed if there were such a
getPendingReferences() native method. It turns out that a single native method 
would not be enough to support the precise direct ByteBuffer allocation. Here's 
a refactored webrev that introduces a getPendingReferences() method which could 
be turned into a native equivalent one day. There is another native method 
needed - int awaitEnqueuePhaseStart():


http://cr.openjdk.java.net/~plevart/jdk9-dev/removeInternalCleaner/webrev.09.part2/

I don't think the Reference.awaitEnqueuePhaseStart thing is needed.

Rather, I think the Direct-X-Buffer allocation should conspire with
the Direct-X-Buffer cleanups directly to manage that sort of
thing, and not add anything to Reference and the reference processing
thread.  E.g. the phase and signal/wait are purely part of
Direct-X-Buffer.  (I also think something like that could/should have
been done instead of providing Direct-X-Buffer with access to
Reference.tryHandlePending, but that's likely water under the bridge
now.)

Something very roughly like this:

allocating thread, after allocation failed

bool waitForCleanups() {
   int epoch = DXB.getCleanupCounter();
   long start = startTime();
   long timeout = calcTimeout(start);
   synchronized (DXB.getCleanupMonitor()) {
     while (epoch == DXB.getCleanupCounter()) {
       wait(timeout);
       timeout = calcTimeout(start);
       if (timeout <= 0) break;
     }
     return epoch != DXB.getCleanupCounter();
   }
}

cleanup function, after freeing memory

   synchronized (DXB.getCleanupMonitor()) {
     DXB.incCleanupCounter();
     DXB.getCleanupMonitor().notify_all();
   }

Actually, epoch should probably have been obtained *before* the failed
allocation attempt, and should be an argument to waitForCleanups.

That's all quite sketchy, but I need to do other things today.

Peter, care to try filling this in?



There's no need to maintain a special cleanup counter, as java.nio.Bits already tracks the amount
of currently allocated direct memory (in bytes). What your suggestion leads to is similar to one of
the previous versions of java.nio.Bits, which waited for some 'timeout' after invoking System.gc()
and then retried the reservation, failing if it didn't succeed. The problem with such an
"asynchronous" approach is that there's no right value of 'timeout' for all situations.
If you wait too short a time, you might get an OOME although there are plenty of unreachable but still
uncleaned direct buffers. If you wait too long, your throughput will suffer. There has to be
some "feedback" from reference processing to know when it's still beneficial to wait
and when there's no point in waiting any more.

Regards, Peter

I don't think there's any throughput penalty for a long timeout.  The
proper response to waitForCleanups returning false (assuming the epoch
was obtained early and passed as an argument) is OOME.  I really doubt
the latency for reporting OOME is of critical importance.

That is, the caller looks something like (not even pretending to write
Java)

   alloc = tryAllocation(allocSize)
   if alloc != NULL
     return alloc
   endif
   // Maybe add a retry+wait with a short timeout here,
   // to allow existing cleanups to run before requesting
   // another gc.  Not clear that's really worthwhile, as
   // it only comes up when we get here just after a gc
   // and the resulting cleanups are not yet all processed.
   System.gc()
   while true
     epoch = getEpoch()
     alloc = tryAllocation(allocSize)
     if alloc != NULL
       return alloc
     elif !waitForCleanups(epoch)
       throw OOME  // No cleanup progress for a while
     endif
   end
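
For illustration, the epoch handshake and caller loop above could be written in Java roughly as follows. This is only a sketch under assumptions: the class CleanupEpoch, its methods, and the LongPredicate standing in for the real reservation attempt are hypothetical names, not part of java.nio.Bits or of any webrev in this thread. The epoch snapshot is taken before each allocation attempt and passed to the wait, as suggested.

```java
import java.util.function.LongPredicate;

/** Hypothetical helper; not part of java.nio or the webrevs discussed here. */
final class CleanupEpoch {
    private final Object monitor = new Object();
    private int epoch; // guarded by monitor; bumped after each cleanup

    /** Snapshot the allocating thread takes before it tries to allocate. */
    int current() {
        synchronized (monitor) {
            return epoch;
        }
    }

    /** Called by the cleanup code after it has freed memory. */
    void cleanupDone() {
        synchronized (monitor) {
            epoch++;
            monitor.notifyAll();
        }
    }

    /** Returns true if a cleanup ran since seenEpoch, false on timeout or interrupt. */
    boolean awaitCleanup(int seenEpoch, long timeoutMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        synchronized (monitor) {
            while (epoch == seenEpoch) {
                long remainingMillis = (deadline - System.nanoTime()) / 1_000_000L;
                if (remainingMillis <= 0) {
                    break; // never call wait(0): that would block without a timeout
                }
                try {
                    monitor.wait(remainingMillis);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    break;
                }
            }
            return epoch != seenEpoch;
        }
    }

    /** The caller loop from the mail; false means the caller should throw OOME. */
    static boolean reserve(LongPredicate tryReserve, long size,
                           CleanupEpoch epochs, long timeoutMillis) {
        if (tryReserve.test(size)) {
            return true;
        }
        System.gc(); // request reference discovery once
        while (true) {
            int seen = epochs.current(); // snapshot before the attempt
            if (tryReserve.test(size)) {
                return true;
            }
            if (!epochs.awaitCleanup(seen, timeoutMillis)) {
                return false; // no cleanup progress for a whole timeout
            }
        }
    }
}
```

Because each successful wait is followed by a fresh snapshot and a fresh timeout, the loop gives up only after a full timeout passes without any cleanup.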



Right, this is easier to understand. I already figured out what you 
wanted to say the 1st time. I'll try to prepare a prototype along this 
idea tomorrow.


Regards, Peter

Re: Analysis on JDK-8022321 java/lang/ref/OOMEInReferenceHandler.java fails intermittently


Hi Kim,

Thinking more about your approach. Basically your idea is to detect that 
there are no more unprocessed (pending or enqueued) Cleanables by 
timing out while waiting for the next Cleanable to be processed. In that 
case the timeout should be reset each time a Cleanable is seen to be 
processed, so that only when "silence" has lasted for at least the 
whole timeout period can we claim with enough probability that no 
unprocessed Cleanables remain, either pending or enqueued, and that 
we can give up with OOME.
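
A minimal Java sketch of this "silence" detection follows, assuming the cleanup code can call a notification hook after each processed Cleanable. The class CleanupSilenceDetector and its methods are hypothetical names used only for illustration.

```java
/** Hypothetical sketch: report silence only after a full quiet period without cleanups. */
final class CleanupSilenceDetector {
    private final Object lock = new Object();
    private long processed; // guarded by lock

    /** Called after each Cleanable has been processed. */
    void cleanableProcessed() {
        synchronized (lock) {
            processed++;
            lock.notifyAll();
        }
    }

    /**
     * Blocks until no Cleanable has been processed for quietMillis in a row
     * (or the thread is interrupted). Returns the total number processed so far.
     */
    long awaitSilence(long quietMillis) {
        long quietNanos = quietMillis * 1_000_000L;
        synchronized (lock) {
            long lastSeen = processed;
            long deadline = System.nanoTime() + quietNanos;
            while (true) {
                if (processed != lastSeen) {
                    lastSeen = processed;
                    deadline = System.nanoTime() + quietNanos; // progress: restart the clock
                }
                long remainingMillis = (deadline - System.nanoTime()) / 1_000_000L;
                if (remainingMillis <= 0) {
                    return processed;
                }
                try {
                    lock.wait(remainingMillis);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return processed;
                }
            }
        }
    }
}
```

An allocating thread would retry its reservation after each observed cleanup and throw OOME only once awaitSilence returns without further progress. The choice of quiet period remains a heuristic.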


Let me try to see with a prototype if this approach leads to success...

Regards, Peter

Re: Analysis on JDK-8022321 java/lang/ref/OOMEInReferenceHandler.java fails intermittently

2016-03-23 Thread Kim Barrett

> On Mar 23, 2016, at 3:33 PM, Peter Levart  wrote:
> 
> Hi Kim,
> 
> On 03/23/2016 07:55 PM, Kim Barrett wrote:
>>> On Mar 23, 2016, at 10:02 AM, Peter Levart wrote:
>>> ...so I checked what would be needed if there were such a 
>>> getPendingReferences() native method. It turns out that a single native 
>>> method would not be enough to support the precise direct ByteBuffer 
>>> allocation. Here's a refactored webrev that introduces a 
>>> getPendingReferences() method which could be turned into a native 
>>> equivalent one day. There is another native method needed - int 
>>> awaitEnqueuePhaseStart():
>>> 
>>> 
>>> http://cr.openjdk.java.net/~plevart/jdk9-dev/removeInternalCleaner/webrev.09.part2/
>> I don't think the Reference.awaitEnqueuePhaseStart thing is needed.
>> 
>> Rather, I think the Direct-X-Buffer allocation should conspire with
>> the Direct-X-Buffer cleanups directly to manage that sort of
>> thing, and not add anything to Reference and the reference processing
>> thread.  E.g. the phase and signal/wait are purely part of
>> Direct-X-Buffer.  (I also think something like that could/should have
>> been done instead of providing Direct-X-Buffer with access to
>> Reference.tryHandlePending, but that's likely water under the bridge
>> now.)
>> 
>> Something very roughly like this:
>> 
>> allocating thread, after allocation failed
>> 
>> bool waitForCleanups() {
>>   int epoch = DXB.getCleanupCounter();
>>   long start = startTime();
>>   long timeout = calcTimeout(start);
>>   synchronized (DXB.getCleanupMonitor()) {
>>     while (epoch == DXB.getCleanupCounter()) {
>>       wait(timeout);
>>       timeout = calcTimeout(start);
>>       if (timeout <= 0) break;
>>     }
>>     return epoch != DXB.getCleanupCounter();
>>   }
>> }
>> 
>> cleanup function, after freeing memory
>> 
>>   synchronized (DXB.getCleanupMonitor()) {
>>     DXB.incCleanupCounter();
>>     DXB.getCleanupMonitor().notify_all();
>>   }
>> 
>> Actually, epoch should probably have been obtained *before* the failed
>> allocation attempt, and should be an argument to waitForCleanups.
>> 
>> That's all quite sketchy, but I need to do other things today.
>> 
>> Peter, care to try filling this in?
>> 
>> 
> 
> There's no need to maintain a special cleanup counter, as java.nio.Bits 
> already tracks the amount of currently allocated direct memory (in bytes). 
> What your suggestion leads to is similar to one of the previous versions of 
> java.nio.Bits, which waited for some 'timeout' after invoking System.gc() 
> and then retried the reservation, failing if it didn't succeed. The problem 
> with such an "asynchronous" approach is that there's no right value of 
> 'timeout' for all situations. If you wait too short a time, you might get an 
> OOME although there are plenty of unreachable but still uncleaned direct 
> buffers. If you wait too long, your throughput will suffer. There has to be 
> some "feedback" from reference processing to know when it's still beneficial 
> to wait and when there's no point in waiting any more.
> 
> Regards, Peter

I don't think there's any throughput penalty for a long timeout.  The
proper response to waitForCleanups returning false (assuming the epoch
was obtained early and passed as an argument) is OOME.  I really doubt
the latency for reporting OOME is of critical importance.

That is, the caller looks something like (not even pretending to write
Java) 

  alloc = tryAllocation(allocSize)
  if alloc != NULL
    return alloc
  endif
  // Maybe add a retry+wait with a short timeout here,
  // to allow existing cleanups to run before requesting
  // another gc.  Not clear that's really worthwhile, as
  // it only comes up when we get here just after a gc
  // and the resulting cleanups are not yet all processed.
  System.gc()
  while true
    epoch = getEpoch()
    alloc = tryAllocation(allocSize)
    if alloc != NULL
      return alloc
    elif !waitForCleanups(epoch)
      throw OOME  // No cleanup progress for a while
    endif
  end

Re: Analysis on JDK-8022321 java/lang/ref/OOMEInReferenceHandler.java fails intermittently


Hi Kim,

On 03/23/2016 07:55 PM, Kim Barrett wrote:

On Mar 23, 2016, at 10:02 AM, Peter Levart  wrote:
...so I checked what would be needed if there were such a 
getPendingReferences() native method. It turns out that a single native method 
would not be enough to support the precise direct ByteBuffer allocation. Here's 
a refactored webrev that introduces a getPendingReferences() method which could 
be turned into a native equivalent one day. There is another native method 
needed - int awaitEnqueuePhaseStart():

http://cr.openjdk.java.net/~plevart/jdk9-dev/removeInternalCleaner/webrev.09.part2/

I don't think the Reference.awaitEnqueuePhaseStart thing is needed.

Rather, I think the Direct-X-Buffer allocation should conspire with
the Direct-X-Buffer cleanups directly to manage that sort of
thing, and not add anything to Reference and the reference processing
thread.  E.g. the phase and signal/wait are purely part of
Direct-X-Buffer.  (I also think something like that could/should have
been done instead of providing Direct-X-Buffer with access to
Reference.tryHandlePending, but that's likely water under the bridge
now.)

Something very roughly like this:

allocating thread, after allocation failed

bool waitForCleanups() {
   int epoch = DXB.getCleanupCounter();
   long start = startTime();
   long timeout = calcTimeout(start);
   synchronized (DXB.getCleanupMonitor()) {
     while (epoch == DXB.getCleanupCounter()) {
       wait(timeout);
       timeout = calcTimeout(start);
       if (timeout <= 0) break;
     }
     return epoch != DXB.getCleanupCounter();
   }
}

cleanup function, after freeing memory

   synchronized (DXB.getCleanupMonitor()) {
     DXB.incCleanupCounter();
     DXB.getCleanupMonitor().notify_all();
   }

Actually, epoch should probably have been obtained *before* the failed
allocation attempt, and should be an argument to waitForCleanups.

That's all quite sketchy, but I need to do other things today.

Peter, care to try filling this in?



There's no need to maintain a special cleanup counter, as java.nio.Bits 
already tracks the amount of currently allocated direct memory (in 
bytes). What your suggestion leads to is similar to one of the previous 
versions of java.nio.Bits, which waited for some 'timeout' after 
invoking System.gc() and then retried the reservation, failing if it 
didn't succeed. The problem with such an "asynchronous" approach is that 
there's no right value of 'timeout' for all situations. If you wait too 
short a time, you might get an OOME although there are plenty of 
unreachable but still uncleaned direct buffers. If you wait too long, 
your throughput will suffer. There has to be some "feedback" from 
reference processing to know when it's still beneficial to wait and 
when there's no point in waiting any more.


Regards, Peter

Re: Analysis on JDK-8022321 java/lang/ref/OOMEInReferenceHandler.java fails intermittently

2016-03-23 Thread Kim Barrett

> On Mar 23, 2016, at 10:02 AM, Peter Levart  wrote:
> ...so I checked what would be needed if there were such a 
> getPendingReferences() native method. It turns out that a single native 
> method would not be enough to support the precise direct ByteBuffer 
> allocation. Here's a refactored webrev that introduces a 
> getPendingReferences() method which could be turned into a native equivalent 
> one day. There is another native method needed - int awaitEnqueuePhaseStart():
> 
> http://cr.openjdk.java.net/~plevart/jdk9-dev/removeInternalCleaner/webrev.09.part2/

I don't think the Reference.awaitEnqueuePhaseStart thing is needed.

Rather, I think the Direct-X-Buffer allocation should conspire with
the Direct-X-Buffer cleanups directly to manage that sort of
thing, and not add anything to Reference and the reference processing
thread.  E.g. the phase and signal/wait are purely part of
Direct-X-Buffer.  (I also think something like that could/should have
been done instead of providing Direct-X-Buffer with access to
Reference.tryHandlePending, but that's likely water under the bridge
now.)

Something very roughly like this:

allocating thread, after allocation failed

bool waitForCleanups() {
  int epoch = DXB.getCleanupCounter();
  long start = startTime();
  long timeout = calcTimeout(start);
  synchronized (DXB.getCleanupMonitor()) {
    while (epoch == DXB.getCleanupCounter()) {
      wait(timeout);
      timeout = calcTimeout(start);
      if (timeout <= 0) break;
    }
    return epoch != DXB.getCleanupCounter();
  }
}

cleanup function, after freeing memory

  synchronized (DXB.getCleanupMonitor()) {
    DXB.incCleanupCounter();
    DXB.getCleanupMonitor().notify_all();
  }

Actually, epoch should probably have been obtained *before* the failed
allocation attempt, and should be an argument to waitForCleanups.

That's all quite sketchy, but I need to do other things today.

Peter, care to try filling this in?

Re: Analysis on JDK-8022321 java/lang/ref/OOMEInReferenceHandler.java fails intermittently

2016-03-23 Thread Per Liden


Hi Peter,

On 2016-03-23 15:02, Peter Levart wrote:

Hi Per, Kim,

On 03/22/2016 10:24 AM, Per Liden wrote:

So, I imagine the ReferenceHandler could do something like this:

while (true) {
    // getPendingReferences() is a downcall to the VM which
    // blocks until the pending list becomes non-empty and
    // returns the whole list, transferring it from VM-land
    // to Java-land in a safe and robust way.
    Reference pending = getPendingReferences();

    // Enqueue the references
    while (pending != null) {
        Reference r = pending;
        pending = r.discovered;
        r.discovered = null;
        ReferenceQueue q = r.queue;
        if (q != ReferenceQueue.NULL) {
            q.enqueue(r);
        }
    }
}


...so I checked what would be needed if there were such a
getPendingReferences() native method. It turns out that a single native
method would not be enough to support the precise direct ByteBuffer
allocation. Here's a refactored webrev that introduces a
getPendingReferences() method which could be turned into a native
equivalent one day. There is another native method needed - int
awaitEnqueuePhaseStart():

http://cr.openjdk.java.net/~plevart/jdk9-dev/removeInternalCleaner/webrev.09.part2/


The need for this additional method arises when one wants to combine
reference discovery with enqueueing of discovered references into one
synchronous operation (discoverAndEnqueueReferences()). A direct
ByteBuffer allocating thread wants to trigger reference discovery
(System.gc()) and wait for discovered references to be enqueued before
continuing with direct memory reservation retries. An alternative to
what I have done in above webrev would be a maintenance of a single
enqueuePhase counter on the Java side with usage roughly as:

discoverAndEnqueueReferences() {
    int phase = Reference.getEnqueuePhase();
    System.gc();
    Reference.awaitEnqueuePhaseGreaterThan(phase);
}

But in that case, System.gc() would have to guarantee that after
discovery of no new references, blocked getPendingReferences() would
still return with an empty list of References (null) just to keep the
DBB allocating thread alive. I have tried to do this variant and
unfortunately it can't be reliably performed with current protocol as
getPendingReferences() can only be programmed to return non-empty
Reference lists without ambiguity. I created a DirectBufferAllocOOMETest
to exercise situations where no new Reference(s) are discovered in a GC
round.
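
To make the single-counter variant concrete, here is a small Java model of it. It is only a sketch under assumptions: the class EnqueuePhaseModel, the gcRounds queue standing in for the VM, and the timeout parameter are all hypothetical and do not come from the webrev. It models option (a), where a GC round that discovers nothing still produces an (empty) list, so the phase advances and the waiting thread is released.

```java
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

/** Hypothetical model of the single-counter variant; not the webrev's code. */
final class EnqueuePhaseModel {
    private final Object lock = new Object();
    private int phase;    // guarded by lock
    private int enqueued; // guarded by lock; stands in for real enqueueing

    /** Stand-in for the VM: one element per GC round, possibly an empty list. */
    final BlockingQueue<List<String>> gcRounds = new LinkedBlockingQueue<>();

    int getEnqueuePhase() {
        synchronized (lock) {
            return phase;
        }
    }

    int enqueuedCount() {
        synchronized (lock) {
            return enqueued;
        }
    }

    /** One handler iteration: take a round's references, "enqueue" them, advance the phase. */
    void handleOneRound() {
        List<String> refs;
        try {
            refs = gcRounds.take(); // blocks until a round is reported
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return;
        }
        synchronized (lock) {
            enqueued += refs.size();
            phase++; // advances even when refs is empty (option a)
            lock.notifyAll();
        }
    }

    /** Returns true once the phase has moved past seenPhase, false on timeout or interrupt. */
    boolean awaitEnqueuePhaseGreaterThan(int seenPhase, long timeoutMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        synchronized (lock) {
            while (phase - seenPhase <= 0) { // subtraction tolerates counter wrap-around
                long remainingMillis = (deadline - System.nanoTime()) / 1_000_000L;
                if (remainingMillis <= 0) {
                    break;
                }
                try {
                    lock.wait(remainingMillis);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    break;
                }
            }
            return phase - seenPhase > 0;
        }
    }
}
```

If empty rounds were never reported, as in option (b), handleOneRound would stay blocked after a GC that finds nothing and the waiter would have no signal to stop on, which is the ambiguity described above.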

So what do you think - what would be easier to support:
a) getPendingReferences() returns empty Reference list (null) after a GC
round that discovers no new pending references
b) getPendingReferences() returns when new Reference(s) are discovered
and there is an additional int awaitEnqueuePhaseStart() as defined in
above webrev.


I've prototyped the VM side. I've ignored the "await" issue for now as I 
first just wanted the basic structure up. I'm running out of time for 
today (and I'll be away the rest of the week) but let's continue the 
discussion next week and figure out the "await" details/alternatives.


Webrevs for jdk9/hs-rt:

http://cr.openjdk.java.net/~pliden/reference_pending_list/webrev.0-jdk
http://cr.openjdk.java.net/~pliden/reference_pending_list/webrev.0-hotspot

It passes jdk/test/java/lang/ref/* and our VM tests for reference 
processing.


cheers,
Per

Re: Analysis on JDK-8022321 java/lang/ref/OOMEInReferenceHandler.java fails intermittently


Hi Per, Kim,

On 03/22/2016 10:24 AM, Per Liden wrote:

So, I imagine the ReferenceHandler could do something like this:

while (true) {
    // getPendingReferences() is a downcall to the VM which
    // blocks until the pending list becomes non-empty and
    // returns the whole list, transferring it from VM-land
    // to Java-land in a safe and robust way.
    Reference pending = getPendingReferences();

    // Enqueue the references
    while (pending != null) {
        Reference r = pending;
        pending = r.discovered;
        r.discovered = null;
        ReferenceQueue q = r.queue;
        if (q != ReferenceQueue.NULL) {
            q.enqueue(r);
        }
    }
}


...so I checked what would be needed if there were such a 
getPendingReferences() native method. It turns out that a single native 
method would not be enough to support the precise direct ByteBuffer 
allocation. Here's a refactored webrev that introduces a 
getPendingReferences() method which could be turned into a native 
equivalent one day. There is another native method needed - int 
awaitEnqueuePhaseStart():


http://cr.openjdk.java.net/~plevart/jdk9-dev/removeInternalCleaner/webrev.09.part2/

The need for this additional method arises when one wants to combine 
reference discovery with enqueueing of discovered references into one 
synchronous operation (discoverAndEnqueueReferences()). A direct 
ByteBuffer allocating thread wants to trigger reference discovery 
(System.gc()) and wait for discovered references to be enqueued before 
continuing with direct memory reservation retries. An alternative to 
what I have done in above webrev would be a maintenance of a single 
enqueuePhase counter on the Java side with usage roughly as:


discoverAndEnqueueReferences() {
    int phase = Reference.getEnqueuePhase();
    System.gc();
    Reference.awaitEnqueuePhaseGreaterThan(phase);
}

But in that case, System.gc() would have to guarantee that after 
discovery of no new references, blocked getPendingReferences() would 
still return with an empty list of References (null) just to keep the 
DBB allocating thread alive. I have tried to do this variant and 
unfortunately it can't be reliably performed with current protocol as 
getPendingReferences() can only be programmed to return non-empty 
Reference lists without ambiguity. I created a DirectBufferAllocOOMETest 
to exercise situations where no new Reference(s) are discovered in a GC 
round.


So what do you think - what would be easier to support:
a) getPendingReferences() returns empty Reference list (null) after a GC 
round that discovers no new pending references
b) getPendingReferences() returns when new Reference(s) are discovered 
and there is an additional int awaitEnqueuePhaseStart() as defined in 
above webrev.


Regards, Peter

Re: Analysis on JDK-8022321 java/lang/ref/OOMEInReferenceHandler.java fails intermittently

2016-03-23 Thread Per Liden


Hi,

On 2016-03-23 08:13, Peter Levart wrote:



On 03/22/2016 10:28 PM, Kim Barrett wrote:

On Mar 22, 2016, at 5:24 AM, Per Liden  wrote:
One thing I like about this approach is that it's only the
ReferenceHandler thread that pops elements off the pending list
and enqueues them. That simplifies things a lot.

I like that too.  And hopefully we really can get rid of
sun.misc.Cleaner (under whatever name).


From a GC perspective I would however like to get away from the
shared pending list and the pending list lock entirely and instead
provide a VM downcall to get the pending list. The goal would of
course be to have a more robust way of transferring the pending list
to Java land, instead of today's secret handshake which is easy to
get wrong. Also, not requiring the pending list lock (which is a Java
monitor) to be held during a GC would also simplify things a lot on
the GC side. E.g. the ReferencePendingListLockerThread could be
removed completely.

I’ve been thinking along the same lines.  I think having the pending
list (and associated locking and notification) in Java is just making
life difficult for ourselves, and that things could be much simpler if
that whole protocol was owned by the VM.

Once the reference handler thread has obtained the latest list, if it
then wants to publish that list for other Java threads to help
process, that’s a policy choice that can be explored on the Java side,
with no impact on the VM (including the GC).



If the only blocking/waiting of ReferenceHandler thread was performed by
native code, could it simply ignore Java thread interrupts? If this is
possible, then the problems of InterruptedException allocation and
consequent OutOfMemoryError(s) just disappear.


Yes, blocking in the VM here would ignore thread interrupts and not 
throw InterruptedException.


cheers,
Per

Re: Analysis on JDK-8022321 java/lang/ref/OOMEInReferenceHandler.java fails intermittently




On 03/22/2016 10:28 PM, Kim Barrett wrote:

On Mar 22, 2016, at 5:24 AM, Per Liden  wrote:
One thing I like about this approach is that it's only the ReferenceHandler 
thread that pops elements off the pending list and enqueues them. That 
simplifies things a lot.

I like that too.  And hopefully we really can get rid of sun.misc.Cleaner 
(under whatever name).


From a GC perspective I would however like to get away from the shared pending 
list and the pending list lock entirely and instead provide a VM downcall to 
get the pending list. The goal would of course be to have a more robust way of 
transferring the pending list to Java land, instead of today's secret handshake 
which is easy to get wrong. Also, not requiring the pending list lock (which is 
a Java monitor) to be held during a GC would also simplify things a lot on the 
GC side. E.g. the ReferencePendingListLockerThread could be removed completely.

I’ve been thinking along the same lines.  I think having the pending list (and 
associated locking and notification) in Java is just making life difficult for 
ourselves, and that things could be much simpler if that whole protocol was 
owned by the VM.

Once the reference handler thread has obtained the latest list, if it then 
wants to publish that list for other Java threads to help process, that’s a 
policy choice that can be explored on the Java side, with no impact on the VM 
(including the GC).



If the only blocking/waiting of ReferenceHandler thread was performed by 
native code, could it simply ignore Java thread interrupts? If this is 
possible, then the problems of InterruptedException allocation and 
consequent OutOfMemoryError(s) just disappear.


Regards, Peter

Re: Analysis on JDK-8022321 java/lang/ref/OOMEInReferenceHandler.java fails intermittently

2016-03-22 Thread Kim Barrett

> On Mar 22, 2016, at 5:24 AM, Per Liden  wrote:
> One thing I like about this approach is that it's only the ReferenceHandler 
> thread that pops elements off the pending list and enqueues them. That 
> simplifies things a lot.

I like that too.  And hopefully we really can get rid of sun.misc.Cleaner 
(under whatever name).

> From a GC perspective I would however like to get away from the shared 
> pending list and the pending list lock entirely and instead provide a VM 
> downcall to get the pending list. The goal would of course be to have a more 
> robust way of transferring the pending list to Java land, instead of today's 
> secret handshake which is easy to get wrong. Also, not requiring the pending 
> list lock (which is a Java monitor) to be held during a GC would also 
> simplify things a lot on the GC side. E.g. the 
> ReferencePendingListLockerThread could be removed completely.

I’ve been thinking along the same lines.  I think having the pending list (and 
associated locking and notification) in Java is just making life difficult for 
ourselves, and that things could be much simpler if that whole protocol was 
owned by the VM.

Once the reference handler thread has obtained the latest list, if it then 
wants to publish that list for other Java threads to help process, that’s a 
policy choice that can be explored on the Java side, with no impact on the VM 
(including the GC).

Re: Analysis on JDK-8022321 java/lang/ref/OOMEInReferenceHandler.java fails intermittently

2016-03-22 Thread Per Liden


Hi Peter,

On 2016-03-21 16:30, Peter Levart wrote:

Hi Per,

May I point you to my proposed change in Reference(Handler) for JDK 9,
being discussed in the thread about JDK-8149925. It will hopefully
remove the special-casing of sun.misc.Cleaner, change the way how
pending references are being enqueued by ReferenceHandler thread and how
other thread(s) can synchronize with it. Since you seem to have a great
knowledge of VM part of things, I would very much like to hear what you
think of that change. Here's the latest webrev:

http://cr.openjdk.java.net/~plevart/jdk9-dev/removeInternalCleaner/webrev.08.part2/


(see Reference.java and Bits.java for an example of how this
synchronization with ReferenceHandler thread is to be used)


One thing I like about this approach is that it's only the 
ReferenceHandler thread that pops elements off the pending list and 
enqueues them. That simplifies things a lot.


From a GC perspective I would however like to get away from the shared 
pending list and the pending list lock entirely and instead provide a VM 
downcall to get the pending list. The goal would of course be to have a 
more robust way of transferring the pending list to Java land, instead 
of today's secret handshake which is easy to get wrong. Also, not 
requiring the pending list lock (which is a Java monitor) to be held 
during a GC would also simplify things a lot on the GC side. E.g. the 
ReferencePendingListLockerThread could be removed completely.


So, I imagine the ReferenceHandler could do something like this:

while (true) {
    // getPendingReferences() is a downcall to the VM which
    // blocks until the pending list becomes non-empty and
    // returns the whole list, transferring it from VM-land
    // to Java-land in a safe and robust way.
    Reference pending = getPendingReferences();

    // Enqueue the references
    while (pending != null) {
        Reference r = pending;
        pending = r.discovered;
        r.discovered = null;
        ReferenceQueue q = r.queue;
        if (q != ReferenceQueue.NULL) {
            q.enqueue(r);
        }
    }
}

I haven't thought through the details when it comes to having additional 
Java threads helping out with Cleaners. The ReferenceHandler would be 
free to use whatever lists/locks it wants to handle this and the GC 
wouldn't know anything about it. But, with the above approach at least 
the interface between the ReferenceHandler and the VM would be pretty 
clear and hard(er) to misuse.
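
The inner enqueue loop can be tried outside the JDK with a small stand-in model. The sketch below is an illustration under assumptions: PendingListDrain and its nested Ref class are hypothetical, Ref only mimics the discovered and queue fields named in the loop above, and a null queue plays the role of ReferenceQueue.NULL. The blocking getPendingReferences() downcall is not modelled.

```java
import java.util.Queue;

/** Hypothetical stand-in model of draining the pending list; not JDK code. */
final class PendingListDrain {

    /** Minimal imitation of a Reference with the two fields the loop touches. */
    static final class Ref {
        final String name;
        final Queue<Ref> queue; // null plays the role of ReferenceQueue.NULL
        Ref discovered;         // link to the next pending reference

        Ref(String name, Queue<Ref> queue, Ref discovered) {
            this.name = name;
            this.queue = queue;
            this.discovered = discovered;
        }
    }

    /** Unlinks every element of the pending list and enqueues it; returns how many were seen. */
    static int drain(Ref pending) {
        int count = 0;
        while (pending != null) {
            Ref r = pending;
            pending = r.discovered;
            r.discovered = null; // detach so the list no longer keeps r's successors reachable
            if (r.queue != null) {
                r.queue.add(r);
            }
            count++;
        }
        return count;
    }
}
```

Since only the thread that received the list walks it, no lock is needed while unlinking, which is the simplification being argued for.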


cheers,
Per



Regards, Peter

On 03/21/2016 04:13 PM, Peter Levart wrote:

Hi Per, David,

As things stand, there is a very good chance that sun.misc.Cleaner
will go away in JDK9, so all this speculation about the source of
OOME(s) can be put to rest. But for JDK 8u, I agree that this should
be sorted out.

My feeling is that (instanceof Cleaner) can not result in allocation
and therefore can not trigger OOME if the Cleaner class is already
loaded at that time. I think that we were chasing the wrong rabbit. As
I have found later, there is a much more probable cause for
ReferenceHandler thread dying with OOME after the fix to catch OOME
from lock.wait(). It is triggered by the invocation of Cleaner.clean()
later down in the code. I even created a reproducer for it. See my
last two comments of the following issue:

https://bugs.openjdk.java.net/browse/JDK-8066859

(but don't look at the proposed fix since it is not very good)


I think that for JDK 8u we could revert the code and do (instanceof
Cleaner) checks outside the synchronized block and in addition, find a
way to handle the OOME thrown from Cleaner.clean().

What do you think?


Regards, Peter


On 03/21/2016 02:41 PM, Per Liden wrote:

Hi David,

On 2016-03-21 13:49, David Holmes wrote:

Hi Per,

On 21/03/2016 10:20 PM, Per Liden wrote:

Hi Peter & David,

(Resurrecting an old thread here...)

On 2014-01-22 03:19, David Holmes wrote:

Hi Peter,

On 22/01/2014 12:00 AM, Peter Levart wrote:

Hi, David, Kalyan,

Summing up the discussion, I propose the following patch for
ReferenceHandler:

http://cr.openjdk.java.net/~plevart/jdk9-dev/OOMEInReferenceHandler/webrev.01/






I can live with it, though it may be that once Cleaner has been preloaded
instanceof can no longer throw OOME. Can't be 100% sure. And there's
some duplication/verbosity in the commentary that could be trimmed
down :)


While investigating a Reference pending list issue on the GC side of
things I looked at the ReferenceHandler thread and noticed something
which made me uneasy. The fix for JDK-8022321 added pre-loading of the
Cleaner class to avoid OOME, but also moved the "instanceof Cleaner"
inside the try/catch with a comment that it "sometimes" can throw an
OOME. I understand this was done because we're not 100% sure if an OOME
can still happen here, despite the pre-loading.

However, if it can throw an OOME that means it's allocating, which in
turn means it can provoke a GC. If that happens, it looks to me like we
have a bug

Re: Analysis on JDK-8022321 java/lang/ref/OOMEInReferenceHandler.java fails intermittently

2016-03-22 Thread Per Liden

On 2016-03-21 18:32, Kim Barrett wrote:

On Mar 21, 2016, at 8:20 AM, Per Liden wrote:

Hi Peter & David,

(Resurrecting an old thread here...)

On 2014-01-22 03:19, David Holmes wrote:

Hi Peter,

On 22/01/2014 12:00 AM, Peter Levart wrote:

Hi, David, Kalyan,

Summing up the discussion, I propose the following patch for
ReferenceHandler:

http://cr.openjdk.java.net/~plevart/jdk9-dev/OOMEInReferenceHandler/webrev.01/

I can live with it, though it may be that once Cleaner has been preloaded
instanceof can no longer throw OOME. Can't be 100% sure. And there's
some duplication/verbosity in the commentary that could be trimmed down :)

While investigating a Reference pending list issue on the GC side of things I looked at the
ReferenceHandler thread and noticed something which made me uneasy. The fix for JDK-8022321 added
pre-loading of the Cleaner class to avoid OOME, but also moved the "instanceof Cleaner"
inside the try/catch with a comment that it "sometimes" can throw an OOME. I understand
this was done because we're not 100% sure if an OOME can still happen here, despite the pre-loading.

However, if it can throw an OOME that means it's allocating, which in turn means it can provoke a
GC. If that happens, it looks to me like we have a bug here. The ReferenceHandler thread is not
allowed to provoke a GC while it's holding on to the pending list lock, since the pending list
might be updated during a GC and "pending = r.discovered" will then overwrite something
other than "r", silently dropping any newly discovered References which will never be
discovered by the GC again.
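
The lost-update scenario described above can be sketched as a single-threaded simulation. It assumes that a GC adds newly discovered references by prepending them to the pending list; the class LostReferenceSketch, its Node type and simulatedGc() are hypothetical and only mimic the shape of the Reference fields named in the thread.

```java
// Hypothetical simulation; not the JDK ReferenceHandler code.
final class LostReferenceSketch {
    /** Stand-in for a Reference linked through its 'discovered' field. */
    static final class Node {
        final String name;
        Node discovered;
        Node(String name) { this.name = name; }
    }

    static Node pending;  // head of the simulated pending list

    /** Assumption: a GC prepends newly discovered nodes to the pending list. */
    static void simulatedGc(Node fresh) {
        fresh.discovered = pending;
        pending = fresh;
    }

    /** Number of nodes still reachable from 'pending'. */
    static int pendingLength() {
        int n = 0;
        for (Node p = pending; p != null; p = p.discovered) {
            n++;
        }
        return n;
    }

    /**
     * Reads the head, lets a "GC" run (as an allocation between the read and
     * the unlink might), then unlinks as if the head had not changed.
     */
    static Node unlinkWithGcInBetween(Node fresh) {
        Node r = pending;
        simulatedGc(fresh);      // 'pending' now points at 'fresh', not at 'r'
        pending = r.discovered;  // overwrites the new head: 'fresh' is dropped
        r.discovered = null;
        return r;
    }

    public static void main(String[] args) {
        Node a = new Node("a");
        Node b = new Node("b");
        a.discovered = b;
        pending = a;
        Node r = unlinkWithGcInBetween(new Node("c"));
        System.out.println(r.name + " unlinked, " + pendingLength() + " left on the pending list");
    }
}
```

In this run the list ends up holding only "b", although "b" and "c" should both still be pending; "c" is no longer reachable from the list head.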

On the other hand, if an OOME can never happen (i.e. no GC) here then we're
good and the comment is just incorrect. The instanceof check could be moved out of
the try/catch block again, like it was prior to this change, just to make it
obvious that we will not be able to cause new allocations inside the critical
section. Or at a minimum, the comment saying OOME can still happen should be
adjusted.

Thoughts?

thanks,
Per

Btw, to the best of my knowledge, the pre-loading of Cleaner should avoid any
GC activity from instanceof, but I can't say that I am 100% sure either.

Per - I think you are raising the same issue as discussed in
https://bugs.openjdk.java.net/browse/JDK-8055232.

Ah, thanks Kim for pointing that out.

cheers,
Per

Re: Analysis on JDK-8022321 java/lang/ref/OOMEInReferenceHandler.java fails intermittently

2016-03-22 Thread Per Liden


Hi Peter,

On 2016-03-21 16:13, Peter Levart wrote:

Hi Per, David,

As things stand, there is a very good chance that sun.misc.Cleaner will
go away in JDK9, so all this speculation about the source of OOME(s) can
be put to rest. But for JDK 8u, I agree that this should be sorted out.

My feeling is that (instanceof Cleaner) can not result in allocation and
therefore can not trigger OOME if the Cleaner class is already loaded at
that time. I think that we were chasing the wrong rabbit. As I have
found later, there is a much more probable cause for ReferenceHandler
thread dying with OOME after the fix to catch OOME from lock.wait(). It
is triggered by the invocation of Cleaner.clean() later down in the
code. I even created a reproducer for it. See my last two comments of
the following issue:

https://bugs.openjdk.java.net/browse/JDK-8066859

(but don't look at the proposed fix since it is not very good)


I think that for JDK 8u we could revert the code and do (instanceof
Cleaner) checks outside the synchronized block and in addition, find a
way to handle the OOME thrown from Cleaner.clean().

What do you think?


That sounds good to me. With the addition of the try/catch around
Cleaner.clean() catching not just OOME, but all Throwables, right?
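
A rough sketch of what guarding the cleanup call against every Throwable could look like. The class GuardedCleanSketch and its methods are hypothetical, and a plain Runnable stands in for Cleaner.clean(); what the real handler should do with a caught Throwable is left open here, as it is in the thread.

```java
// Hypothetical sketch; a Runnable stands in for Cleaner.clean().
final class GuardedCleanSketch {
    /** Runs one cleanup action; returns what it threw, or null on success. */
    static Throwable runGuarded(Runnable cleanup) {
        try {
            cleanup.run();
            return null;
        } catch (Throwable t) {
            // Catches OutOfMemoryError as well as any other Error or exception,
            // so the calling loop (and its thread) keeps running.
            return t;
        }
    }

    /** Processes every action even if some of them fail; returns the failure count. */
    static int processAll(Runnable[] cleanups) {
        int failures = 0;
        for (Runnable c : cleanups) {
            if (runGuarded(c) != null) {
                failures++;
            }
        }
        return failures;
    }

    public static void main(String[] args) {
        int[] ran = {0};
        Runnable ok = () -> ran[0]++;
        Runnable bad = () -> { throw new OutOfMemoryError("simulated"); };
        int failures = processAll(new Runnable[] { ok, bad, ok });
        System.out.println(ran[0] + " ran, " + failures + " failed");
    }
}
```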


cheers,
Per




Regards, Peter


On 03/21/2016 02:41 PM, Per Liden wrote:

Hi David,

On 2016-03-21 13:49, David Holmes wrote:

Hi Per,

On 21/03/2016 10:20 PM, Per Liden wrote:

Hi Peter & David,

(Resurrecting an old thread here...)

On 2014-01-22 03:19, David Holmes wrote:

Hi Peter,

On 22/01/2014 12:00 AM, Peter Levart wrote:

Hi, David, Kalyan,

Summing up the discussion, I propose the following patch for
ReferenceHandler:

http://cr.openjdk.java.net/~plevart/jdk9-dev/OOMEInReferenceHandler/webrev.01/






I can live with it, though it may be that once Cleaner has been preloaded
instanceof can no longer throw OOME. Can't be 100% sure. And there's
some duplication/verbosity in the commentary that could be trimmed
down :)


While investigating a Reference pending list issue on the GC side of
things I looked at the ReferenceHandler thread and noticed something
which made me uneasy. The fix for JDK-8022321 added pre-loading of the
Cleaner class to avoid OOME, but also moved the "instanceof Cleaner"
inside the try/catch with a comment that it "sometimes" can throw an
OOME. I understand this was done because we're not 100% sure if an OOME
can still happen here, despite the pre-loading.

However, if it can throw an OOME that means it's allocating, which in
turn means it can provoke a GC. If that happens, it looks to me like we
have a bug here. The ReferenceHandler thread is not allowed to provoke a
GC while it's holding on to the pending list lock, since the pending
list might be updated during a GC and "pending = r.discovered" will then
overwrite something other than "r", silently dropping any newly
discovered References which will never be discovered by the GC again.


Then the code was completely broken because it was obviously capable of
allocating whilst holding the lock. There is nothing in the Java code to
indicate allocation should not happen and no way that Java code can
directly control that! We were only fixing the problem of the exception
killing the thread, not trying to address an undisclosed illegal
allocation problem!


JDK-8022321 did indeed fix a real issue. It might also have
unintentionally introduced a new one. Prior to JDK-8022321 we knew
that the ReferenceHandler couldn't provoke a GC while manipulating the
pending list, since the code was:

synchronized (lock) {
    if (pending != null) {
        r = pending;
        pending = r.discovered;
        r.discovered = null;
    } else {

    }
}

The manipulation of the pending list is built on some secret/ugly
rules and handshakes between the GC and the ReferenceHandler, which
only works because we control both.



How would a GC thread update pending if the ReferenceHandlerThread holds
the lock?


The pending list lock is grabbed by the Java thread issuing the VM
operation, on behalf of the GC to allow the GC to manipulate the
pending list. If the thread issuing the VM operation is the
ReferenceHandler, then the monitor is taken recursively, which is ok
as long as ReferenceHandler isn't in the middle of unlinking an element.
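
Java monitors are reentrant, which is why holding the lock does not protect the handler from an update made on its own thread. The following is a minimal illustration only; ReentrantMonitorSketch, vmOperation() and handlerStep() are hypothetical names, and the "VM operation" here is just an ordinary method call.

```java
// Hypothetical illustration of monitor reentrancy; not VM code.
final class ReentrantMonitorSketch {
    static final Object lock = new Object();
    static int pendingUpdates;  // shared state guarded by 'lock'

    /** Stands in for the VM operation: takes the lock and updates shared state. */
    static void vmOperation() {
        synchronized (lock) {
            pendingUpdates++;
        }
    }

    /**
     * Stands in for the handler: triggers the operation while already holding
     * the lock. Returns true if the guarded state changed underneath it.
     */
    static boolean handlerStep() {
        synchronized (lock) {
            int before = pendingUpdates;
            vmOperation();  // does not block: this thread already owns the monitor
            return pendingUpdates != before;
        }
    }

    public static void main(String[] args) {
        System.out.println("state changed while lock was held: " + handlerStep());
    }
}
```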




On the other hand, if an OOME can never happen (i.e. no GC) here then
we're good and the comment is just incorrect. The instanceof check could be
moved out of the try/catch block again, like it was prior to this
change, just to make it obvious that we will not be able to cause new
allocations inside the critical section. Or at a minimum, the comment
saying OOME can still happen should be adjusted.


I found it very difficult to determine with 100% certainty whether or
not the instanceof could ever trigger an allocation and hence
potentially an OOME.


I agree, it's not obvious.

cheers,
Per



With JVMCI it is now easier to imagine that compilation of

Re: Analysis on JDK-8022321 java/lang/ref/OOMEInReferenceHandler.java fails intermittently

2016-03-21 Thread David Holmes

On 22/03/2016 3:32 AM, Kim Barrett wrote:

On Mar 21, 2016, at 8:20 AM, Per Liden wrote:

Hi Peter & David,

(Resurrecting an old thread here...)

On 2014-01-22 03:19, David Holmes wrote:

Hi Peter,

On 22/01/2014 12:00 AM, Peter Levart wrote:

Hi, David, Kalyan,

Summing up the discussion, I propose the following patch for
ReferenceHandler:

http://cr.openjdk.java.net/~plevart/jdk9-dev/OOMEInReferenceHandler/webrev.01/

Thoughts?

thanks,
Per

Btw, to the best of my knowledge, the pre-loading of Cleaner should avoid any
GC activity from instanceof, but I can't say that I am 100% sure either.

Per - I think you are raising the same issue as discussed in
https://bugs.openjdk.java.net/browse/JDK-8055232.

That bug somehow escaped my notice as well. :(

Thanks,
David
-

Re: Analysis on JDK-8022321 java/lang/ref/OOMEInReferenceHandler.java fails intermittently

2016-03-21 Thread David Holmes




On 21/03/2016 11:41 PM, Per Liden wrote:

Hi David,

On 2016-03-21 13:49, David Holmes wrote:

Hi Per,

On 21/03/2016 10:20 PM, Per Liden wrote:

Hi Peter & David,

(Resurrecting an old thread here...)

On 2014-01-22 03:19, David Holmes wrote:

Hi Peter,

On 22/01/2014 12:00 AM, Peter Levart wrote:

Hi, David, Kalyan,

Summing up the discussion, I propose the following patch for
ReferenceHandler:

http://cr.openjdk.java.net/~plevart/jdk9-dev/OOMEInReferenceHandler/webrev.01/






I can live with it, though it may be that once Cleaner has been preloaded
instanceof can no longer throw OOME. Can't be 100% sure. And there's
some duplication/verbosity in the commentary that could be trimmed
down :)


While investigating a Reference pending list issue on the GC side of
things I looked at the ReferenceHandler thread and noticed something
which made me uneasy. The fix for JDK-8022321 added pre-loading of the
Cleaner class to avoid OOME, but also moved the "instanceof Cleaner"
inside the try/catch with a comment that it "sometimes" can throw an
OOME. I understand this was done because we're not 100% sure if an OOME
can still happen here, despite the pre-loading.

However, if it can throw an OOME that means it's allocating, which in
turn means it can provoke a GC. If that happens, it looks to me like we
have a bug here. The ReferenceHandler thread is not allowed to provoke a
GC while it's holding on to the pending list lock, since the pending
list might be updated during a GC and "pending = r.discovered" will then
overwrite something other than "r", silently dropping any newly
discovered References which will never be discovered by the GC again.


Then the code was completely broken because it was obviously capable of
allocating whilst holding the lock. There is nothing in the Java code to
indicate allocation should not happen and no way that Java code can
directly control that! We were only fixing the problem of the exception
killing the thread, not trying to address an undisclosed illegal
allocation problem!


JDK-8022321 did indeed fix a real issue. It might also have
unintentionally introduced a new one. Prior to JDK-8022321 we knew that
the ReferenceHandler couldn't provoke a GC while manipulating the
pending list, since the code was:

synchronized (lock) {
    if (pending != null) {
        r = pending;
        pending = r.discovered;
        r.discovered = null;
    } else {

    }
}


Except that it actually could if the wait() in the else part was 
interrupted. But yes the move of instanceof did add another potential 
allocation point (as follow up bugs showed) but the pre-loading does 
seem to have addressed that (though perhaps not with 100% certainty).
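
As a small illustration of the first point: an interrupted Object.wait() completes by throwing an InterruptedException, which is an object that has to be created while the monitor is held. The class InterruptedWaitSketch below is hypothetical and only demonstrates that behaviour; it says nothing about how HotSpot allocates the exception internally.

```java
// Hypothetical demonstration; not the ReferenceHandler code.
final class InterruptedWaitSketch {
    /**
     * Sets the current thread's interrupt flag and then waits on a private
     * monitor. Returns true if wait() ended with an InterruptedException.
     */
    static boolean waitThrowsWhenInterrupted() {
        Object lock = new Object();
        Thread.currentThread().interrupt();
        synchronized (lock) {
            try {
                lock.wait(1000);  // throws at once because the flag is already set
                return false;
            } catch (InterruptedException e) {
                // 'e' is a new exception object delivered while 'lock' is held
                return true;
            }
        }
    }

    public static void main(String[] args) {
        System.out.println("wait() threw InterruptedException: " + waitThrowsWhenInterrupted());
    }
}
```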



The manipulation of the pending list is built on some secret/ugly rules
and handshakes between the GC and the ReferenceHandler, which only works
because we control both.


Unfortunately implicit allocation was not given enough consideration. 
Which really makes me concerned about the possibility of this code being 
JIT-compiled by a Java compiler under JVMCI!




How would a GC thread update pending if the ReferenceHandlerThread holds
the lock?


The pending list lock is grabbed by the Java thread issuing the VM
operation, on behalf of the GC to allow the GC to manipulate the
pending list. If the thread issuing the VM operation is the
ReferenceHandler, then the monitor is taken recursively, which is ok as
long as ReferenceHandler isn't in the middle of unlinking an element.


Ah I see.

Thanks,
David
-




On the other hand, if an OOME can never happen (i.e. no GC) here then
we're good and the comment is just incorrect. The instanceof check could be
moved out of the try/catch block again, like it was prior to this
change, just to make it obvious that we will not be able to cause new
allocations inside the critical section. Or at a minimum, the comment
saying OOME can still happen should be adjusted.


I found it very difficult to determine with 100% certainty whether or
not the instanceof could ever trigger an allocation and hence
potentially an OOME.


I agree, it's not obvious.

cheers,
Per



With JVMCI it is now easier to imagine that compilation of this code by
a JVMCI compiler might lead to allocation while the lock is held!

Cheers,
David


Thoughts?

thanks,
Per

Btw, to the best of my knowledge, the pre-loading of Cleaner should
avoid any GC activity from instanceof, but I can't say that I am 100%
sure either.



Any specific reason to use Unsafe to do the preload rather than
Class.forName ? Does this force Unsafe to be loaded earlier than it
otherwise would?
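
For reference, a minimal sketch of the Class.forName alternative raised in the question. PreloadSketch and its nested Target class are hypothetical; the three-argument Class.forName with initialize set to true is the standard way to force loading and initialization of a class before it is first used.

```java
// Hypothetical sketch of preloading a class with Class.forName.
final class PreloadSketch {
    static boolean initialized;  // flipped by Target's static initializer

    /** Stand-in for the class that should be loaded and initialized early. */
    static final class Target {
        static {
            PreloadSketch.initialized = true;
        }
    }

    /** Forces Target to be loaded and initialized now rather than on first use. */
    static void preload() {
        try {
            // Using the class literal to obtain the name does not itself
            // initialize Target; the forName call with 'true' does.
            Class.forName(Target.class.getName(), true,
                          PreloadSketch.class.getClassLoader());
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("initialized before preload: " + initialized);
        preload();
        System.out.println("initialized after preload: " + initialized);
    }
}
```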

Thanks,
David



all 10 java/lang/ref tests pass on my PC (including
OOMEInReferenceHandler).

I kindly ask Kalyan to try to re-run the OOMEInReferenceHandler test
with this code and report any failure.


Thanks, Peter

On 01/21/2014 08:57 AM, David Holmes wrote:

On 21/01/2014 4:54 PM, Peter Levart wrote:


On 01/21/2014 03:22 AM, David Holmes wrote:

Hi

Re: Analysis on JDK-8022321 java/lang/ref/OOMEInReferenceHandler.java fails intermittently

2016-03-21 Thread Kim Barrett

> On Mar 21, 2016, at 8:20 AM, Per Liden  wrote:
> 
> Hi Peter & David,
> 
> (Resurrecting an old thread here...)
> 
> On 2014-01-22 03:19, David Holmes wrote:
>> Hi Peter,
>> 
>> On 22/01/2014 12:00 AM, Peter Levart wrote:
>>> Hi, David, Kalyan,
>>> 
>>> Summing up the discussion, I propose the following patch for
>>> ReferenceHandler:
>>> 
>>> http://cr.openjdk.java.net/~plevart/jdk9-dev/OOMEInReferenceHandler/webrev.01/
>>> 
>> 
>> I can live with it, though it may be that once Cleaner has been preloaded
>> instanceof can no longer throw OOME. Can't be 100% sure. And there's
>> some duplication/verbosity in the commentary that could be trimmed down :)
> 
> While investigating a Reference pending list issue on the GC side of things I
> looked at the ReferenceHandler thread and noticed something which made me
> uneasy. The fix for JDK-8022321 added pre-loading of the Cleaner class to
> avoid OOME, but also moved the "instanceof Cleaner" inside the try/catch with
> a comment that it "sometimes" can throw an OOME. I understand this was done
> because we're not 100% sure if an OOME can still happen here, despite the
> pre-loading.
>
> However, if it can throw an OOME that means it's allocating, which in turn
> means it can provoke a GC. If that happens, it looks to me like we have a bug
> here. The ReferenceHandler thread is not allowed to provoke a GC while it's
> holding on to the pending list lock, since the pending list might be updated
> during a GC and "pending = r.discovered" will then overwrite something other
> than "r", silently dropping any newly discovered References which will never
> be discovered by the GC again.
>
> On the other hand, if an OOME can never happen (i.e. no GC) here then we're
> good and the comment is just incorrect. The instanceof check could be moved out
> of the try/catch block again, like it was prior to this change, just to make
> it obvious that we will not be able to cause new allocations inside the
> critical section. Or at a minimum, the comment saying OOME can still happen
> should be adjusted.
> 
> Thoughts?
> 
> thanks,
> Per
> 
> Btw, to the best of my knowledge, the pre-loading of Cleaner should avoid any
> GC activity from instanceof, but I can't say that I am 100% sure either.

Per - I think you are raising the same issue as discussed in 
https://bugs.openjdk.java.net/browse/JDK-8055232.

Re: Analysis on JDK-8022321 java/lang/ref/OOMEInReferenceHandler.java fails intermittently

2016-03-21 Thread Peter Levart




On 03/21/2016 04:13 PM, Peter Levart wrote:

Hi Per, David,

As things stand, there is a very good chance that sun.misc.Cleaner 
will go away in JDK9, so all this speculation about the source of 
OOME(s) can be put to rest. But for JDK 8u, I agree that this should 
be sorted out.


My feeling is that (instanceof Cleaner) can not result in allocation 
and therefore can not trigger OOME if the Cleaner class is already 
loaded at that time. I think that we were chasing the wrong rabbit. As 
I have found later, there is a much more probable cause for 
ReferenceHandler thread dying with OOME after the fix to catch OOME 
from lock.wait(). It is triggered by the invocation of Cleaner.clean() 
later down in the code. I even created a reproducer for it. See my 
last two comments of the following issue:


https://bugs.openjdk.java.net/browse/JDK-8066859

(but don't look at the proposed fix since it is not very good)


I think that for JDK 8u we could revert the code and do (instanceof 
Cleaner) checks outside the synchronized block and in addition, find a 
way to handle the OOME thrown from Cleaner.clean().


What do you think?


Regards, Peter


OTOH, If you are not 100% sure about instanceof doing allocation, then a 
simple fix would be to re-check the 'pending' field if it still points 
to the same object as before instanceof check:



synchronized (lock) {
    while ((r = pending) != null) {
        // 'instanceof' might throw OutOfMemoryError sometimes
        // so do this before un-linking 'r' from the 'pending' chain...
        c = r instanceof Cleaner ? (Cleaner) r : null;
        // unlink 'r' from 'pending' chain if it is still the same as before
        // 'instanceof' check which might have triggered GC and GC might
        // have discovered some more references and hooked them on
        // the pending list...
        if (pending == r) {
            pending = r.discovered;
            r.discovered = null;
            break;
        }
    }
    if (r == null) {
        // The waiting on the lock may cause an OutOfMemoryError
        // because it may try to allocate exception objects.
        if (waitForNotify) {
            lock.wait();
        }
        // retry if waited
        return waitForNotify;
    }
}
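
The re-check in the snippet above can be exercised with a small single-threaded simulation. It assumes, as a simplification, that a GC prepends newly discovered references to the pending list; RecheckSketch, its Node type, simulatedGc() and the allocationPoint callback are hypothetical and merely stand in for the instanceof check that might trigger a GC.

```java
// Hypothetical simulation of the re-check idea; not the JDK code.
final class RecheckSketch {
    /** Stand-in for a Reference linked through its 'discovered' field. */
    static final class Node {
        final String name;
        Node discovered;
        Node(String name) { this.name = name; }
    }

    static Node pending;  // head of the simulated pending list

    /** Assumption: a GC prepends newly discovered nodes to the pending list. */
    static void simulatedGc(Node fresh) {
        fresh.discovered = pending;
        pending = fresh;
    }

    /**
     * Unlinks the head only if it is still the head after the possibly
     * allocating step; otherwise starts over with the new head.
     */
    static Node unlinkWithRecheck(Runnable allocationPoint) {
        Node r;
        while ((r = pending) != null) {
            allocationPoint.run();  // may change 'pending' behind our back
            if (pending == r) {
                pending = r.discovered;
                r.discovered = null;
                break;
            }
        }
        return r;
    }

    public static void main(String[] args) {
        Node a = new Node("a");
        Node c = new Node("c");
        pending = a;
        boolean[] fired = {false};
        // The simulated GC runs once, during the first pass through the loop.
        Node r = unlinkWithRecheck(() -> {
            if (!fired[0]) {
                fired[0] = true;
                simulatedGc(c);
            }
        });
        System.out.println(r.name + " unlinked, head is now " + pending.name);
    }
}
```

Here the first pass notices that the head changed from "a" to "c" and retries, so "c" is unlinked and "a" stays on the list; nothing is dropped.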


Regards, Peter




On 03/21/2016 02:41 PM, Per Liden wrote:

Hi David,

On 2016-03-21 13:49, David Holmes wrote:

Hi Per,

On 21/03/2016 10:20 PM, Per Liden wrote:

Hi Peter & David,

(Resurrecting an old thread here...)

On 2014-01-22 03:19, David Holmes wrote:

Hi Peter,

On 22/01/2014 12:00 AM, Peter Levart wrote:

Hi, David, Kalyan,

Summing up the discussion, I propose the following patch for
ReferenceHandler:

http://cr.openjdk.java.net/~plevart/jdk9-dev/OOMEInReferenceHandler/webrev.01/ 







I can live with it, though it may be that once Cleaner has been preloaded
instanceof can no longer throw OOME. Can't be 100% sure. And there's
some duplication/verbosity in the commentary that could be trimmed
down :)


While investigating a Reference pending list issue on the GC side of
things I looked at the ReferenceHandler thread and noticed something
which made me uneasy. The fix for JDK-8022321 added pre-loading of the
Cleaner class to avoid OOME, but also moved the "instanceof Cleaner"
inside the try/catch with a comment that it "sometimes" can throw an
OOME. I understand this was done because we're not 100% sure if an OOME
can still happen here, despite the pre-loading.

However, if it can throw an OOME that means it's allocating, which in
turn means it can provoke a GC. If that happens, it looks to me like we
have a bug here. The ReferenceHandler thread is not allowed to provoke a
GC while it's holding on to the pending list lock, since the pending
list might be updated during a GC and "pending = r.discovered" will then
overwrite something other than "r", silently dropping any newly
discovered References which will never be discovered by the GC again.


Then the code was completely broken because it was obviously capable of
allocating whilst holding the lock. There is nothing in the Java code to
indicate allocation should not happen and no way that Java code can
directly control that! We were only fixing the problem of the exception
killing the thread, not trying to address an undisclosed illegal
allocation problem!


JDK-8022321 did indeed fix a real issue. It might also have 
unintentionally introduced a new one. Prior to JDK-8022321 we knew 
that the ReferenceHandler couldn't provoke a GC while manipulating 
the pending list, since the code was:


synchronized (lock) {
    if (pending != null) {
        r = pending;
        pending = r.discovered;
        r.discovered

Re: Analysis on JDK-8022321 java/lang/ref/OOMEInReferenceHandler.java fails intermittently

2016-03-21 Thread Peter Levart


Hi Per,

May I point you to my proposed change in Reference(Handler) for JDK 9, 
being discussed in the thread about JDK-8149925. It will hopefully 
remove the special-casing of sun.misc.Cleaner, change the way how 
pending references are being enqueued by ReferenceHandler thread and how 
other thread(s) can synchronize with it. Since you seem to have a great 
knowledge of VM part of things, I would very much like to hear what you 
think of that change. Here's the latest webrev:


http://cr.openjdk.java.net/~plevart/jdk9-dev/removeInternalCleaner/webrev.08.part2/

(see Reference.java and Bits.java for an example of how this 
synchronization with ReferenceHandler thread is to be used)


Regards, Peter

On 03/21/2016 04:13 PM, Peter Levart wrote:

Hi Per, David,

As things stand, there is a very good chance that sun.misc.Cleaner 
will go away in JDK9, so all this speculation about the source of 
OOME(s) can be put to rest. But for JDK 8u, I agree that this should 
be sorted out.


My feeling is that (instanceof Cleaner) can not result in allocation 
and therefore can not trigger OOME if the Cleaner class is already 
loaded at that time. I think that we were chasing the wrong rabbit. As 
I have found later, there is a much more probable cause for 
ReferenceHandler thread dying with OOME after the fix to catch OOME 
from lock.wait(). It is triggered by the invocation of Cleaner.clean() 
later down in the code. I even created a reproducer for it. See my 
last two comments of the following issue:


https://bugs.openjdk.java.net/browse/JDK-8066859

(but don't look at the proposed fix since it is not very good)


I think that for JDK 8u we could revert the code and do (instanceof 
Cleaner) checks outside the synchronized block and in addition, find a 
way to handle the OOME thrown from Cleaner.clean().


What do you think?


Regards, Peter


On 03/21/2016 02:41 PM, Per Liden wrote:

Hi David,

On 2016-03-21 13:49, David Holmes wrote:

Hi Per,

On 21/03/2016 10:20 PM, Per Liden wrote:

Hi Peter & David,

(Resurrecting an old thread here...)

On 2014-01-22 03:19, David Holmes wrote:

Hi Peter,

On 22/01/2014 12:00 AM, Peter Levart wrote:

Hi, David, Kalyan,

Summing up the discussion, I propose the following patch for
ReferenceHandler:

http://cr.openjdk.java.net/~plevart/jdk9-dev/OOMEInReferenceHandler/webrev.01/ 







I can live with it, though it may be that once Cleaner has been preloaded
instanceof can no longer throw OOME. Can't be 100% sure. And there's
some duplication/verbosity in the commentary that could be trimmed
down :)


While investigating a Reference pending list issue on the GC side of
things I looked at the ReferenceHandler thread and noticed something
which made me uneasy. The fix for JDK-8022321 added pre-loading of the
Cleaner class to avoid OOME, but also moved the "instanceof Cleaner"
inside the try/catch with a comment that it "sometimes" can throw an
OOME. I understand this was done because we're not 100% sure if an OOME
can still happen here, despite the pre-loading.

However, if it can throw an OOME that means it's allocating, which in
turn means it can provoke a GC. If that happens, it looks to me like we
have a bug here. The ReferenceHandler thread is not allowed to provoke a
GC while it's holding on to the pending list lock, since the pending
list might be updated during a GC and "pending = r.discovered" will then
overwrite something other than "r", silently dropping any newly
discovered References which will never be discovered by the GC again.


Then the code was completely broken because it was obviously capable of
allocating whilst holding the lock. There is nothing in the Java code to
indicate allocation should not happen and no way that Java code can
directly control that! We were only fixing the problem of the exception
killing the thread, not trying to address an undisclosed illegal
allocation problem!


JDK-8022321 did indeed fix a real issue. It might also have 
unintentionally introduced a new one. Prior to JDK-8022321 we knew 
that the ReferenceHandler couldn't provoke a GC while manipulating 
the pending list, since the code was:


synchronized (lock) {
    if (pending != null) {
        r = pending;
        pending = r.discovered;
        r.discovered = null;
    } else {

    }
}

The manipulation of the pending list is built on some secret/ugly
rules and handshakes between the GC and the ReferenceHandler, which
only works because we control both.




How would a GC thread update pending if the ReferenceHandlerThread holds
the lock?


The pending list lock is grabbed by the Java thread issuing the VM
operation, on behalf of the GC to allow the GC to manipulate the
pending list. If the thread issuing the VM operation is the
ReferenceHandler, then the monitor is taken recursively, which is ok
as long as ReferenceHandler isn't in the middle of unlinking an element.





On the other hand, if an OOME can never happen (i.e. no GC) here

Re: Analysis on JDK-8022321 java/lang/ref/OOMEInReferenceHandler.java fails intermittently

2016-03-21 Thread Peter Levart


Hi Per, David,

As things stand, there is a very good chance that sun.misc.Cleaner will 
go away in JDK9, so all this speculation about the source of OOME(s) can 
be put to rest. But for JDK 8u, I agree that this should be sorted out.


My feeling is that (instanceof Cleaner) can not result in allocation and 
therefore can not trigger OOME if the Cleaner class is already loaded at 
that time. I think that we were chasing the wrong rabbit. As I have 
found later, there is a much more probable cause for ReferenceHandler 
thread dying with OOME after the fix to catch OOME from lock.wait(). It 
is triggered by the invocation of Cleaner.clean() later down in the 
code. I even created a reproducer for it. See my last two comments of 
the following issue:


https://bugs.openjdk.java.net/browse/JDK-8066859

(but don't look at the proposed fix since it is not very good)


I think that for JDK 8u we could revert the code and do (instanceof 
Cleaner) checks outside the synchronized block and in addition, find a 
way to handle the OOME thrown from Cleaner.clean().


What do you think?


Regards, Peter


On 03/21/2016 02:41 PM, Per Liden wrote:

Hi David,

On 2016-03-21 13:49, David Holmes wrote:

Hi Per,

On 21/03/2016 10:20 PM, Per Liden wrote:

Hi Peter & David,

(Resurrecting an old thread here...)

On 2014-01-22 03:19, David Holmes wrote:

Hi Peter,

On 22/01/2014 12:00 AM, Peter Levart wrote:

Hi, David, Kalyan,

Summing up the discussion, I propose the following patch for
ReferenceHandler:

http://cr.openjdk.java.net/~plevart/jdk9-dev/OOMEInReferenceHandler/webrev.01/ 







I can live with it, though it may be that once Cleaner has been preloaded
instanceof can no longer throw OOME. Can't be 100% sure. And there's
some duplication/verbosity in the commentary that could be trimmed
down :)


While investigating a Reference pending list issue on the GC side of
things I looked at the ReferenceHandler thread and noticed something
which made me uneasy. The fix for JDK-8022321 added pre-loading of the
Cleaner class to avoid OOME, but also moved the "instanceof Cleaner"
inside the try/catch with a comment that it "sometimes" can throw an
OOME. I understand this was done because we're not 100% sure if an OOME
can still happen here, despite the pre-loading.

However, if it can throw an OOME that means it's allocating, which in
turn means it can provoke a GC. If that happens, it looks to me like we
have a bug here. The ReferenceHandler thread is not allowed to provoke a
GC while it's holding on to the pending list lock, since the pending
list might be updated during a GC and "pending = r.discovered" will then
overwrite something other than "r", silently dropping any newly
discovered References which will never be discovered by the GC again.


Then the code was completely broken because it was obviously capable of
allocating whilst holding the lock. There is nothing in the Java code to
indicate allocation should not happen and no way that Java code can
directly control that! We were only fixing the problem of the exception
killing the thread, not trying to address an undisclosed illegal
allocation problem!


JDK-8022321 did indeed fix a real issue. It might also have 
unintentionally introduced a new one. Prior to JDK-8022321 we knew 
that the ReferenceHandler couldn't provoke a GC while manipulating the 
pending list, since the code was:


synchronized (lock) {
    if (pending != null) {
        r = pending;
        pending = r.discovered;
        r.discovered = null;
    } else {

    }
}

The manipulation of the pending list is built on some secret/ugly
rules and handshakes between the GC and the ReferenceHandler, which
only works because we control both.




How would a GC thread update pending if the ReferenceHandlerThread holds
the lock?


The pending list lock is grabbed by the Java thread issuing the VM
operation, on behalf of the GC to allow the GC to manipulate the
pending list. If the thread issuing the VM operation is the
ReferenceHandler, then the monitor is taken recursively, which is ok
as long as ReferenceHandler isn't in the middle of unlinking an element.





On the other hand, if an OOME can never happen (i.e. no GC) here then
we're good and the comment is just incorrect. The instanceof check could be
moved out of the try/catch block again, like it was prior to this
change, just to make it obvious that we will not be able to cause new
allocations inside the critical section. Or at a minimum, the comment
saying OOME can still happen should be adjusted.


I found it very difficult to determine with 100% certainty whether or
not the instanceof could ever trigger an allocation and hence
potentially an OOME.


I agree, it's not obvious.

cheers,
Per



With JVMCI it is now easier to imagine that compilation of this code by
a JVMCI compiler might lead to allocation while the lock is held!

Cheers,
David


Thoughts?

thanks,
Per

Btw, to the best of my knowledge, the pre-loading of

Re: Analysis on JDK-8022321 java/lang/ref/OOMEInReferenceHandler.java fails intermittently

2016-03-21 Thread Per Liden


Hi David,

On 2016-03-21 13:49, David Holmes wrote:

Hi Per,

On 21/03/2016 10:20 PM, Per Liden wrote:

Hi Peter & David,

(Resurrecting an old thread here...)

On 2014-01-22 03:19, David Holmes wrote:

Hi Peter,

On 22/01/2014 12:00 AM, Peter Levart wrote:

Hi, David, Kalyan,

Summing up the discussion, I propose the following patch for
ReferenceHandler:

http://cr.openjdk.java.net/~plevart/jdk9-dev/OOMEInReferenceHandler/webrev.01/





I can live with it, though it may be that once Cleaner has been preloaded
instanceof can no longer throw OOME. Can't be 100% sure. And there's
some duplication/verbosity in the commentary that could be trimmed
down :)


While investigating a Reference pending list issue on the GC side of
things I looked at the ReferenceHandler thread and noticed something
which made me uneasy. The fix for JDK-8022321 added pre-loading of the
Cleaner class to avoid OOME, but also moved the "instanceof Cleaner"
inside the try/catch with a comment that it "sometimes" can throw an
OOME. I understand this was done because we're not 100% sure if an OOME
can still happen here, despite the pre-loading.

However, if it can throw an OOME that means it's allocating, which in
turn means it can provoke a GC. If that happens, it looks to me like we
have a bug here. The ReferenceHandler thread is not allowed to provoke a
GC while it's holding on to the pending list lock, since the pending
list might be updated during a GC and "pending = r.discovered" will then
overwrite something other than "r", silently dropping any newly
discovered References, which will never be discovered by the GC again.


Then the code was completely broken because it was obviously capable of
allocating whilst holding the lock. There is nothing in the Java code to
indicate allocation should not happen and no way that Java code can
directly control that! We were only fixing the problem of the exception
killing the thread, not trying to address an undisclosed illegal
allocation problem!


JDK-8022321 did indeed fix a real issue. It might also have 
unintentionally introduced a new one. Prior to JDK-8022321 we knew that 
the ReferenceHandler couldn't provoke a GC while manipulating the 
pending list, since the code was:


synchronized (lock) {
    if (pending != null) {
        r = pending;
        pending = r.discovered;
        r.discovered = null;
    } else {

    }
}

The manipulation of the pending list is built on some secret/ugly rules
and handshakes between the GC and the ReferenceHandler, which only work
because we control both.




How would a GC thread update pending if the ReferenceHandlerThread holds
the lock?


The pending list lock is grabbed by the Java thread issuing the VM
operation, on behalf of the GC, to allow the GC to manipulate the
pending list. If the thread issuing the VM operation is the
ReferenceHandler, then the monitor is taken recursively, which is OK as
long as the ReferenceHandler isn't in the middle of unlinking an element.
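
To make "taken recursively" concrete at the Java level, here is a minimal sketch. It only shows that a thread which already owns a monitor can enter it again without blocking; it does not model the VM operation or the GC, and the class and method names are hypothetical.

```java
// Hypothetical sketch: Java monitors are reentrant, so the owning thread
// can re-acquire a lock it already holds. Names are illustrative only.
final class ReentrantMonitorSketch {
    private static final Object lock = new Object();

    // Enters the monitor, then re-enters it 'levels' more times from the
    // same thread, and returns the total number of nested acquisitions.
    static int nestedAcquire(int levels) {
        synchronized (lock) {
            if (!Thread.holdsLock(lock)) {
                throw new IllegalStateException("monitor not held");
            }
            return levels == 0 ? 1 : 1 + nestedAcquire(levels - 1);
        }
    }

    public static void main(String[] args) {
        System.out.println("nested acquisitions: " + nestedAcquire(2));
    }
}
```

Reentrancy is why the recursive acquisition itself is harmless; the danger described above is only in what the second holder does to the list while the first is mid-update.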





On the other hand, if an OOME can never happen (i.e. no GC) here then
we're good and the comment is just incorrect. The instanceof check could be
moved out of the try/catch block again, like it was prior to this
change, just to make it obvious that we will not be able to cause new
allocations inside the critical section. Or at a minimum, the comment
saying OOME can still happen should be adjusted.


I found it very difficult to determine with 100% certainty whether or
not the instanceof could ever trigger an allocation and hence
potentially an OOME.


I agree, it's not obvious.

cheers,
Per



With JVMCI it is now easier to imagine that compilation of this code by
a JVMCI compiler might lead to allocation while the lock is held!

Cheers,
David


Thoughts?

thanks,
Per

Btw, to the best of my knowledge, the pre-loading of Cleaner should
avoid any GC activity from instanceof, but I can't say that I am 100%
sure either.



Any specific reason to use Unsafe to do the preload rather than
Class.forName ? Does this force Unsafe to be loaded earlier than it
otherwise would?

Thanks,
David



all 10 java/lang/ref tests pass on my PC (including
OOMEInReferenceHandler).

I kindly ask Kalyan to try to re-run the OOMEInReferenceHandler test
with this code and report any failure.


Thanks, Peter

On 01/21/2014 08:57 AM, David Holmes wrote:

On 21/01/2014 4:54 PM, Peter Levart wrote:


On 01/21/2014 03:22 AM, David Holmes wrote:

Hi Peter,

I do not see Cleaner being loaded prior to the main class on either
Windows or Linux. Which platform are you on? Did you see it loaded
before the main class or as part of executing it?


Before. The main class is empty:

public class Test { public static void main(String... a) {} }

Here's last few lines of -verbose:class:

[Loaded java.util.TimeZone from
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]
[Loaded sun.util.calendar.ZoneInfo from
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]
[Loaded sun.util.calendar.ZoneInfoFile

Re: Analysis on JDK-8022321 java/lang/ref/OOMEInReferenceHandler.java fails intermittently

2016-03-21 Thread David Holmes


Hi Per,

On 21/03/2016 10:20 PM, Per Liden wrote:

Hi Peter & David,

(Resurrecting an old thread here...)

On 2014-01-22 03:19, David Holmes wrote:

Hi Peter,

On 22/01/2014 12:00 AM, Peter Levart wrote:

Hi, David, Kalyan,

Summing up the discussion, I propose the following patch for
ReferenceHandler:

http://cr.openjdk.java.net/~plevart/jdk9-dev/OOMEInReferenceHandler/webrev.01/




I can live with it, though it may be that once Cleaner has been preloaded
instanceof can no longer throw OOME. Can't be 100% sure. And there's
some duplication/verbosity in the commentary that could be trimmed
down :)


While investigating a Reference pending list issue on the GC side of
things I looked at the ReferenceHandler thread and noticed something
which made me uneasy. The fix for JDK-8022321 added pre-loading of the
Cleaner class to avoid OOME, but also moved the "instanceof Cleaner"
inside the try/catch with a comment that it "sometimes" can throw an
OOME. I understand this was done because we're not 100% sure if an OOME
can still happen here, despite the pre-loading.

However, if it can throw an OOME that means it's allocating, which in
turn means it can provoke a GC. If that happens, it looks to me like we
have a bug here. The ReferenceHandler thread is not allowed to provoke a
GC while it's holding on to the pending list lock, since the pending
list might be updated during a GC and "pending = r.discovered" will then
overwrite something other than "r", silently dropping any newly
discovered References, which will never be discovered by the GC again.


Then the code was completely broken because it was obviously capable of 
allocating whilst holding the lock. There is nothing in the Java code to 
indicate allocation should not happen and no way that Java code can 
directly control that! We were only fixing the problem of the exception 
killing the thread, not trying to address an undisclosed illegal 
allocation problem!


How would a GC thread update pending if the ReferenceHandlerThread holds 
the lock?



On the other hand, if an OOME can never happen (i.e. no GC) here then
we're good and the comment is just incorrect. The instanceof check could be
moved out of the try/catch block again, like it was prior to this
change, just to make it obvious that we will not be able to cause new
allocations inside the critical section. Or at a minimum, the comment
saying OOME can still happen should be adjusted.


I found it very difficult to determine with 100% certainty whether or 
not the instanceof could ever trigger an allocation and hence 
potentially an OOME.


With JVMCI it is now easier to imagine that compilation of this code by 
a JVMCI compiler might lead to allocation while the lock is held!


Cheers,
David


Thoughts?

thanks,
Per

Btw, to the best of my knowledge, the pre-loading of Cleaner should
avoid any GC activity from instanceof, but I can't say that I am 100%
sure either.



Any specific reason to use Unsafe to do the preload rather than
Class.forName ? Does this force Unsafe to be loaded earlier than it
otherwise would?

Thanks,
David



all 10 java/lang/ref tests pass on my PC (including
OOMEInReferenceHandler).

I kindly ask Kalyan to try to re-run the OOMEInReferenceHandler test
with this code and report any failure.


Thanks, Peter

On 01/21/2014 08:57 AM, David Holmes wrote:

On 21/01/2014 4:54 PM, Peter Levart wrote:


On 01/21/2014 03:22 AM, David Holmes wrote:

Hi Peter,

I do not see Cleaner being loaded prior to the main class on either
Windows or Linux. Which platform are you on? Did you see it loaded
before the main class or as part of executing it?


Before. The main class is empty:

public class Test { public static void main(String... a) {} }

Here's last few lines of -verbose:class:

[Loaded java.util.TimeZone from
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]
[Loaded sun.util.calendar.ZoneInfo from
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]
[Loaded sun.util.calendar.ZoneInfoFile from
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]
[Loaded sun.util.calendar.ZoneInfoFile$1 from
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]
[Loaded java.io.DataInput from
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]
[Loaded java.io.DataInputStream from
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]
*[Loaded sun.misc.Cleaner from
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]*


Curious. I wonder what the controlling factor is ??



I'm on linux, 64bit and using official EA build 121 of JDK 8...

But if I try with JDK 7u45, I don't see it. So perhaps it would be good
to trigger Cleaner loading and initialization as part of
ReferenceHandler initialization to play things safe.



If we do that for Cleaner we may as well do it for
InterruptedException too.


Also, it is not that I think ReferenceHandler is responsible for
reporting OOME, but that it is responsible for reporting that it was
unable to perform a clean or enqueue because of OOME.


This would be necessary if

Re: Analysis on JDK-8022321 java/lang/ref/OOMEInReferenceHandler.java fails intermittently

2016-03-21 Thread Per Liden


Hi Peter & David,

(Resurrecting an old thread here...)

On 2014-01-22 03:19, David Holmes wrote:

Hi Peter,

On 22/01/2014 12:00 AM, Peter Levart wrote:

Hi, David, Kalyan,

Summing up the discussion, I propose the following patch for
ReferenceHandler:

http://cr.openjdk.java.net/~plevart/jdk9-dev/OOMEInReferenceHandler/webrev.01/



I can live with it, though it may be that once Cleaner has been preloaded
instanceof can no longer throw OOME. Can't be 100% sure. And there's
some duplication/verbosity in the commentary that could be trimmed down :)


While investigating a Reference pending list issue on the GC side of 
things I looked at the ReferenceHandler thread and noticed something 
which made me uneasy. The fix for JDK-8022321 added pre-loading of the
Cleaner class to avoid OOME, but also moved the "instanceof Cleaner"
inside the try/catch with a comment that it "sometimes" can throw an
OOME. I understand this was done because we're not 100% sure if an OOME
can still happen here, despite the pre-loading.


However, if it can throw an OOME that means it's allocating, which in 
turn means it can provoke a GC. If that happens, it looks to me like we 
have a bug here. The ReferenceHandler thread is not allowed to provoke a 
GC while it's holding on to the pending list lock, since the pending 
list might be updated during a GC and "pending = r.discovered" will then
overwrite something other than "r", silently dropping any newly
discovered References, which will never be discovered by the GC again.
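
The lost-update scenario can be modelled in a few lines. This is a single-threaded sketch under stated assumptions, not the JDK or HotSpot code: Node, pending and the Runnable standing in for "an allocation that triggers a GC" are all hypothetical, and the GC is assumed to prepend newly discovered entries to the list head.

```java
// Hypothetical, single-threaded model of the race described above.
// Node, pending and the "GC" step are illustrative names, not JDK code.
final class PendingListRaceSketch {
    static final class Node {
        final String name;
        Node discovered;              // next link, like Reference.discovered
        Node(String name) { this.name = name; }
    }

    static Node pending;              // head of the pending list

    // Counts the nodes reachable from the given head.
    static int length(Node head) {
        int n = 0;
        for (Node x = head; x != null; x = x.discovered) n++;
        return n;
    }

    // Unlinks the head. 'allocate' stands in for an operation that triggers
    // a GC between reading the head and writing the new head back.
    static Node unlinkHead(Runnable allocate) {
        Node r = pending;
        allocate.run();               // the "GC" may prepend new nodes here
        pending = r.discovered;       // overwrites whatever the "GC" installed
        r.discovered = null;
        return r;
    }

    static int demo() {
        Node a = new Node("a");
        Node b = new Node("b");
        a.discovered = b;
        pending = a;                  // list: a -> b
        unlinkHead(() -> {
            Node c = new Node("c");   // the "GC" discovers c and prepends it
            c.discovered = pending;
            pending = c;              // list: c -> a -> b
        });
        return length(pending);       // c is no longer reachable from pending
    }

    public static void main(String[] args) {
        System.out.println("nodes left on pending list: " + demo());
    }
}
```

In this model the node added during the unlink is silently dropped, which is the failure mode the paragraph above worries about.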


On the other hand, if an OOME can never happen (i.e. no GC) here then
we're good and the comment is just incorrect. The instanceof check could be
moved out of the try/catch block again, like it was prior to this 
change, just to make it obvious that we will not be able to cause new 
allocations inside the critical section. Or at a minimum, the comment 
saying OOME can still happen should be adjusted.


Thoughts?

thanks,
Per

Btw, to the best of my knowledge, the pre-loading of Cleaner should
avoid any GC activity from instanceof, but I can't say that I am 100%
sure either.




Any specific reason to use Unsafe to do the preload rather than
Class.forName ? Does this force Unsafe to be loaded earlier than it
otherwise would?

Thanks,
David



all 10 java/lang/ref tests pass on my PC (including
OOMEInReferenceHandler).

I kindly ask Kalyan to try to re-run the OOMEInReferenceHandler test
with this code and report any failure.


Thanks, Peter

On 01/21/2014 08:57 AM, David Holmes wrote:

On 21/01/2014 4:54 PM, Peter Levart wrote:


On 01/21/2014 03:22 AM, David Holmes wrote:

Hi Peter,

I do not see Cleaner being loaded prior to the main class on either
Windows or Linux. Which platform are you on? Did you see it loaded
before the main class or as part of executing it?


Before. The main class is empty:

public class Test { public static void main(String... a) {} }

Here's last few lines of -verbose:class:

[Loaded java.util.TimeZone from
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]
[Loaded sun.util.calendar.ZoneInfo from
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]
[Loaded sun.util.calendar.ZoneInfoFile from
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]
[Loaded sun.util.calendar.ZoneInfoFile$1 from
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]
[Loaded java.io.DataInput from
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]
[Loaded java.io.DataInputStream from
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]
*[Loaded sun.misc.Cleaner from
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]*


Curious. I wonder what the controlling factor is ??



I'm on linux, 64bit and using official EA build 121 of JDK 8...

But if I try with JDK 7u45, I don't see it. So perhaps it would be good
to trigger Cleaner loading and initialization as part of
ReferenceHandler initialization to play things safe.



If we do that for Cleaner we may as well do it for
InterruptedException too.


Also, it is not that I think ReferenceHandler is responsible for
reporting OOME, but that it is responsible for reporting that it was
unable to perform a clean or enqueue because of OOME.


This would be necessary if we skipped a Reference because of OOME, but
if we just re-try until we eventually succeed, nothing is lost, nothing
to report (but a slow response)...


Agreed - just trying to clarify things.



Your suggested approach seems okay though I'm not sure why we
shouldn't help things along by calling System.gc() ourselves rather
than just yielding and hoping things will get cleaned up elsewhere.
But for the present purposes your approach will suffice I think.


Maybe my understanding is wrong but isn't the fact that OOME is raised a
consequence of the VM having already attempted to clear things up
(executing a GC round synchronously) but didn't succeed to make enough
free space to satisfy the allocation request? If this is only how some
collectors/allocators are implemented and not a general rule, then we

Re: Analysis on JDK-8022321 java/lang/ref/OOMEInReferenceHandler.java fails intermittently

2014-01-30 Thread Alan Bateman


On 29/01/2014 19:10, Mandy Chung wrote:


On 1/29/2014 5:09 AM, Peter Levart wrote:


Since I don't know what should be the correct behaviour of javac, I 
can leave the Reference.java changes as proposed since it compiles in 
both cases. Or should I revert the change to declaration of local 
variable 'q' ? 


I slightly prefer to revert the change to ReferenceQueue<? super Object>
for now as there is no supertype for Object and this looks a
little odd.  We can clean this up as a separate fix after we get
clarification from compiler-dev.

I see Peter has posted a question to compiler-dev on this and it can
always be re-visited once it is clear why it compiles when both Reference
and ReferenceQueue are in the same compilation unit.


-Alan

Re: Analysis on JDK-8022321 java/lang/ref/OOMEInReferenceHandler.java fails intermittently

2014-01-30 Thread Peter Levart


On 01/30/2014 03:46 PM, Alan Bateman wrote:

On 29/01/2014 19:10, Mandy Chung wrote:


On 1/29/2014 5:09 AM, Peter Levart wrote:


Since I don't know what should be the correct behaviour of javac, I 
can leave the Reference.java changes as proposed since it compiles 
in both cases. Or should I revert the change to declaration of local 
variable 'q' ? 


I slightly prefer to revert the change to ReferenceQueue<? super Object>
for now as there is no supertype for Object and this looks a
little odd.  We can clean this up as a separate fix after we get
clarification from compiler-dev.

I see Peter has posted a question to compiler-dev on this and it can
always be re-visited once it is clear why it compiles when both Reference
and ReferenceQueue are in the same compilation unit.


-Alan


I just committed the version with no change to the ReferenceQueue<Object>
line to jdk9/dev. If there is a bug in javac and the code would not
compile as is, the change to this line should be committed as part of
the javac fix, right?


Regards, Peter

Re: Analysis on JDK-8022321 java/lang/ref/OOMEInReferenceHandler.java fails intermittently

2014-01-30 Thread Alan Bateman


On 30/01/2014 14:51, Peter Levart wrote:


I just committed the version with no change to the ReferenceQueue<Object>
line to jdk9/dev. If there is a bug in javac and the code would not
compile as is, the change to this line should be committed as part of
the javac fix, right?


It's good to get this change in. If javac were to be changed to reject
this code then it would need to be changed at the same time (but I guess we
wait to see if this is the case as it's just not obvious yet).


-Alan

Re: Analysis on JDK-8022321 java/lang/ref/OOMEInReferenceHandler.java fails intermittently

2014-01-29 Thread Peter Levart


On 01/28/2014 04:46 PM, Alan Bateman wrote:

On 28/01/2014 08:44, Peter Levart wrote:


Yes, I tried that too and it results in even more unsafe casts.

It's odd yes, since the compile-time error is not present when 
building via OpenJDK build system make files (using make images in 
top directory for example) but only if I compile the class from 
command line (using javac directly) or from IDEA. I use JDK 8 ea-b121 
in all cases as a build JDK. Are there any special options passed to 
javac for compiling those classes in JDK build system that allow such 
code?


jdk/make/Setup.gmk has the -Xlint options that are used in the build,
but I suspect it is more that all the classes in java/lang/ref are
compiled together.


-Alan


That's right. If I add the source for ReferenceQueue.java into a 
directory where Reference.java resides and then compile with:


javac -d /tmp Reference.java

...then Reference as well as ReferenceQueue gets compiled and there's no
error. If Reference.java is the only source file in the directory, a
compile-time error is emitted. I checked the source of ReferenceQueue.java in JDK 8
ea-b121 (the JDK used for compiling) and it only differs in copyright
year from the source in jdk9-dev. So there seems to be an inconsistency in
javac's handling of types that are read from .class vs. .java files.


I'll try to create a reproducer example and post it to compiler-dev.

Since I don't know what should be the correct behaviour of javac, I can 
leave the Reference.java changes as proposed since it compiles in both 
cases. Or should I revert the change to declaration of local variable 'q' ?


Regards, Peter

Re: Analysis on JDK-8022321 java/lang/ref/OOMEInReferenceHandler.java fails intermittently

2014-01-29 Thread Mandy Chung



On 1/29/2014 5:09 AM, Peter Levart wrote:


Since I don't know what should be the correct behaviour of javac, I 
can leave the Reference.java changes as proposed since it compiles in 
both cases. Or should I revert the change to declaration of local 
variable 'q' ? 


I slightly prefer to revert the change to ReferenceQueue<? super Object>
for now as there is no supertype for Object and this looks a little
odd.  We can clean this up as a separate fix after we get clarification
from compiler-dev.


Mandy

Re: Analysis on JDK-8022321 java/lang/ref/OOMEInReferenceHandler.java fails intermittently

2014-01-28 Thread Peter Levart

On 01/28/2014 03:17 AM, David Holmes wrote:

On 27/01/2014 5:07 AM, Peter Levart wrote:

On 01/25/2014 05:35 AM, srikalyan chandrashekar wrote:

Hi Peter, if you are a committer would you like to take this further
(OR) perhaps david could sponsor this change.

Hi,

Here's a new webrev that takes into account Kalyan's and David's review
comments:

cr.openjdk.java.net/~plevart/jdk9-dev/OOMEInReferenceHandler/webrev.02/

I switched to using Class.forName() instead of Unsafe for class
preloading and initialization just to be on the safe side regarding
unwanted premature initialization of Unsafe class. I also took the
liberty of removing an unneeded semicolon (line 114) and fixing a JDK 8
compile time error in generics (line 189):

incompatible types: java.lang.ref.ReferenceQueue<capture#1 of ?
super java.lang.Object> cannot be converted to
java.lang.ref.ReferenceQueue<java.lang.Object>

Seems somewhat odd given there is no supertype for Object but it is
consistent with the field declaration:

ReferenceQueue<? super T> queue;

The generics here are a little odd as we don't really know the type of
T; we just play fast-and-loose by declaring:

Reference<Object> r;

Which only works because of erasure. I guess it wouldn't work to try
and use a simple wildcard '?' for both 'r' and 'q' as they would be
different captures to javac.
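
For readers less familiar with the wildcard involved, here is a small sketch of how a queue declared for Object can be viewed through a "? super" wildcard and shared by references of different referent types. It uses only public java.lang.ref API; the class name and the demo method are hypothetical and this is not the Reference.java code under review.

```java
import java.lang.ref.ReferenceQueue;
import java.lang.ref.WeakReference;

// Hypothetical sketch of the "? super T" queue typing discussed above.
final class WildcardQueueSketch {
    static boolean demo() {
        ReferenceQueue<Object> queue = new ReferenceQueue<>();

        // The constructor takes ReferenceQueue<? super T>, so a queue of
        // Object can serve references with any referent type.
        Object referent = new Object();
        String text = "some text";
        WeakReference<Object> ref = new WeakReference<>(referent, queue);
        WeakReference<String> other = new WeakReference<>(text, queue);

        // A wildcard-typed view of the same queue, like the local 'q'.
        ReferenceQueue<? super Object> q = queue;

        boolean enqueued = ref.enqueue();   // explicit enqueue, no GC needed
        Object polled = q.poll();           // what comes back is only known as a Reference
        return enqueued && polled == ref && other.get() == text;
    }

    public static void main(String[] args) {
        System.out.println("round trip through wildcard-typed queue: " + demo());
    }
}
```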

Yes, I tried that too and it results in even more unsafe casts.

It's odd yes, since the compile-time error is not present when building
via OpenJDK build system make files (using make images in top
directory for example) but only if I compile the class from command line
(using javac directly) or from IDEA. I use JDK 8 ea-b121 in all cases as
a build JDK. Are there any special options passed to javac for compiling
those classes in JDK build system that allow such code?

Regards, Peter

I re-ran the java/lang/ref tests and they pass.

Can I count you as a reviewer, Kalyan? If I get a go also from David,
I'll commit this to jdk9/dev...

I can be counted as the Reviewer. Kalyan can be listed as a reviewer.

Thanks Peter.

David
-

Regards, Peter

--
Thanks
kalyan
On 1/24/14 4:05 PM, Peter Levart wrote:

On 01/24/2014 02:53 AM, srikalyan chandrashekar wrote:

Hi David, yes thats right, only benefit i see is we can avoid
assignment to 'r' if pending is null.

Hi Kalyan,

Good to hear that test runs without failures so far.

Regarding assignment of 'r'. What I tried to accomplish with the
change was eliminate double reading of 'pending' field. I have a
mental model of local variable being a register and field being a
memory location. This may be important if the field is volatile, but
for normal fields, I guess the optimizer knows how to compile such
code most optimally in either case. The old (your) version is better
from logical perspective, since it guarantees that dereferencing the
'r', wherever it is possible, will never throw NPE (dereferencing
where 'r' is not assigned is not possible because of definitive
assignment rules). So I support going back to your version...
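
The two shapes being compared can be sketched side by side. This is only an illustration with hypothetical names (Node, push, the two poll methods), not the ReferenceHandler loop itself; it shows that both variants unlink the head the same way and that the second relies on definite assignment so 'r' is never read while unassigned.

```java
// Hypothetical sketch of the two variants discussed above.
final class PendingPollSketch {
    static final class Node { Node discovered; }

    private static final Object lock = new Object();
    private static Node pending;

    static void push(Node n) {
        synchronized (lock) {
            n.discovered = pending;
            pending = n;
        }
    }

    // Variant 1: read the field once into a local, then test the local.
    static Node pollSingleRead() {
        synchronized (lock) {
            Node r = pending;
            if (r != null) {
                pending = r.discovered;
                r.discovered = null;
            }
            return r;                 // may be null
        }
    }

    // Variant 2: test the field first and assign 'r' only when non-null.
    static Node pollCheckFirst() {
        Node r;
        synchronized (lock) {
            if (pending != null) {
                r = pending;
                pending = r.discovered;
                r.discovered = null;
            } else {
                return null;          // 'r' is never read on this path
            }
        }
        return r;                     // definitely assigned and non-null
    }

    static boolean demo() {
        Node a = new Node();
        Node b = new Node();
        push(a);
        push(b);                      // list: b -> a
        Node first = pollSingleRead();
        Node second = pollCheckFirst();
        return first == b && second == a
                && pollSingleRead() == null && pollCheckFirst() == null;
    }

    public static void main(String[] args) {
        System.out.println("both variants agree: " + demo());
    }
}
```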

Regards, Peter

--
Thanks
kalyan

On 1/23/14 4:33 PM, David Holmes wrote:

On 24/01/2014 6:10 AM, srikalyan wrote:

Hi Peter, i have modified your code from

r = pending;
if (r != null) {
..
TO
if (pending != null) {
r = pending;

This is because r is used later in the code and must not be assigned
pending unless it is not null (this was as is earlier).

If r is null, because pending is null then you perform the wait()
and then continue - back to the top of the loop. There is no bug in
Peter's code.

The new webrev is posted here:
http://cr.openjdk.java.net/~srikchan/Regression/JDK-8022321_OOMEInReferenceHandler-webrev-V2/

I ran 1000 runs with no failures so far; however, I would like to run a
couple more batches of 1000 runs to confirm the fix.

PS: The description section of JEP-122
(http://openjdk.java.net/jeps/122) says meta-data would be in native
memory (not heap).

The class_mirror is a Java object not meta-data.

David

--
Thanks
kalyan
Ph: (408)-585-8040

On 1/21/14, 2:31 PM, Peter Levart wrote:

On 01/21/2014 07:17 PM, srikalyan wrote:

Hi Peter/David, catching up after the long weekend. Why would there be an
OOME in the object heap due to class loading in perm gen space?

The perm gen is not a problem here (JDK 8 does not have it and we see
OOME on JDK 8 too). Each time a class is loaded, a new java.lang.Class
object is allocated on the heap.

Regards, Peter

Please correct me if I am missing something here. Meanwhile I will give
the version of Reference Handler you both agreed on a try.
--
Thanks
kalyan
Ph: (408)-585-8040

On 1/21/14, 7:24 AM, Peter Levart wrote:

On 01/21/2014 07:54 AM, Peter Levart wrote:

*[Loaded sun.misc.Cleaner from
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]*
[Loaded java.io.ByteArrayInputStream from
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]
[Loaded sun.util.calendar.ZoneInfoFile$ZoneOffsetTransitionRule
from

Re: Analysis on JDK-8022321 java/lang/ref/OOMEInReferenceHandler.java fails intermittently

2014-01-28 Thread Alan Bateman


On 28/01/2014 08:44, Peter Levart wrote:


Yes, I tried that too and it results in even more unsafe casts.

It's odd yes, since the compile-time error is not present when 
building via OpenJDK build system make files (using make images in 
top directory for example) but only if I compile the class from 
command line (using javac directly) or from IDEA. I use JDK 8 ea-b121 
in all cases as a build JDK. Are there any special options passed to 
javac for compiling those classes in JDK build system that allow such 
code?


jdk/make/Setup.gmk has the -Xlint options that are used in the build, but
I suspect it is more that all the classes in java/lang/ref are
compiled together.


-Alan

Re: Analysis on JDK-8022321 java/lang/ref/OOMEInReferenceHandler.java fails intermittently

2014-01-27 Thread David Holmes

On 27/01/2014 5:07 AM, Peter Levart wrote:

On 01/25/2014 05:35 AM, srikalyan chandrashekar wrote:

Hi Peter, if you are a committer would you like to take this further
(OR) perhaps david could sponsor this change.

Hi,

Here's a new webrev that takes into account Kalyan's and David's review
comments:

cr.openjdk.java.net/~plevart/jdk9-dev/OOMEInReferenceHandler/webrev.02/

incompatible types: java.lang.ref.ReferenceQueue<capture#1 of ?
super java.lang.Object> cannot be converted to
java.lang.ref.ReferenceQueue<java.lang.Object>

Seems somewhat odd given there is no supertype for Object but it is
consistent with the field declaration:

ReferenceQueue<? super T> queue;

The generics here are a little odd as we don't really know the type of T;
we just play fast-and-loose by declaring:

Reference<Object> r;

Which only works because of erasure. I guess it wouldn't work to try and
use a simple wildcard '?' for both 'r' and 'q' as they would be
different captures to javac.

I re-ran the java/lang/ref tests and they pass.

Can I count you as a reviewer, Kalyan? If I get a go also from David,
I'll commit this to jdk9/dev...

I can be counted as the Reviewer. Kalyan can be listed as a reviewer.

Thanks Peter.

David
-

Regards, Peter

--
Thanks
kalyan
On 1/24/14 4:05 PM, Peter Levart wrote:

On 01/24/2014 02:53 AM, srikalyan chandrashekar wrote:

Hi David, yes thats right, only benefit i see is we can avoid
assignment to 'r' if pending is null.

Hi Kalyan,

Good to hear that test runs without failures so far.

Regards, Peter

--
Thanks
kalyan

On 1/23/14 4:33 PM, David Holmes wrote:

On 24/01/2014 6:10 AM, srikalyan wrote:

Hi Peter, i have modified your code from

r = pending;
if (r != null) {
..
TO
if (pending != null) {
r = pending;

This is because r is used later in the code and must not be assigned
pending unless it is not null (this was as is earlier).

If r is null, because pending is null then you perform the wait()
and then continue - back to the top of the loop. There is no bug in
Peter's code.

The new webrev is posted here:
http://cr.openjdk.java.net/~srikchan/Regression/JDK-8022321_OOMEInReferenceHandler-webrev-V2/

I ran 1000 runs with no failures so far; however, I would like to run a
couple more batches of 1000 runs to confirm the fix.

PS: The description section of JEP-122
(http://openjdk.java.net/jeps/122) says meta-data would be in native
memory (not heap).

The class_mirror is a Java object not meta-data.

David

--
Thanks
kalyan
Ph: (408)-585-8040

On 1/21/14, 2:31 PM, Peter Levart wrote:

On 01/21/2014 07:17 PM, srikalyan wrote:

Hi Peter/David, catching up after the long weekend. Why would there be an
OOME in the object heap due to class loading in perm gen space?

The perm gen is not a problem here (JDK 8 does not have it and we see
OOME on JDK 8 too). Each time a class is loaded, a new java.lang.Class
object is allocated on the heap.

Regards, Peter

Please correct me if I am missing something here. Meanwhile I will give
the version of Reference Handler you both agreed on a try.
--
Thanks
kalyan
Ph: (408)-585-8040

On 1/21/14, 7:24 AM, Peter Levart wrote:

On 01/21/2014 07:54 AM, Peter Levart wrote:

I'm on linux, 64bit and using official EA build 121 of JDK 8...

But if I try with JDK 7u45, I don't see it.

So what changed between JDK 7 and JDK 8?

I suspect the following: 8007572: Replace existing jdk timezone data
at java.home/lib/zi with JSR310's tzdb

Regards, Peter

Re: Analysis on JDK-8022321 java/lang/ref/OOMEInReferenceHandler.java fails intermittently

2014-01-27 Thread Mandy Chung



On 1/26/2014 11:07 AM, Peter Levart wrote:


On 01/25/2014 05:35 AM, srikalyan chandrashekar wrote:
Hi Peter, if you are a committer would you like to take this further 
(OR) perhaps david could sponsor this change.


Hi,

Here's a new webrev that takes into account Kalyan's and David's review
comments:


cr.openjdk.java.net/~plevart/jdk9-dev/OOMEInReferenceHandler/webrev.02/



This looks good to me.  Sorry I have been behind in following the 
discussion of this thread.  It's good to see this problem be diagnosed 
and fixed (thank you all).


I also prefer using Class.forName to do the preloading and initialization.
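
A minimal sketch of that preloading approach, for illustration only: the three-argument Class.forName with initialize set to true loads and initializes a class eagerly, so a later use does not have to. Helper stands in for a class such as Cleaner; the class names, the flag and the demo method are hypothetical and not taken from the webrev.

```java
// Hypothetical sketch of eager class loading and initialization.
final class PreloadSketch {
    static boolean helperInitialized;

    // Stand-in for a class that must be ready before it is first used.
    static final class Helper {
        static { helperInitialized = true; }
    }

    // Loads and initializes the given class up front. Passing a class
    // literal does not by itself run the static initializer; forName
    // with initialize=true does.
    static void ensureClassInitialized(Class<?> clazz) {
        try {
            Class.forName(clazz.getName(), true, clazz.getClassLoader());
        } catch (ClassNotFoundException e) {
            throw (Error) new NoClassDefFoundError(e.getMessage()).initCause(e);
        }
    }

    static boolean demo() {
        ensureClassInitialized(Helper.class);
        return helperInitialized;
    }

    public static void main(String[] args) {
        System.out.println("Helper initialized eagerly: " + demo());
    }
}
```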

Mandy

Re: Analysis on JDK-8022321 java/lang/ref/OOMEInReferenceHandler.java fails intermittently

2014-01-26 Thread Peter Levart

On 01/25/2014 05:35 AM, srikalyan chandrashekar wrote:
Hi Peter, if you are a committer would you like to take this further
(OR) perhaps david could sponsor this change.

Hi,

Here's a new webrev that takes into account Kalyan's and David's review
comments:

cr.openjdk.java.net/~plevart/jdk9-dev/OOMEInReferenceHandler/webrev.02/

incompatible types: java.lang.ref.ReferenceQueue<capture#1 of ?
super java.lang.Object> cannot be converted to
java.lang.ref.ReferenceQueue<java.lang.Object>

I re-ran the java/lang/ref tests and they pass.

Can I count you as a reviewer, Kalyan? If I get a go also from David,
I'll commit this to jdk9/dev...

Regards, Peter

--
Thanks
kalyan
On 1/24/14 4:05 PM, Peter Levart wrote:

On 01/24/2014 02:53 AM, srikalyan chandrashekar wrote:
Hi David, yes thats right, only benefit i see is we can avoid
assignment to 'r' if pending is null.

Hi Kalyan,

Good to hear that test runs without failures so far.

Regards, Peter

--
Thanks
kalyan

On 1/23/14 4:33 PM, David Holmes wrote:

On 24/01/2014 6:10 AM, srikalyan wrote:

Hi Peter, i have modified your code from

r = pending;
if (r != null) {
..
TO
if (pending != null) {
r = pending;

This is because r is used later in the code and must not be assigned
pending unless it is not null (this was as is earlier).

If r is null, because pending is null then you perform the wait()
and then continue - back to the top of the loop. There is no bug in
Peter's code.

The new webrev is

posted here
http://cr.openjdk.java.net/~srikchan/Regression/JDK-8022321_OOMEInReferenceHandler-webrev-V2/

. I ran a 1000 run and no failures so far, however i would like to
run a

couple more 1000 runs to assert the fix.

PS: The description section of JEP-122
(http://openjdk.java.net/jeps/122) says meta-data would be in native
memory(not heap).

The class_mirror is a Java object not meta-data.

David

--
Thanks
kalyan
Ph: (408)-585-8040

On 1/21/14, 2:31 PM, Peter Levart wrote:

On 01/21/2014 07:17 PM, srikalyan wrote:
Hi Peter/David, catching up after long weekend. Why would there be an
OOME in object heap due to class loading in perm gen space ?

The perm gen is not a problem here (JDK 8 does not have it and we see
OOME on JDK8 too). Each time a class is loaded, a new java.lang.Class
object is allocated on the heap.

Regards, Peter

Please correct if i am missing something here. Meanwhile i will give
the version of Reference Handler you both agreed on a try.
--
Thanks
kalyan
Ph: (408)-585-8040

On 1/21/14, 7:24 AM, Peter Levart wrote:

On 01/21/2014 07:54 AM, Peter Levart wrote:

I'm on linux, 64bit and using official EA build 121 of JDK 8...

But if I try with JDK 7u45, I don't see it.

So what changed between JDK 7 and JDK 8?

I suspect the following: 8007572: Replace existing jdk timezone data
at java.home/lib/zi with JSR310's tzdb

Regards, Peter

Re: Analysis on JDK-8022321 java/lang/ref/OOMEInReferenceHandler.java fails intermittently

2014-01-26 Thread srikalyan chandrashekar

On 1/26/14 11:07 AM, Peter Levart wrote:

On 01/25/2014 05:35 AM, srikalyan chandrashekar wrote:
Hi Peter, if you are a committer would you like to take this further
(OR) perhaps david could sponsor this change.

Hi,

Here's a new webrev that takes into account Kalyan's and David's review
comments:

cr.openjdk.java.net/~plevart/jdk9-dev/OOMEInReferenceHandler/webrev.02/

incompatible types: java.lang.ref.ReferenceQueue<capture#1 of ?
super java.lang.Object> cannot be converted to
java.lang.ref.ReferenceQueue<java.lang.Object>

I re-ran the java/lang/ref tests and they pass.

Can I count you as a reviewer, Kalyan? If I get a go also from
David, I'll commit this to jdk9/dev...
Hi Peter, I do not have review rights. So it has to be someone else from
core-libs-dev.

Regards, Peter

--
Thanks
kalyan

--
Thanks
kalyan
On 1/24/14 4:05 PM, Peter Levart wrote:

On 01/24/2014 02:53 AM, srikalyan chandrashekar wrote:
Hi David, yes thats right, only benefit i see is we can avoid
assignment to 'r' if pending is null.

Hi Kalyan,

Good to hear that test runs without failures so far.

Regards, Peter

--
Thanks
kalyan

On 1/23/14 4:33 PM, David Holmes wrote:

On 24/01/2014 6:10 AM, srikalyan wrote:

Hi Peter, i have modified your code from

r = pending;
if (r != null) {
..
TO
if (pending != null) {
r = pending;

This is because the r is used later in the code and must not be assigned
pending unless it is not null(this was as is earlier).

If r is null, because pending is null then you perform the wait()
and then continue - back to the top of the loop. There is no bug
in Peter's code.

The new webrev is posted here:
http://cr.openjdk.java.net/~srikchan/Regression/JDK-8022321_OOMEInReferenceHandler-webrev-V2/

I ran a 1000 run and no failures so far, however i would like to run a
couple more 1000 runs to assert the fix.

PS: The description section of JEP-122
(http://openjdk.java.net/jeps/122) says meta-data would be in native
memory(not heap).

The class_mirror is a Java object not meta-data.

David

--
Thanks
kalyan
Ph: (408)-585-8040

On 1/21/14, 2:31 PM, Peter Levart wrote:

On 01/21/2014 07:17 PM, srikalyan wrote:
Hi Peter/David, catching up after long weekend. Why would there be an
OOME in object heap due to class loading in perm gen space ?

The perm gen is not a problem here (JDK 8 does not have it and we see
OOME on JDK8 too). Each time a class is loaded, a new java.lang.Class
object is allocated on the heap.

Regards, Peter

Please correct if i am missing something here. Meanwhile i will give
the version of Reference Handler you both agreed on a try.
--
Thanks
kalyan
Ph: (408)-585-8040

On 1/21/14, 7:24 AM, Peter Levart wrote:

On 01/21/2014 07:54 AM, Peter Levart wrote:

I'm on linux, 64bit and using official EA build 121 of JDK 8...

But if I try with JDK 7u45, I don't see it.

So what changed between JDK 7 and JDK 8?

I suspect the following: 8007572: Replace existing jdk timezone data
at java.home/lib/zi with JSR310's tzdb

Regards, Peter

Re: Analysis on JDK-8022321 java/lang/ref/OOMEInReferenceHandler.java fails intermittently

2014-01-24 Thread Peter Levart

On 01/24/2014 02:53 AM, srikalyan chandrashekar wrote:
Hi David, yes thats right, only benefit i see is we can avoid
assignment to 'r' if pending is null.

Hi Kalyan,

Good to hear that test runs without failures so far.

Regarding assignment of 'r'. What I tried to accomplish with the change
was to eliminate the double read of the 'pending' field. I have a mental
model of a local variable being a register and a field being a memory
location. This may be important if the field is volatile, but for normal
fields, I guess the optimizer knows how to compile such code optimally in
either case. The old (your) version is better from a logical perspective,
since it guarantees that dereferencing 'r', wherever that is possible,
will never throw NPE (dereferencing 'r' where it is not assigned is not
possible because of the definite assignment rules). So I support going
back to your version...
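
To make the two shapes concrete, here is a small stand-alone sketch. PendingDemo and its members are hypothetical stand-ins for the real Reference internals and are not code from either webrev.

```java
// Hypothetical stand-in for the 'pending' hand-off; not the JDK code.
final class PendingDemo {
    private static final Object lock = new Object();
    private static Object pending;

    static void offer(Object o) {
        synchronized (lock) { pending = o; }
    }

    // Variant 1: read the field once into a local, then test the local.
    // 'r' is always assigned, but it may be null after the test fails.
    static Object takeSingleRead() {
        synchronized (lock) {
            Object r = pending;
            if (r != null) {
                pending = null;
            }
            return r;
        }
    }

    // Variant 2: test the field, and assign the local only when non-null.
    // Definite assignment rules make 'r' unusable outside the if-block,
    // so it can never be dereferenced while null.
    static Object takeGuardedAssign() {
        synchronized (lock) {
            if (pending != null) {
                Object r = pending;
                pending = null;
                return r;
            }
            return null;
        }
    }

    public static void main(String[] args) {
        offer("a");
        System.out.println(takeSingleRead());
        offer("b");
        System.out.println(takeGuardedAssign());
    }
}
```

Both variants return the same values; the difference is only in how many times the field is read and where 'r' is in scope.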

Regards, Peter

--
Thanks
kalyan

On 1/23/14 4:33 PM, David Holmes wrote:

On 24/01/2014 6:10 AM, srikalyan wrote:

Hi Peter, i have modified your code from

r = pending;
if (r != null) {
..
TO
if (pending != null) {
r = pending;

This is because the r is used later in the code and must not be assigned
pending unless it is not null(this was as is earlier).

If r is null, because pending is null then you perform the wait() and
then continue - back to the top of the loop. There is no bug in
Peter's code.

The new webrev is posted here:
http://cr.openjdk.java.net/~srikchan/Regression/JDK-8022321_OOMEInReferenceHandler-webrev-V2/

I ran a 1000 run and no failures so far, however i would like to run a
couple more 1000 runs to assert the fix.

PS: The description section of JEP-122
(http://openjdk.java.net/jeps/122) says meta-data would be in native
memory(not heap).

The class_mirror is a Java object not meta-data.

David

--
Thanks
kalyan
Ph: (408)-585-8040

On 1/21/14, 2:31 PM, Peter Levart wrote:

On 01/21/2014 07:17 PM, srikalyan wrote:

Hi Peter/David, catching up after long weekend. Why would there be an
OOME in object heap due to class loading in perm gen space ?

The perm gen is not a problem here (JDK 8 does not have it and we see
OOME on JDK8 too). Each time a class is loaded, a new java.lang.Class
object is allocated on the heap.

Regards, Peter

Please correct if i am missing something here. Meanwhile i will give
the version of Reference Handler you both agreed on a try.
--
Thanks
kalyan
Ph: (408)-585-8040

On 1/21/14, 7:24 AM, Peter Levart wrote:

On 01/21/2014 07:54 AM, Peter Levart wrote:

I'm on linux, 64bit and using official EA build 121 of JDK 8...

But if I try with JDK 7u45, I don't see it.

So what changed between JDK 7 and JDK 8?

I suspect the following: 8007572: Replace existing jdk timezone data
at java.home/lib/zi with JSR310's tzdb

Regards, Peter

Re: Analysis on JDK-8022321 java/lang/ref/OOMEInReferenceHandler.java fails intermittently

2014-01-24 Thread Peter Levart



On 01/22/2014 03:19 AM, David Holmes wrote:

Hi Peter,

On 22/01/2014 12:00 AM, Peter Levart wrote:

Hi, David, Kalyan,

Summing up the discussion, I propose the following patch for
ReferenceHandler:

http://cr.openjdk.java.net/~plevart/jdk9-dev/OOMEInReferenceHandler/webrev.01/ 



I can live with it, though it may be that once Cleaner has been
preloaded instanceof can no longer throw OOME. Can't be 100% sure. And
there's some duplication/verbosity in the commentary that could be
trimmed down :)


Any specific reason to use Unsafe to do the preload rather than 
Class.forName ? Does this force Unsafe to be loaded earlier than it 
otherwise would?


Good question. In systemDictionary.hpp they are both on the preloaded 
list in this order:


  do_klass(Reference_klass, java_lang_ref_Reference, Pre ) \

...
  do_klass(misc_Unsafe_klass, sun_misc_Unsafe, Pre ) \



So when Reference is initialized, the Unsafe is already loaded. But I 
don't know if it is already initialized. This should be studied.


I'll try to find out what is the case and get back to you.

Regards, Peter




Thanks,
David



all 10 java/lang/ref tests pass on my PC (including
OOMEInReferenceHandler).

I kindly ask Kalyan to try to re-run the OOMEInReferenceHandler test
with this code and report any failure.


Thanks, Peter

On 01/21/2014 08:57 AM, David Holmes wrote:

On 21/01/2014 4:54 PM, Peter Levart wrote:


On 01/21/2014 03:22 AM, David Holmes wrote:

Hi Peter,

I do not see Cleaner being loaded prior to the main class on either
Windows or Linux. Which platform are you on? Did you see it loaded
before the main class or as part of executing it?


Before. The main class is empty:

public class Test { public static void main(String... a) {} }

Here's last few lines of -verbose:class:

[Loaded java.util.TimeZone from
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]
[Loaded sun.util.calendar.ZoneInfo from
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]
[Loaded sun.util.calendar.ZoneInfoFile from
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]
[Loaded sun.util.calendar.ZoneInfoFile$1 from
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]
[Loaded java.io.DataInput from
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]
[Loaded java.io.DataInputStream from
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]
*[Loaded sun.misc.Cleaner from
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]*


Curious. I wonder what the controlling factor is ??



I'm on linux, 64bit and using official EA build 121 of JDK 8...

But if I try with JDK 7u45, I don't see it. So perhaps it would be good
to trigger Cleaner loading and initialization as part of
ReferenceHandler initialization to play things safe.



If we do that for Cleaner we may as well do it for
InterruptedException too.


Also, it is not that I think ReferenceHandler is responsible for
reporting OOME, but that it is responsible for reporting that it was
unable to perform a clean or enqueue because of OOME.


This would be necessary if we skipped a Reference because of OOME, but
if we just re-try until we eventually succeed, nothing is lost, nothing
to report (but a slow response)...


Agreed - just trying to clarify things.



Your suggested approach seems okay though I'm not sure why we
shouldn't help things along by calling System.gc() ourselves rather
than just yielding and hoping things will get cleaned up elsewhere.
But for the present purposes your approach will suffice I think.


Maybe my understanding is wrong, but isn't the fact that OOME is raised a
consequence of the VM having already attempted to clear things up
(executing a GC round synchronously) but failing to make enough
free space to satisfy the allocation request? If this is only how some
collectors/allocators are implemented and not a general rule, then we
should put a System.gc() in place of Thread.yield(). Should we also
combine that with Thread.yield()? I'm concerned about the possibility
that we spin, consuming too much CPU (the ReferenceHandler thread has MAX
priority), so that other threads don't get enough CPU time to proceed and
clean things up (we hope other threads will also get OOME and release
things as their stacks unwind...).


You are probably right about the System.gc() - OOME should be thrown
after GC fails to create space, so it really needs some other thread
to drop live references to allow further space to be reclaimed.

But note that Thread.yield() can behave badly on some linux systems
too, so spinning is still a possibility - but either way this would
only be really bad on a uniprocessor system where yield() is
unlikely to misbehave.

David
-



Regards, Peter



Thanks,
David

On 20/01/2014 6:42 PM, Peter Levart wrote:

On 01/20/2014 09:00 AM, Peter Levart wrote:

On 01/20/2014 02:51 AM, David Holmes wrote:

Hi Peter,

On 17/01/2014 11:24 PM, Peter Levart wrote:

On 01/17/2014 02:13 PM, Peter Levart

Re: Analysis on JDK-8022321 java/lang/ref/OOMEInReferenceHandler.java fails intermittently

2014-01-24 Thread srikalyan chandrashekar

Hi Peter, if you are a committer would you like to take this further
(OR) perhaps david could sponsor this change.

--
Thanks
kalyan

On 1/24/14 4:05 PM, Peter Levart wrote:

On 01/24/2014 02:53 AM, srikalyan chandrashekar wrote:
Hi David, yes thats right, only benefit i see is we can avoid
assignment to 'r' if pending is null.

Hi Kalyan,

Good to hear that test runs without failures so far.

Regards, Peter

--
Thanks
kalyan

On 1/23/14 4:33 PM, David Holmes wrote:

On 24/01/2014 6:10 AM, srikalyan wrote:

Hi Peter, i have modified your code from

r = pending;
if (r != null) {
..
TO
if (pending != null) {
r = pending;

This is because the r is used later in the code and must not be assigned
pending unless it is not null(this was as is earlier).

If r is null, because pending is null then you perform the wait()
and then continue - back to the top of the loop. There is no bug in
Peter's code.

The new webrev is posted here:
http://cr.openjdk.java.net/~srikchan/Regression/JDK-8022321_OOMEInReferenceHandler-webrev-V2/

I ran a 1000 run and no failures so far, however i would like to run a
couple more 1000 runs to assert the fix.

PS: The description section of JEP-122
(http://openjdk.java.net/jeps/122) says meta-data would be in native
memory(not heap).

The class_mirror is a Java object not meta-data.

David

--
Thanks
kalyan
Ph: (408)-585-8040

On 1/21/14, 2:31 PM, Peter Levart wrote:

On 01/21/2014 07:17 PM, srikalyan wrote:
Hi Peter/David, catching up after long weekend. Why would there be an
OOME in object heap due to class loading in perm gen space ?

The perm gen is not a problem here (JDK 8 does not have it and we see
OOME on JDK8 too). Each time a class is loaded, a new java.lang.Class
object is allocated on the heap.

Regards, Peter

Please correct if i am missing something here. Meanwhile i will give
the version of Reference Handler you both agreed on a try.
--
Thanks
kalyan
Ph: (408)-585-8040

On 1/21/14, 7:24 AM, Peter Levart wrote:

On 01/21/2014 07:54 AM, Peter Levart wrote:

I'm on linux, 64bit and using official EA build 121 of JDK 8...

But if I try with JDK 7u45, I don't see it.

So what changed between JDK 7 and JDK 8?

I suspect the following: 8007572: Replace existing jdk timezone data
at java.home/lib/zi with JSR310's tzdb

Regards, Peter

Re: Analysis on JDK-8022321 java/lang/ref/OOMEInReferenceHandler.java fails intermittently

2014-01-23 Thread srikalyan


Hi Peter, i have modified your code from

r = pending;
if (r != null) {
 ..


  TO


if (pending != null) {
 r = pending;

This is because the r is used later in the code and must not be assigned 
pending unless it is not null(this was as is earlier). The new webrev is 
posted here 
http://cr.openjdk.java.net/~srikchan/Regression/JDK-8022321_OOMEInReferenceHandler-webrev-V2/ 
. I ran a 1000 run and no failures so far, however i would like to run a 
couple more 1000 runs to assert the fix.


PS: The description section of JEP-122 
(http://openjdk.java.net/jeps/122) says meta-data would be in native 
memory(not heap).


--
Thanks
kalyan
Ph: (408)-585-8040


On 1/21/14, 2:31 PM, Peter Levart wrote:


On 01/21/2014 07:17 PM, srikalyan wrote:
Hi Peter/David, catching up after long weekend. Why would there be an 
OOME in object heap due to class loading in perm gen space ?


The perm gen is not a problem here (JDK 8 does not have it and we see
OOME on JDK8 too). Each time a class is loaded, a new java.lang.Class
object is allocated on the heap.


Regards, Peter

Please correct if i am missing something here. Meanwhile i will give 
the version of Reference Handler you both agreed on a try.

--
Thanks
kalyan
Ph: (408)-585-8040

On 1/21/14, 7:24 AM, Peter Levart wrote:

On 01/21/2014 07:54 AM, Peter Levart wrote:
*[Loaded sun.misc.Cleaner from 
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]*
[Loaded java.io.ByteArrayInputStream from 
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]
[Loaded sun.util.calendar.ZoneInfoFile$ZoneOffsetTransitionRule 
from /home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]

...


I'm on linux, 64bit and using official EA build 121 of JDK 8...

But if I try with JDK 7u45, I don't see it.


So what changed between JDK 7 and JDK 8?

I suspect the following: 8007572: Replace existing jdk timezone data 
at java.home/lib/zi with JSR310's tzdb



Regards, Peter

Re: Analysis on JDK-8022321 java/lang/ref/OOMEInReferenceHandler.java fails intermittently

2014-01-23 Thread srikalyan


Hi Peter/David, we have 2000 runs without a single failure.

--
Thanks
kalyan
Ph: (408)-585-8040


On 1/23/14, 12:10 PM, srikalyan wrote:

Hi Peter, i have modified your code from
r = pending;
if (r != null) {
  ..


   TO


if (pending != null) {
  r = pending;
This is because the r is used later in the code and must not be 
assigned pending unless it is not null(this was as is earlier). The 
new webrev is posted here 
http://cr.openjdk.java.net/~srikchan/Regression/JDK-8022321_OOMEInReferenceHandler-webrev-V2/ 
. I ran a 1000 run and no failures so far, however i would like to run 
a couple more 1000 runs to assert the fix.


PS: The description section of JEP-122 
(http://openjdk.java.net/jeps/122) says meta-data would be in native 
memory(not heap).

--
Thanks
kalyan
Ph: (408)-585-8040

On 1/21/14, 2:31 PM, Peter Levart wrote:


On 01/21/2014 07:17 PM, srikalyan wrote:
Hi Peter/David, catching up after long weekend. Why would there be 
an OOME in object heap due to class loading in perm gen space ?


The perm gen is not a problem here (JDK 8 does not have it and we see
OOME on JDK8 too). Each time a class is loaded, a new java.lang.Class
object is allocated on the heap.


Regards, Peter

Please correct if i am missing something here. Meanwhile i will give 
the version of Reference Handler you both agreed on a try.

--
Thanks
kalyan
Ph: (408)-585-8040

On 1/21/14, 7:24 AM, Peter Levart wrote:

On 01/21/2014 07:54 AM, Peter Levart wrote:
*[Loaded sun.misc.Cleaner from 
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]*
[Loaded java.io.ByteArrayInputStream from 
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]
[Loaded sun.util.calendar.ZoneInfoFile$ZoneOffsetTransitionRule 
from /home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]

...


I'm on linux, 64bit and using official EA build 121 of JDK 8...

But if I try with JDK 7u45, I don't see it.


So what changed between JDK 7 and JDK 8?

I suspect the following: 8007572: Replace existing jdk timezone 
data at java.home/lib/zi with JSR310's tzdb



Regards, Peter

Re: Analysis on JDK-8022321 java/lang/ref/OOMEInReferenceHandler.java fails intermittently

2014-01-23 Thread David Holmes


On 24/01/2014 6:10 AM, srikalyan wrote:

Hi Peter, i have modified your code from

r = pending;
if (r != null) {
  ..
   TO
if (pending != null) {
  r = pending;

This is because the r is used later in the code and must not be assigned
pending unless it is not null(this was as is earlier).


If r is null, because pending is null then you perform the wait() and 
then continue - back to the top of the loop. There is no bug in Peter's 
code.
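
The control flow described here (read, wait if null, go back to the top) can be sketched as follows. HandlerLoop is a hypothetical, simplified stand-in written for this illustration; it is not the ReferenceHandler code from either webrev.

```java
// Hypothetical, simplified model of a "read, else wait and retry" loop.
final class HandlerLoop {
    private final Object lock = new Object();
    private Object pending;

    void publish(Object o) {
        synchronized (lock) {
            pending = o;
            lock.notifyAll();
        }
    }

    // One pass of the loop body: returns the item if present, otherwise
    // waits up to waitMillis (must be > 0 here) and returns null.
    Object poll(long waitMillis) {
        synchronized (lock) {
            Object r = pending;
            if (r != null) {
                pending = null;
                return r;
            }
            try {
                lock.wait(waitMillis);
            } catch (InterruptedException ignored) {
                // simplification: treat interruption like an ordinary wake-up
            }
            return null;
        }
    }

    // A null result is never dereferenced; it only means "go round again".
    Object take() {
        for (;;) {
            Object r = poll(10);
            if (r != null) {
                return r;
            }
        }
    }

    public static void main(String[] args) {
        HandlerLoop h = new HandlerLoop();
        h.publish("ref-1");
        System.out.println(h.take());
    }
}
```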


The new webrev is

posted here
http://cr.openjdk.java.net/~srikchan/Regression/JDK-8022321_OOMEInReferenceHandler-webrev-V2/
. I ran a 1000 run and no failures so far, however i would like to run a
couple more 1000 runs to assert the fix.

PS: The description section of JEP-122
(http://openjdk.java.net/jeps/122) says meta-data would be in native
memory(not heap).


The class_mirror is a Java object not meta-data.

David


--
Thanks
kalyan
Ph: (408)-585-8040


On 1/21/14, 2:31 PM, Peter Levart wrote:


On 01/21/2014 07:17 PM, srikalyan wrote:

Hi Peter/David, catching up after long weekend. Why would there be an
OOME in object heap due to class loading in perm gen space ?


The perm gen is not a problem here (JDK 8 does not have it and we see
OOME on JDK8 too). Each time a class is loaded, a new java.lang.Class
object is allocated on the heap.

Regards, Peter


Please correct if i am missing something here. Meanwhile i will give
the version of Reference Handler you both agreed on a try.
--
Thanks
kalyan
Ph: (408)-585-8040

On 1/21/14, 7:24 AM, Peter Levart wrote:

On 01/21/2014 07:54 AM, Peter Levart wrote:

*[Loaded sun.misc.Cleaner from
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]*
[Loaded java.io.ByteArrayInputStream from
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]
[Loaded sun.util.calendar.ZoneInfoFile$ZoneOffsetTransitionRule
from /home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]
...


I'm on linux, 64bit and using official EA build 121 of JDK 8...

But if I try with JDK 7u45, I don't see it.


So what changed between JDK 7 and JDK 8?

I suspect the following: 8007572: Replace existing jdk timezone data
at java.home/lib/zi with JSR310's tzdb


Regards, Peter

Re: Analysis on JDK-8022321 java/lang/ref/OOMEInReferenceHandler.java fails intermittently

2014-01-23 Thread srikalyan chandrashekar

Hi David, yes thats right, only benefit i see is we can avoid assignment 
to 'r' if pending is null.


--
Thanks
kalyan

On 1/23/14 4:33 PM, David Holmes wrote:

On 24/01/2014 6:10 AM, srikalyan wrote:

Hi Peter, i have modified your code from

r = pending;
if (r != null) {
  ..
   TO
if (pending != null) {
  r = pending;

This is because the r is used later in the code and must not be assigned
pending unless it is not null(this was as is earlier).


If r is null, because pending is null then you perform the wait() and 
then continue - back to the top of the loop. There is no bug in 
Peter's code.


The new webrev is

posted here
http://cr.openjdk.java.net/~srikchan/Regression/JDK-8022321_OOMEInReferenceHandler-webrev-V2/ 


. I ran a 1000 run and no failures so far, however i would like to run a
couple more 1000 runs to assert the fix.

PS: The description section of JEP-122
(http://openjdk.java.net/jeps/122) says meta-data would be in native
memory(not heap).


The class_mirror is a Java object not meta-data.

David


--
Thanks
kalyan
Ph: (408)-585-8040


On 1/21/14, 2:31 PM, Peter Levart wrote:


On 01/21/2014 07:17 PM, srikalyan wrote:

Hi Peter/David, catching up after long weekend. Why would there be an
OOME in object heap due to class loading in perm gen space ?


The perm gen is not a problem here (JDK 8 does not have it and we see
OOME on JDK8 too). Each time a class is loaded, a new java.lang.Class
object is allocated on the heap.

Regards, Peter


Please correct if i am missing something here. Meanwhile i will give
the version of Reference Handler you both agreed on a try.
--
Thanks
kalyan
Ph: (408)-585-8040

On 1/21/14, 7:24 AM, Peter Levart wrote:

On 01/21/2014 07:54 AM, Peter Levart wrote:

*[Loaded sun.misc.Cleaner from
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]*
[Loaded java.io.ByteArrayInputStream from
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]
[Loaded sun.util.calendar.ZoneInfoFile$ZoneOffsetTransitionRule
from /home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]
...


I'm on linux, 64bit and using official EA build 121 of JDK 8...

But if I try with JDK 7u45, I don't see it.


So what changed between JDK 7 and JDK 8?

I suspect the following: 8007572: Replace existing jdk timezone data
at java.home/lib/zi with JSR310's tzdb


Regards, Peter

Re: Analysis on JDK-8022321 java/lang/ref/OOMEInReferenceHandler.java fails intermittently

2014-01-23 Thread David Holmes


On 24/01/2014 11:53 AM, srikalyan chandrashekar wrote:

Hi David, yes thats right, only benefit i see is we can avoid assignment
to 'r' if pending is null.


I'm okay with either version.

David


--
Thanks
kalyan

On 1/23/14 4:33 PM, David Holmes wrote:

On 24/01/2014 6:10 AM, srikalyan wrote:

Hi Peter, i have modified your code from

r = pending;
if (r != null) {
  ..
   TO
if (pending != null) {
  r = pending;

This is because the r is used later in the code and must not be assigned
pending unless it is not null(this was as is earlier).


If r is null, because pending is null then you perform the wait() and
then continue - back to the top of the loop. There is no bug in
Peter's code.

The new webrev is

posted here
http://cr.openjdk.java.net/~srikchan/Regression/JDK-8022321_OOMEInReferenceHandler-webrev-V2/

. I ran a 1000 run and no failures so far, however i would like to run a
couple more 1000 runs to assert the fix.

PS: The description section of JEP-122
(http://openjdk.java.net/jeps/122) says meta-data would be in native
memory(not heap).


The class_mirror is a Java object not meta-data.

David


--
Thanks
kalyan
Ph: (408)-585-8040


On 1/21/14, 2:31 PM, Peter Levart wrote:


On 01/21/2014 07:17 PM, srikalyan wrote:

Hi Peter/David, catching up after long weekend. Why would there be an
OOME in object heap due to class loading in perm gen space ?


The perm gen is not a problem here (JDK 8 does not have it and we see
OOME on JDK8 too). Each time a class is loaded, a new java.lang.Class
object is allocated on the heap.

Regards, Peter


Please correct if i am missing something here. Meanwhile i will give
the version of Reference Handler you both agreed on a try.
--
Thanks
kalyan
Ph: (408)-585-8040

On 1/21/14, 7:24 AM, Peter Levart wrote:

On 01/21/2014 07:54 AM, Peter Levart wrote:

*[Loaded sun.misc.Cleaner from
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]*
[Loaded java.io.ByteArrayInputStream from
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]
[Loaded sun.util.calendar.ZoneInfoFile$ZoneOffsetTransitionRule
from /home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]
...


I'm on linux, 64bit and using official EA build 121 of JDK 8...

But if I try with JDK 7u45, I don't see it.


So what changed between JDK 7 and JDK 8?

I suspect the following: 8007572: Replace existing jdk timezone data
at java.home/lib/zi with JSR310's tzdb


Regards, Peter

Re: Analysis on JDK-8022321 java/lang/ref/OOMEInReferenceHandler.java fails intermittently


On 21/01/2014 4:54 PM, Peter Levart wrote:


On 01/21/2014 03:22 AM, David Holmes wrote:

Hi Peter,

I do not see Cleaner being loaded prior to the main class on either
Windows or Linux. Which platform are you on? Did you see it loaded
before the main class or as part of executing it?


Before. The main class is empty:

public class Test { public static void main(String... a) {} }

Here's last few lines of -verbose:class:

[Loaded java.util.TimeZone from
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]
[Loaded sun.util.calendar.ZoneInfo from
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]
[Loaded sun.util.calendar.ZoneInfoFile from
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]
[Loaded sun.util.calendar.ZoneInfoFile$1 from
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]
[Loaded java.io.DataInput from
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]
[Loaded java.io.DataInputStream from
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]
*[Loaded sun.misc.Cleaner from
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]*


Curious. I wonder what the controlling factor is ??



I'm on linux, 64bit and using official EA build 121 of JDK 8...

But if I try with JDK 7u45, I don't see it. So perhaps it would be good
to trigger Cleaner loading and initialization as part of
ReferenceHandler initialization to play things safe.



If we do that for Cleaner we may as well do it for InterruptedException too.


Also, it is not that I think ReferenceHandler is responsible for
reporting OOME, but that it is responsible for reporting that it was
unable to perform a clean or enqueue because of OOME.


This would be necessary if we skipped a Reference because of OOME, but
if we just re-try until we eventually succeed, nothing is lost, nothing
to report (but a slow response)...


Agreed - just trying to clarify things.



Your suggested approach seems okay though I'm not sure why we
shouldn't help things along by calling System.gc() ourselves rather
than just yielding and hoping things will get cleaned up elsewhere.
But for the present purposes your approach will suffice I think.


Maybe my understanding is wrong, but isn't the fact that OOME is raised a
consequence of the VM having already attempted to clear things up
(executing a GC round synchronously) but failing to make enough
free space to satisfy the allocation request? If this is only how some
collectors/allocators are implemented and not a general rule, then we
should put a System.gc() in place of Thread.yield(). Should we also
combine that with Thread.yield()? I'm concerned about the possibility
that we spin, consuming too much CPU (the ReferenceHandler thread has MAX
priority), so that other threads don't get enough CPU time to proceed and
clean things up (we hope other threads will also get OOME and release
things as their stacks unwind...).


You are probably right about the System.gc() - OOME should be thrown 
after GC fails to create space, so it really needs some other thread to 
drop live references to allow further space to be reclaimed.


But note that Thread.yield() can behave badly on some linux systems too, 
so spinning is still a possibility - but either way this would only be 
really bad on a uniprocessor system where yield() is unlikely to 
misbehave.
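
As a rough illustration of the retry-and-yield idea under discussion, consider the sketch below. RetryOnOome and its Step interface are invented for this example and the OOME is simulated; it does not reproduce the proposed patch.

```java
// Hypothetical sketch of "retry on OOME, yielding between attempts".
final class RetryOnOome {
    interface Step { void run(); }   // illustrative unit of work

    // Runs the step, retrying whenever it throws OutOfMemoryError.
    // Returns the number of attempts that were needed.
    static int runWithRetry(Step step) {
        int attempts = 0;
        for (;;) {
            attempts++;
            try {
                step.run();
                return attempts;
            } catch (OutOfMemoryError oome) {
                // Give other threads a chance to unwind and drop references
                // before trying again; nothing is skipped, only delayed.
                Thread.yield();
            }
        }
    }

    public static void main(String[] args) {
        int[] calls = {0};
        int attempts = runWithRetry(() -> {
            if (calls[0]++ < 2) {
                throw new OutOfMemoryError("simulated");
            }
        });
        System.out.println(attempts);
    }
}
```

Whether yield() actually lets other threads make progress is platform dependent, which is exactly the caveat raised above.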


David
-



Regards, Peter



Thanks,
David

On 20/01/2014 6:42 PM, Peter Levart wrote:

On 01/20/2014 09:00 AM, Peter Levart wrote:

On 01/20/2014 02:51 AM, David Holmes wrote:

Hi Peter,

On 17/01/2014 11:24 PM, Peter Levart wrote:

On 01/17/2014 02:13 PM, Peter Levart wrote:

// Fast path for cleaners
boolean isCleaner = false;
try {
  // instanceof may trigger loading of Cleaner, which can fail on a full heap
  isCleaner = r instanceof Cleaner;
} catch (OutOfMemoryError oome) {
  continue;
}

if (isCleaner) {
  ((Cleaner)r).clean();
  continue;
}



Hi David, Kalyan,

I've caught-up now. Just thinking: is instanceof Cleaner throwing
OOME as a result of loading the Cleaner class? Wouldn't the above
code then throw some error also in ((Cleaner)r) - the checkcast,
since Cleaner class would not be successfully initialized?


Well, no. The above code would just skip Cleaner processing in this
situation. And will never be doing it again after the heap is
freed...
So it might be good to load and initialize Cleaner class as part of
ReferenceHandler initialization to ensure correct operation...


Well, yes and no. Let me try once more:

Above code will skip Cleaner processing if the 1st time instanceof
Cleaner is executed, OOME is thrown as a consequence of full heap
while
loading and initializing the Cleaner class.


Yes - I was assuming that this would not fail the very first time and
so the Cleaner class would already be loaded. Failing to be able to
load the Cleaner class was one of the potential issues flagged
earlier with this problem. I was actually assuming that Cleaner would
be loaded already due to some actual Cleaner subclasses being used,
but this does not happen as part of the default initialization. :(
The irony being that if the Cleaner class is not

Re: Analysis on JDK-8022321 java/lang/ref/OOMEInReferenceHandler.java fails intermittently


Hi, David, Kalyan,

Summing up the discussion, I propose the following patch for 
ReferenceHandler:


http://cr.openjdk.java.net/~plevart/jdk9-dev/OOMEInReferenceHandler/webrev.01/

all 10 java/lang/ref tests pass on my PC (including OOMEInReferenceHandler).

I kindly ask Kalyan to try to re-run the OOMEInReferenceHandler test 
with this code and report any failure.



Thanks, Peter

On 01/21/2014 08:57 AM, David Holmes wrote:

On 21/01/2014 4:54 PM, Peter Levart wrote:


On 01/21/2014 03:22 AM, David Holmes wrote:

Hi Peter,

I do not see Cleaner being loaded prior to the main class on either
Windows or Linux. Which platform are you on? Did you see it loaded
before the main class or as part of executing it?


Before. The main class is empty:

public class Test { public static void main(String... a) {} }

Here are the last few lines of -verbose:class:

[Loaded java.util.TimeZone from
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]
[Loaded sun.util.calendar.ZoneInfo from
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]
[Loaded sun.util.calendar.ZoneInfoFile from
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]
[Loaded sun.util.calendar.ZoneInfoFile$1 from
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]
[Loaded java.io.DataInput from
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]
[Loaded java.io.DataInputStream from
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]
*[Loaded sun.misc.Cleaner from
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]*


Curious. I wonder what the controlling factor is ??



I'm on linux, 64bit and using official EA build 121 of JDK 8...

But if I try with JDK 7u45, I don't see it. So perhaps it would be good
to trigger Cleaner loading and initialization as part of
ReferenceHandler initialization to play things safe.



If we do that for Cleaner we may as well do it for 
InterruptedException too.



Also, it is not that I think ReferenceHandler is responsible for
reporting OOME, but that it is responsible for reporting that it was
unable to perform a clean or enqueue because of OOME.


This would be necessary if we skipped a Reference because of OOME, but
if we just re-try until we eventually succeed, nothing is lost, nothing
to report (but a slow response)...


Agreed - just trying to clarify things.



Your suggested approach seems okay though I'm not sure why we
shouldn't help things along by calling System.gc() ourselves rather
than just yielding and hoping things will get cleaned up elsewhere.
But for the present purposes your approach will suffice I think.


Maybe my understanding is wrong, but isn't the fact that OOME is raised a
consequence of the VM having already attempted to clear things up
(executing a GC round synchronously) but not having succeeded in making
enough free space to satisfy the allocation request? If this is only how
some collectors/allocators are implemented and not a general rule, then we
should put a System.gc() in place of Thread.yield(). Should we also
combine that with Thread.yield()? I'm concerned about the possibility that
we spin and consume too much CPU (the ReferenceHandler thread has MAX
priority), so that other threads don't get enough CPU time to proceed and
clean things up (we hope other threads will also get OOME and release
things as their stacks unwind...).


You are probably right about the System.gc() - OOME should be thrown 
after GC fails to create space, so it really needs some other thread 
to drop live references to allow further space to be reclaimed.


But note that Thread.yield() can behave badly on some linux systems 
too, so spinning is still a possibility - but either way this would 
only be really bad on a uniprocessor system where yield() is 
unlikely to misbehave.
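
To make the retry idea discussed above concrete, here is a minimal sketch. It is not the actual ReferenceHandler code: the names RetryOnOomeSketch and retryUntilSuccess are hypothetical, and the OOME in main() is simulated rather than caused by a full heap. It only illustrates retrying an operation after OOME, yielding between attempts, so that no Reference is skipped:

```java
import java.util.function.Supplier;

final class RetryOnOomeSketch {

    // Keep retrying the action until it completes without OOME.
    // Nothing is skipped; the only cost is a slower response.
    static <T> T retryUntilSuccess(Supplier<T> action) {
        for (;;) {
            try {
                return action.get();
            } catch (OutOfMemoryError oome) {
                // Give other threads a chance to run, unwind their
                // stacks and drop references before we try again.
                Thread.yield();
            }
        }
    }

    public static void main(String[] args) {
        int[] attempts = {0};
        String result = retryUntilSuccess(() -> {
            // Simulated failure on the first two attempts (assumption:
            // a real OOME would come from an allocation, not a throw).
            if (++attempts[0] < 3) {
                throw new OutOfMemoryError("simulated");
            }
            return "ok";
        });
        System.out.println(result + " after " + attempts[0] + " attempts");
    }
}
```

Whether Thread.yield() actually lets other threads make progress is platform dependent, as noted above, so this loop can still spin.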


David



Regards, Peter



Thanks,
David

On 20/01/2014 6:42 PM, Peter Levart wrote:

On 01/20/2014 09:00 AM, Peter Levart wrote:

On 01/20/2014 02:51 AM, David Holmes wrote:

Hi Peter,

On 17/01/2014 11:24 PM, Peter Levart wrote:

On 01/17/2014 02:13 PM, Peter Levart wrote:

// Fast path for cleaners
boolean isCleaner = false;
try {
  isCleaner = r instanceof Cleaner;
} catch (OutOfMemoryError oome) {
  continue;
}

if (isCleaner) {
  ((Cleaner)r).clean();
  continue;
}



Hi David, Kalyan,

I've caught-up now. Just thinking: is instanceof Cleaner 
throwing

OOME as a result of loading the Cleaner class? Wouldn't the above
code then throw some error also in ((Cleaner)r) - the checkcast,
since Cleaner class would not be successfully initialized?


Well, no. The above code would just skip Cleaner processing in 
this

situation. And will never be doing it again after the heap is
freed...
So it might be good to load and initialize Cleaner class as 
part of

ReferenceHandler initialization to ensure correct operation...


Well, yes and no. Let me try once more:

The above code will skip Cleaner processing if, the 1st time instanceof
Cleaner is executed, OOME is thrown as a consequence of a full heap
while loading and initializing the Cleaner class.


Yes - I was assuming

Re: Analysis on JDK-8022321 java/lang/ref/OOMEInReferenceHandler.java fails intermittently


On 01/21/2014 08:57 AM, David Holmes wrote:

[Loaded java.util.TimeZone from
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]
[Loaded sun.util.calendar.ZoneInfo from
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]
[Loaded sun.util.calendar.ZoneInfoFile from
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]
[Loaded sun.util.calendar.ZoneInfoFile$1 from
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]
[Loaded java.io.DataInput from
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]
[Loaded java.io.DataInputStream from
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]
*[Loaded sun.misc.Cleaner from
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]*


Curious. I wonder what the controlling factor is ?? 


The Cleaner is usually loaded by ReferenceHandler in JDK8 in the 1st 
execution of its loop. It looks like JDK8 system initialization 
produces at least one XXXReference that is cleared before the main() method 
is entered (debugging, I found it's a Finalizer for a FileInputStream - 
perhaps of the stream that loads the TimeZone data), so the ReferenceHandler 
thread is woken up, executes the instanceof Cleaner check and this loads 
the class. I put the following print statements into the original 
ReferenceHandler:


System.out.println("Before using Cleaner...");
// Fast path for cleaners
if (r instanceof Cleaner) {
    ((Cleaner)r).clean();
    continue;
}
System.out.println("After using Cleaner...");


...and the empty main() test with -verbose:class prints:

...
[Loaded java.io.DataInput from 
/home/peter/work/hg/jdk8-tl/build/linux-x86_64-normal-server-release/images/j2sdk-image/jre/lib/rt.jar]
[Loaded java.io.DataInputStream from 
/home/peter/work/hg/jdk8-tl/build/linux-x86_64-normal-server-release/images/j2sdk-image/jre/lib/rt.jar]

*Before using Cleaner...*
*[Loaded sun.misc.Cleaner from out/production/jdk]*
*After using Cleaner...*
[Loaded java.io.ByteArrayInputStream from 
/home/peter/work/hg/jdk8-tl/build/linux-x86_64-normal-server-release/images/j2sdk-image/jre/lib/rt.jar]
[Loaded sun.util.calendar.ZoneInfoFile$ZoneOffsetTransitionRule from 
/home/peter/work/hg/jdk8-tl/build/linux-x86_64-normal-server-release/images/j2sdk-image/jre/lib/rt.jar]


...


But sometimes, it seems, the VM is not so quick in clearing the early 
XXXReferences and/or the ReferenceHandler start-up is delayed, so the 
1st iteration of the loop is executed after the OOMEInReferenceHandler 
test has already filled the heap, and consequently loading of the Cleaner 
class throws OOME in the instanceof check...


My proposed fix is very aggressive. It pre-loads classes, initializes 
them and watches for OOMEs thrown on all occasions. It might be that 
pre-loading the Cleaner class in ReferenceHandler initialization would be 
sufficient to fix this intermittent failure. Or do you think the instanceof 
check could throw OOME for some other reason besides loading of the class?
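
For illustration only, pre-loading and initializing a class up front could look roughly like the sketch below. It is not the code in the webrev: PreloadSketch, ensureClassInitialized and the nested Cleaner class are hypothetical stand-ins (the real sun.misc.Cleaner is not used here), and the flag exists only so that initialization can be observed from outside the class:

```java
final class PreloadSketch {

    // Set by the stand-in class's static initializer, so that we can
    // observe initialization without touching the stand-in itself.
    static volatile boolean cleanerInitialized = false;

    // Hypothetical stand-in for sun.misc.Cleaner.
    static class Cleaner {
        static {
            cleanerInitialized = true;
        }
    }

    // Force loading *and* initialization of the given class. A class
    // literal alone does not trigger initialization; Class.forName
    // with initialize == true does.
    static void ensureClassInitialized(Class<?> clazz) {
        try {
            Class.forName(clazz.getName(), true, clazz.getClassLoader());
        } catch (ClassNotFoundException e) {
            throw (Error) new NoClassDefFoundError(e.getMessage()).initCause(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("initialized before: " + cleanerInitialized);
        ensureClassInitialized(Cleaner.class);
        System.out.println("initialized after:  " + cleanerInitialized);
    }
}
```

If something like this ran while the handler is being set up, i.e. before the heap can be filled by a test, a later instanceof check would no longer need to load the class.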



Regards, Peter

Re: Analysis on JDK-8022321 java/lang/ref/OOMEInReferenceHandler.java fails intermittently


On 01/21/2014 07:54 AM, Peter Levart wrote:
*[Loaded sun.misc.Cleaner from 
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]*
[Loaded java.io.ByteArrayInputStream from 
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]
[Loaded sun.util.calendar.ZoneInfoFile$ZoneOffsetTransitionRule from 
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]

...


I'm on linux, 64bit and using official EA build 121 of JDK 8...

But if I try with JDK 7u45, I don't see it.


So what changed between JDK 7 and JDK 8?

I suspect the following: 8007572: Replace existing jdk timezone data at 
java.home/lib/zi with JSR310's tzdb



Regards, Peter

Re: Analysis on JDK-8022321 java/lang/ref/OOMEInReferenceHandler.java fails intermittently

2014-01-21 Thread srikalyan

Hi Peter/David, catching up after a long weekend. Why would there be an 
OOME in the object heap due to class loading in perm gen space? Please 
correct me if I am missing something here. Meanwhile I will give the 
version of ReferenceHandler you both agreed on a try.


--
Thanks
kalyan
Ph: (408)-585-8040


On 1/21/14, 7:24 AM, Peter Levart wrote:

On 01/21/2014 07:54 AM, Peter Levart wrote:
*[Loaded sun.misc.Cleaner from 
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]*
[Loaded java.io.ByteArrayInputStream from 
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]
[Loaded sun.util.calendar.ZoneInfoFile$ZoneOffsetTransitionRule from 
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]

...


I'm on linux, 64bit and using official EA build 121 of JDK 8...

But if I try with JDK 7u45, I don't see it.


So what changed between JDK 7 and JDK 8?

I suspect the following: 8007572: Replace existing jdk timezone data 
at java.home/lib/zi with JSR310's tzdb



Regards, Peter

Re: Analysis on JDK-8022321 java/lang/ref/OOMEInReferenceHandler.java fails intermittently



On 01/21/2014 07:17 PM, srikalyan wrote:
Hi Peter/David, catching up after long weekend. Why would there be an 
OOME in object heap due to class loading in perm gen space ?


The perm gen is not a problem here (JDK 8 does not have it and we see 
OOME on JDK8 too). Each time a class is loaded, a new java.lang.Class 
object is allocated on the heap.


Regards, Peter

Please correct if i am missing something here. Meanwhile i will give 
the version of Reference Handler you both agreed on a try.

--
Thanks
kalyan
Ph: (408)-585-8040

On 1/21/14, 7:24 AM, Peter Levart wrote:

On 01/21/2014 07:54 AM, Peter Levart wrote:
*[Loaded sun.misc.Cleaner from 
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]*
[Loaded java.io.ByteArrayInputStream from 
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]
[Loaded sun.util.calendar.ZoneInfoFile$ZoneOffsetTransitionRule from 
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]

...


I'm on linux, 64bit and using official EA build 121 of JDK 8...

But if I try with JDK 7u45, I don't see it.


So what changed between JDK 7 and JDK 8?

I suspect the following: 8007572: Replace existing jdk timezone data 
at java.home/lib/zi with JSR310's tzdb



Regards, Peter

Re: Analysis on JDK-8022321 java/lang/ref/OOMEInReferenceHandler.java fails intermittently


On 22/01/2014 1:24 AM, Peter Levart wrote:

On 01/21/2014 07:54 AM, Peter Levart wrote:

*[Loaded sun.misc.Cleaner from
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]*
[Loaded java.io.ByteArrayInputStream from
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]
[Loaded sun.util.calendar.ZoneInfoFile$ZoneOffsetTransitionRule from
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]
...


I'm on linux, 64bit and using official EA build 121 of JDK 8...

But if I try with JDK 7u45, I don't see it.


So what changed between JDK 7 and JDK 8?

I suspect the following: 8007572: Replace existing jdk timezone data at
java.home/lib/zi with JSR310's tzdb


I suspect it also depends on your TZ environment too as I do not see 
this on my systems.


David




Regards, Peter

Re: Analysis on JDK-8022321 java/lang/ref/OOMEInReferenceHandler.java fails intermittently


On 22/01/2014 8:31 AM, Peter Levart wrote:


On 01/21/2014 07:17 PM, srikalyan wrote:

Hi Peter/David, catching up after long weekend. Why would there be an
OOME in object heap due to class loading in perm gen space ?


The perm gen is not a problem here (JDK 8 does not have it and we see
OOME on JDK8 too). Each time a class is loaded, a new java.lang.Class
object is allocated on the heap.


For the bootloader classes I thought, but could easily be wrong, that 
the Class mirror did indeed go into the PermGen. But still, this is not 
relevant on JDK8, where there is no PermGen. It may be that this changed as 
part of the early PermGen removal prep work that did go into 7u.


David


Regards, Peter


Please correct if i am missing something here. Meanwhile i will give
the version of Reference Handler you both agreed on a try.
--
Thanks
kalyan
Ph: (408)-585-8040

On 1/21/14, 7:24 AM, Peter Levart wrote:

On 01/21/2014 07:54 AM, Peter Levart wrote:

*[Loaded sun.misc.Cleaner from
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]*
[Loaded java.io.ByteArrayInputStream from
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]
[Loaded sun.util.calendar.ZoneInfoFile$ZoneOffsetTransitionRule from
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]
...


I'm on linux, 64bit and using official EA build 121 of JDK 8...

But if I try with JDK 7u45, I don't see it.


So what changed between JDK 7 and JDK 8?

I suspect the following: 8007572: Replace existing jdk timezone data
at java.home/lib/zi with JSR310's tzdb


Regards, Peter

Re: Analysis on JDK-8022321 java/lang/ref/OOMEInReferenceHandler.java fails intermittently