Re: Analysis on JDK-8022321 java/lang/ref/OOMEInReferenceHandler.java fails intermittently

2016-06-24 Thread Kim Barrett
I've just sent out an RFR for

8156500: deadlock provoked by new stress test com/sun/jdi/OomDebugTest.java

The proposed fix incorporates the change suggested by Per and
discussed in this thread of moving the pending reference list
management entirely into the VM.



Re: Analysis on JDK-8022321 java/lang/ref/OOMEInReferenceHandler.java fails intermittently

2016-03-24 Thread Peter Levart

Hi Kim,

On 03/23/2016 09:40 PM, Kim Barrett wrote:

I don't think there's any throughput penalty for a long timeout.  The
proper response to waitForCleanups returning false (assuming the epoch
was obtained early and passed as an argument) is OOME.  I really doubt
the latency for reporting OOME is of critical importance.


The above assumption is not entirely correct. The correct response to 
waitForCleanups returning false should be at least one attempt to 
trigger GC reference discovery 1st and only after that it should be 
OOME. Suppose a program tries to allocate direct memory above the limit. 
Waiting for cleanups to happen might be very long if there's no heap 
memory pressure although there might be already lots of unreachable 
direct buffers on the heap.


So guessing the right timeout before attempting to trigger GC is not
trivial. If you make it too small, there will be excessive GCs triggered
and throughput will suffer. If you make it too long, throughput will
suffer again.


Nevertheless, I managed to create a variant that self-adjusts the timeout
based on the last successful wait time. At least with the
DirectBufferAllocTest using 16 or 32 allocating threads (on a 4-core CPU),
the throughput is comparable to before and, more importantly, the test
passes:


java -XX:MaxDirectMemorySize=128m -cp out DirectBufferAllocTest -r 600 
-t 16 -p 5000
Allocating direct ByteBuffers with capacity 1048576 bytes, using 16 
threads for 600 seconds, printing the average per-thread latency of 5000 
consecutive allocations...

Thread 11:  1.94 ms/allocation
Thread  6:  1.97 ms/allocation
Thread 12:  2.05 ms/allocation
Thread  0:  2.10 ms/allocation
Thread  7:  2.15 ms/allocation
Thread  3:  2.16 ms/allocation
Thread  1:  2.26 ms/allocation
Thread  5:  2.32 ms/allocation
Thread  2:  2.33 ms/allocation
Thread  4:  2.34 ms/allocation
Thread 13:  2.36 ms/allocation
Thread  9:  2.38 ms/allocation
Thread 14:  2.40 ms/allocation
Thread 10:  2.40 ms/allocation
Thread  8:  2.42 ms/allocation
Thread 15:  2.44 ms/allocation
Thread  6:  1.72 ms/allocation
Thread 11:  1.75 ms/allocation
Thread 12:  1.86 ms/allocation
Thread  0:  1.86 ms/allocation
Thread  3:  1.94 ms/allocation
Thread  7:  2.07 ms/allocation
Thread  1:  2.08 ms/allocation
Thread  2:  2.12 ms/allocation
Thread  4:  2.14 ms/allocation
Thread  5:  2.16 ms/allocation
Thread  9:  2.13 ms/allocation
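
For readers without access to the webrev, here is a minimal sketch of what
"self-adjusting the timeout based on the last successful wait time" could
look like. It is not taken from the webrev: the class name AdaptiveTimeout,
the doubling factor and the min/max bounds are all assumptions made for
illustration.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical helper, not taken from the webrev. */
final class AdaptiveTimeout {
    private final long minNanos;
    private final long maxNanos;
    private final AtomicLong currentNanos;

    AdaptiveTimeout(long minNanos, long maxNanos) {
        if (minNanos <= 0 || maxNanos < minNanos) {
            throw new IllegalArgumentException("need 0 < minNanos <= maxNanos");
        }
        this.minNanos = minNanos;
        this.maxNanos = maxNanos;
        this.currentNanos = new AtomicLong(minNanos);
    }

    /** Timeout to use for the next wait, in nanoseconds. */
    long currentNanos() {
        return currentNanos.get();
    }

    /**
     * Records a wait that ended with cleanup progress after waitedNanos.
     * The next timeout becomes twice that wait (an assumed factor),
     * clamped to the range [minNanos, maxNanos].
     */
    void recordSuccess(long waitedNanos) {
        long capped = Math.min(waitedNanos, maxNanos / 2); // avoids overflow when doubling
        currentNanos.set(Math.max(minNanos, capped * 2));
    }
}
```

The idea is that a waiter only gives up (and triggers another GC) after
waiting noticeably longer than the last wait that did see progress, so the
timeout follows the observed cleanup latency instead of being a fixed guess.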

Here's the webrev:

http://cr.openjdk.java.net/~plevart/jdk9-dev/removeInternalCleaner/webrev.10.part2/

So what do you think?

Regards, Peter



That is, the caller looks something like (not even pretending to write
Java)

   alloc = tryAllocation(allocSize)
   if alloc != NULL
     return alloc
   endif
   // Maybe add a retry+wait with a short timeout here,
   // to allow existing cleanups to run before requesting
   // another gc.  Not clear that's really worthwhile, as
   // it only comes up when we get here just after a gc
   // and the resulting cleanups are not yet all processed.
   System.gc()
   while true
     epoch = getEpoch()
     alloc = tryAllocation(allocSize)
     if alloc != NULL
       return alloc
     elif !waitForCleanup(epoch)
       throw OOME  // No cleanup progress for a while
     endif
   end




Re: Analysis on JDK-8022321 java/lang/ref/OOMEInReferenceHandler.java fails intermittently

2016-03-23 Thread Kim Barrett
> On Mar 23, 2016, at 4:42 PM, Peter Levart  wrote:
> 
> Hi Kim,
> 
> Thinking more about your approach. Basically your idea is to detect that 
> there are no more unprocessed but pending or enqueued Cleanables by timing 
> out on waiting for next Cleanable to be processed. In that case the timeout 
> should be reset when each Cleanable is detected to be processed so that when 
> there's a "silence" detected for at least the whole timeout period, we can 
> claim with enough probability that there are no more unprocessed Cleanables 
> either pending or enqueued and that we can give up with OOME.

Exactly, and much better stated than I did.



Re: Analysis on JDK-8022321 java/lang/ref/OOMEInReferenceHandler.java fails intermittently

2016-03-23 Thread Peter Levart



On 03/23/2016 09:40 PM, Kim Barrett wrote:

On Mar 23, 2016, at 3:33 PM, Peter Levart  wrote:

Hi Kim,

On 03/23/2016 07:55 PM, Kim Barrett wrote:

On Mar 23, 2016, at 10:02 AM, Peter Levart 
  wrote:
...so I checked what would be needed if there were such a
getPendingReferences() native method. It turns out that a single native method 
would not be enough to support the precise direct ByteBuffer allocation. Here's 
a refactored webrev that introduces a getPendingReferences() method which could 
be turned into a native equivalent one day. There is another native method 
needed - int awaitEnqueuePhaseStart():


http://cr.openjdk.java.net/~plevart/jdk9-dev/removeInternalCleaner/webrev.09.part2/

I don't think the Reference.awaitEnqueuePhaseStart thing is needed.

Rather, I think the Direct-X-Buffer allocation should conspire with
the Direct-X-Buffer cleanups directly to manage that sort of
thing, and not add anything to Reference and the reference processing
thread.  E.g. the phase and signal/wait are purely part of
Direct-X-Buffer.  (I also think something like that could/should have
been done instead of providing Direct-X-Buffer with access to
Reference.tryHandlePending, but that's likely water under the bridge
now.)

Something very roughly like this:

allocating thread, after allocation failed

bool waitForCleanups() {
   int epoch = DXB.getCleanupCounter();
   long start = startTime();
   long timeout = calcTimeout(start);
   synchronized (DXB.getCleanupMonitor()) {
     while (epoch == DXB.getCleanupCounter()) {
       wait(timeout);
       timeout = calcTimeout(start);
       if (timeout <= 0) break;
     }
     return epoch != DXB.getCleanupCounter();
   }
}

cleanup function, after freeing memory

   synchronized (DXB.getCleanupMonitor()) {
     DXB.incCleanupCounter();
     DXB.getCleanupMonitor().notify_all();
   }

Actually, epoch should probably have been obtained *before* the failed
allocation attempt, and should be an argument to waitForCleanups.

That's all quite sketchy, but I need to do other things today.

Peter, care to try filling this in?



There's no need to maintain a special cleanup counter as java.nio.Bits already maintains the amount
of currently allocated direct memory (in bytes). What your suggestion leads to is similar to one of the
previous versions of java.nio.Bits, which waited for some 'timeout' time after invoking System.gc()
and then retried the reservation, failing if it didn't succeed. The problem with such an
"asynchronous" approach is that there's no right value of 'timeout' for all situations.
If you wait for too short a time, you might get OOME although there are plenty of unreachable but still
uncleaned direct buffers. If you wait for too long, your throughput will suffer. There has to be
some "feedback" from reference processing to know when it's still beneficial to wait
and when there's no point in waiting any more.

Regards, Peter

I don't think there's any throughput penalty for a long timeout.  The
proper response to waitForCleanups returning false (assuming the epoch
was obtained early and passed as an argument) is OOME.  I really doubt
the latency for reporting OOME is of critical importance.

That is, the caller looks something like (not even pretending to write
Java)

   alloc = tryAllocation(allocSize)
   if alloc != NULL
     return alloc
   endif
   // Maybe add a retry+wait with a short timeout here,
   // to allow existing cleanups to run before requesting
   // another gc.  Not clear that's really worthwhile, as
   // it only comes up when we get here just after a gc
   // and the resulting cleanups are not yet all processed.
   System.gc()
   while true
     epoch = getEpoch()
     alloc = tryAllocation(allocSize)
     if alloc != NULL
       return alloc
     elif !waitForCleanup(epoch)
       throw OOME  // No cleanup progress for a while
     endif
   end



Right, this is easier to understand. I already figured out what you 
wanted to say the 1st time. I'll try to prepare a prototype along this 
idea tomorrow.


Regards, Peter



Re: Analysis on JDK-8022321 java/lang/ref/OOMEInReferenceHandler.java fails intermittently

2016-03-23 Thread Peter Levart

Hi Kim,

Thinking more about your approach. Basically your idea is to detect that 
there are no more unprocessed but pending or enqueued Cleanables by 
timing out on waiting for next Cleanable to be processed. In that case 
the timeout should be reset when each Cleanable is detected to be 
processed so that when there's a "silence" detected for at least the 
whole timeout period, we can claim with enough probability that there 
are no more unprocessed Cleanables either pending or enqueued and that 
we can give up with OOME.


Let me try to see with a prototype if this approach leads to success...

Regards, Peter

On 03/23/2016 08:33 PM, Peter Levart wrote:

Hi Kim,

On 03/23/2016 07:55 PM, Kim Barrett wrote:

On Mar 23, 2016, at 10:02 AM, Peter Levart  wrote:
...so I checked what would be needed if there were such a
getPendingReferences() native method. It turns out that a single native method 
would not be enough to support the precise direct ByteBuffer allocation. Here's 
a refactored webrev that introduces a getPendingReferences() method which could 
be turned into a native equivalent one day. There is another native method 
needed - int awaitEnqueuePhaseStart():

http://cr.openjdk.java.net/~plevart/jdk9-dev/removeInternalCleaner/webrev.09.part2/

I don't think the Reference.awaitEnqueuePhaseStart thing is needed.

Rather, I think the Direct-X-Buffer allocation should conspire with
the Direct-X-Buffer cleanups directly to manage that sort of
thing, and not add anything to Reference and the reference processing
thread.  E.g. the phase and signal/wait are purely part of
Direct-X-Buffer.  (I also think something like that could/should have
been done instead of providing Direct-X-Buffer with access to
Reference.tryHandlePending, but that's likely water under the bridge
now.)

Something very roughly like this:

allocating thread, after allocation failed

bool waitForCleanups() {
   int epoch = DXB.getCleanupCounter();
   long start = startTime();
   long timeout = calcTimeout(start);
   synchronized (DXB.getCleanupMonitor()) {
     while (epoch == DXB.getCleanupCounter()) {
       wait(timeout);
       timeout = calcTimeout(start);
       if (timeout <= 0) break;
     }
     return epoch != DXB.getCleanupCounter();
   }
}

cleanup function, after freeing memory

   synchronized (DXB.getCleanupMonitor()) {
     DXB.incCleanupCounter();
     DXB.getCleanupMonitor().notify_all();
   }

Actually, epoch should probably have been obtained *before* the failed
allocation attempt, and should be an argument to waitForCleanups.

That's all quite sketchy, but I need to do other things today.

Peter, care to try filling this in?



There's no need to maintain a special cleanup counter as java.nio.Bits
already maintains the amount of currently allocated direct memory (in
bytes). What your suggestion leads to is similar to one of the previous
versions of java.nio.Bits, which waited for some 'timeout' time after
invoking System.gc() and then retried the reservation, failing if it
didn't succeed. The problem with such an "asynchronous" approach is that
there's no right value of 'timeout' for all situations. If you wait
for too short a time, you might get OOME although there are plenty of
unreachable but still uncleaned direct buffers. If you wait for too
long, your throughput will suffer. There has to be some "feedback"
from reference processing to know when it's still beneficial to
wait and when there's no point in waiting any more.


Regards, Peter





Re: Analysis on JDK-8022321 java/lang/ref/OOMEInReferenceHandler.java fails intermittently

2016-03-23 Thread Kim Barrett
> On Mar 23, 2016, at 3:33 PM, Peter Levart  wrote:
> 
> Hi Kim,
> 
> On 03/23/2016 07:55 PM, Kim Barrett wrote:
>>> On Mar 23, 2016, at 10:02 AM, Peter Levart 
>>>  wrote:
>>> ...so I checked what would be needed if there were such a
>>> getPendingReferences() native method. It turns out that a single native 
>>> method would not be enough to support the precise direct ByteBuffer 
>>> allocation. Here's a refactored webrev that introduces a 
>>> getPendingReferences() method which could be turned into a native 
>>> equivalent one day. There is another native method needed - int 
>>> awaitEnqueuePhaseStart():
>>> 
>>> 
>>> http://cr.openjdk.java.net/~plevart/jdk9-dev/removeInternalCleaner/webrev.09.part2/
>> I don't think the Reference.awaitEnqueuePhaseStart thing is needed.
>> 
>> Rather, I think the Direct-X-Buffer allocation should conspire with
>> the Direct-X-Buffer cleanups directly to manage that sort of
>> thing, and not add anything to Reference and the reference processing
>> thread.  E.g. the phase and signal/wait are purely part of
>> Direct-X-Buffer.  (I also think something like that could/should have
>> been done instead of providing Direct-X-Buffer with access to
>> Reference.tryHandlePending, but that's likely water under the bridge
>> now.)
>> 
>> Something very roughly like this:
>> 
>> allocating thread, after allocation failed
>> 
>> bool waitForCleanups() {
>>   int epoch = DXB.getCleanupCounter();
>>   long start = startTime();
>>   long timeout = calcTimeout(start);
>>   synchronized (DXB.getCleanupMonitor()) {
>>     while (epoch == DXB.getCleanupCounter()) {
>>       wait(timeout);
>>       timeout = calcTimeout(start);
>>       if (timeout <= 0) break;
>>     }
>>     return epoch != DXB.getCleanupCounter();
>>   }
>> }
>> 
>> cleanup function, after freeing memory
>> 
>>   synchronized (DXB.getCleanupMonitor()) {
>>     DXB.incCleanupCounter();
>>     DXB.getCleanupMonitor().notify_all();
>>   }
>> 
>> Actually, epoch should probably have been obtained *before* the failed
>> allocation attempt, and should be an argument to waitForCleanups.
>> 
>> That's all quite sketchy, but I need to do other things today.
>> 
>> Peter, care to try filling this in?
>> 
>> 
> 
> There's no need to maintain a special cleanup counter as java.nio.Bits
> already maintains the amount of currently allocated direct memory (in bytes).
> What your suggestion leads to is similar to one of the previous versions of
> java.nio.Bits, which waited for some 'timeout' time after invoking System.gc()
> and then retried the reservation, failing if it didn't succeed. The problem
> with such an "asynchronous" approach is that there's no right value of
> 'timeout' for all situations. If you wait for too short a time, you might get
> OOME although there are plenty of unreachable but still uncleaned direct
> buffers. If you wait for too long, your throughput will suffer. There has to
> be some "feedback" from reference processing to know when it's still
> beneficial to wait and when there's no point in waiting any more.
> 
> Regards, Peter

I don't think there's any throughput penalty for a long timeout.  The
proper response to waitForCleanups returning false (assuming the epoch
was obtained early and passed as an argument) is OOME.  I really doubt
the latency for reporting OOME is of critical importance.

That is, the caller looks something like (not even pretending to write
Java) 

  alloc = tryAllocation(allocSize)
  if alloc != NULL
    return alloc
  endif
  // Maybe add a retry+wait with a short timeout here,
  // to allow existing cleanups to run before requesting
  // another gc.  Not clear that's really worthwhile, as
  // it only comes up when we get here just after a gc
  // and the resulting cleanups are not yet all processed.
  System.gc()
  while true
    epoch = getEpoch()
    alloc = tryAllocation(allocSize)
    if alloc != NULL
      return alloc
    elif !waitForCleanup(epoch)
      throw OOME  // No cleanup progress for a while
    endif
  end
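
The caller loop above could be written in Java roughly as follows. This is
only a sketch under stated assumptions: ReserveLoop and its three callback
parameters are hypothetical names standing in for the real reservation
attempt, epoch read and timed wait, none of which are shown in this thread.

```java
import java.util.function.BooleanSupplier;
import java.util.function.IntPredicate;
import java.util.function.IntSupplier;

/** Hypothetical driver for the retry loop; all names are illustrative. */
final class ReserveLoop {
    /**
     * @param tryReserve     attempts the reservation once, true on success
     * @param getEpoch       reads the current cleanup epoch
     * @param waitForCleanup waits until the epoch differs from its argument,
     *                       returns false if it timed out without progress
     */
    static void reserve(BooleanSupplier tryReserve,
                        IntSupplier getEpoch,
                        IntPredicate waitForCleanup) {
        if (tryReserve.getAsBoolean()) {
            return;
        }
        System.gc(); // request reference discovery once
        while (true) {
            // Read the epoch before the attempt so that a cleanup that
            // finishes between the attempt and the wait is not missed.
            int epoch = getEpoch.getAsInt();
            if (tryReserve.getAsBoolean()) {
                return;
            }
            if (!waitForCleanup.test(epoch)) {
                // No cleanup progress for a whole timeout period.
                throw new OutOfMemoryError("Direct buffer memory");
            }
        }
    }
}
```

A call such as ReserveLoop.reserve(() -> false, () -> 0, e -> false) throws
OutOfMemoryError after a single GC request, which is the "no progress"
case discussed above.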



Re: Analysis on JDK-8022321 java/lang/ref/OOMEInReferenceHandler.java fails intermittently

2016-03-23 Thread Peter Levart

Hi Kim,

On 03/23/2016 07:55 PM, Kim Barrett wrote:

On Mar 23, 2016, at 10:02 AM, Peter Levart  wrote:
...so I checked what would be needed if there were such a
getPendingReferences() native method. It turns out that a single native method 
would not be enough to support the precise direct ByteBuffer allocation. Here's 
a refactored webrev that introduces a getPendingReferences() method which could 
be turned into a native equivalent one day. There is another native method 
needed - int awaitEnqueuePhaseStart():

http://cr.openjdk.java.net/~plevart/jdk9-dev/removeInternalCleaner/webrev.09.part2/

I don't think the Reference.awaitEnqueuePhaseStart thing is needed.

Rather, I think the Direct-X-Buffer allocation should conspire with
the Direct-X-Buffer cleanups directly to manage that sort of
thing, and not add anything to Reference and the reference processing
thread.  E.g. the phase and signal/wait are purely part of
Direct-X-Buffer.  (I also think something like that could/should have
been done instead of providing Direct-X-Buffer with access to
Reference.tryHandlePending, but that's likely water under the bridge
now.)

Something very roughly like this:

allocating thread, after allocation failed

bool waitForCleanups() {
   int epoch = DXB.getCleanupCounter();
   long start = startTime();
   long timeout = calcTimeout(start);
   synchronized (DXB.getCleanupMonitor()) {
     while (epoch == DXB.getCleanupCounter()) {
       wait(timeout);
       timeout = calcTimeout(start);
       if (timeout <= 0) break;
     }
     return epoch != DXB.getCleanupCounter();
   }
}

cleanup function, after freeing memory

   synchronized (DXB.getCleanupMonitor()) {
     DXB.incCleanupCounter();
     DXB.getCleanupMonitor().notify_all();
   }

Actually, epoch should probably have been obtained *before* the failed
allocation attempt, and should be an argument to waitForCleanups.

That's all quite sketchy, but I need to do other things today.

Peter, care to try filling this in?



There's no need to maintain a special cleanup counter as java.nio.Bits
already maintains the amount of currently allocated direct memory (in
bytes). What your suggestion leads to is similar to one of the previous
versions of java.nio.Bits, which waited for some 'timeout' time after
invoking System.gc() and then retried the reservation, failing if it
didn't succeed. The problem with such an "asynchronous" approach is that
there's no right value of 'timeout' for all situations. If you wait
for too short a time, you might get OOME although there are plenty of
unreachable but still uncleaned direct buffers. If you wait for too
long, your throughput will suffer. There has to be some "feedback"
from reference processing to know when it's still beneficial to
wait and when there's no point in waiting any more.


Regards, Peter



Re: Analysis on JDK-8022321 java/lang/ref/OOMEInReferenceHandler.java fails intermittently

2016-03-23 Thread Kim Barrett
> On Mar 23, 2016, at 10:02 AM, Peter Levart  wrote:
> ...so I checked what would be needed if there were such a
> getPendingReferences() native method. It turns out that a single native 
> method would not be enough to support the precise direct ByteBuffer 
> allocation. Here's a refactored webrev that introduces a 
> getPendingReferences() method which could be turned into a native equivalent 
> one day. There is another native method needed - int awaitEnqueuePhaseStart():
> 
> http://cr.openjdk.java.net/~plevart/jdk9-dev/removeInternalCleaner/webrev.09.part2/

I don't think the Reference.awaitEnqueuePhaseStart thing is needed.

Rather, I think the Direct-X-Buffer allocation should conspire with
the Direct-X-Buffer cleanups directly to manage that sort of
thing, and not add anything to Reference and the reference processing
thread.  E.g. the phase and signal/wait are purely part of
Direct-X-Buffer.  (I also think something like that could/should have
been done instead of providing Direct-X-Buffer with access to
Reference.tryHandlePending, but that's likely water under the bridge
now.)

Something very roughly like this:

allocating thread, after allocation failed

bool waitForCleanups() {
  int epoch = DXB.getCleanupCounter();
  long start = startTime();
  long timeout = calcTimeout(start);
  synchronized (DXB.getCleanupMonitor()) {
    while (epoch == DXB.getCleanupCounter()) {
      wait(timeout);
      timeout = calcTimeout(start);
      if (timeout <= 0) break;
    }
    return epoch != DXB.getCleanupCounter();
  }
}

cleanup function, after freeing memory

  synchronized (DXB.getCleanupMonitor()) {
    DXB.incCleanupCounter();
    DXB.getCleanupMonitor().notify_all();
  }

Actually, epoch should probably have been obtained *before* the failed
allocation attempt, and should be an argument to waitForCleanups.
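
With the epoch passed in as an argument, the counter-and-monitor idea might
look like the following in real Java. This is a sketch only: the class
CleanupEpoch and its method names are made up for illustration and do not
correspond to anything in Direct-X-Buffer or java.nio.Bits.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical epoch counter shared by allocating threads and cleanups. */
final class CleanupEpoch {
    private final Object monitor = new Object();
    private int counter; // guarded by monitor

    /** Current epoch; callers read this before attempting an allocation. */
    int current() {
        synchronized (monitor) {
            return counter;
        }
    }

    /** Called by a cleanup after it has freed memory. */
    void signalCleanup() {
        synchronized (monitor) {
            counter++;
            monitor.notifyAll();
        }
    }

    /**
     * Waits until the counter differs from the given epoch or the timeout
     * elapses. Returns true if at least one cleanup ran since the epoch
     * was read, false if the wait timed out without progress.
     */
    boolean waitForCleanups(int epoch, long timeoutMillis) throws InterruptedException {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        synchronized (monitor) {
            while (counter == epoch) {
                long remaining = deadline - System.nanoTime();
                if (remaining <= 0) {
                    break;
                }
                // remaining > 0 here, so this is never the "wait forever" form wait(0, 0).
                monitor.wait(remaining / 1_000_000, (int) (remaining % 1_000_000));
            }
            return counter != epoch;
        }
    }
}
```

Because the epoch is read before the failed allocation attempt, a cleanup
that completes between the attempt and the wait makes waitForCleanups
return true immediately instead of sleeping through a missed notification.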

That's all quite sketchy, but I need to do other things today.

Peter, care to try filling this in?



Re: Analysis on JDK-8022321 java/lang/ref/OOMEInReferenceHandler.java fails intermittently

2016-03-23 Thread Per Liden

Hi Peter,

On 2016-03-23 15:02, Peter Levart wrote:

Hi Per, Kim,

On 03/22/2016 10:24 AM, Per Liden wrote:

So, I imagine the ReferenceHandler could do something like this:

while (true) {
    // getPendingReferences() is a downcall to the VM which
    // blocks until the pending list becomes non-empty and
    // returns the whole list, transferring it from VM-land
    // to Java-land in a safe and robust way.
    Reference pending = getPendingReferences();

    // Enqueue the references
    while (pending != null) {
        Reference r = pending;
        pending = r.discovered;
        r.discovered = null;
        ReferenceQueue q = r.queue;
        if (q != ReferenceQueue.NULL) {
            q.enqueue(r);
        }
    }
}
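
The inner enqueue loop can be tried out in isolation with stand-in types.
The sketch below is an illustration only: the real Reference.discovered
field and ReferenceQueue.enqueue are not accessible to application code, so
the Ref class, its fields and the use of java.util.Queue (with null playing
the role of ReferenceQueue.NULL) are assumptions made for the example.

```java
import java.util.Queue;

/** Stand-in model of draining a VM-provided pending list; names are hypothetical. */
final class PendingListDemo {
    /** Plays the role of java.lang.ref.Reference in this sketch. */
    static final class Ref {
        final String name;
        final Queue<Ref> queue; // null means "no queue registered"
        Ref discovered;         // next element of the pending list

        Ref(String name, Queue<Ref> queue, Ref discovered) {
            this.name = name;
            this.queue = queue;
            this.discovered = discovered;
        }
    }

    /**
     * Walks the pending chain once, unlinking every element and enqueueing
     * those that have a queue. Returns the number of references enqueued.
     */
    static int enqueueAll(Ref pending) {
        int enqueued = 0;
        while (pending != null) {
            Ref r = pending;
            pending = r.discovered;
            r.discovered = null; // unlink so the chain does not keep objects alive
            if (r.queue != null) {
                r.queue.add(r);
                enqueued++;
            }
        }
        return enqueued;
    }
}
```

Since only one thread walks the chain after it has been handed over, no
lock is needed on the Java side while unlinking, which is the
simplification discussed in this thread.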


...so I checked what would be needed if there were such a
getPendingReferences() native method. It turns out that a single native
method would not be enough to support the precise direct ByteBuffer
allocation. Here's a refactored webrev that introduces a
getPendingReferences() method which could be turned into a native
equivalent one day. There is another native method needed - int
awaitEnqueuePhaseStart():

http://cr.openjdk.java.net/~plevart/jdk9-dev/removeInternalCleaner/webrev.09.part2/


The need for this additional method arises when one wants to combine
reference discovery with enqueueing of discovered references into one
synchronous operation (discoverAndEnqueueReferences()). A direct
ByteBuffer allocating thread wants to trigger reference discovery
(System.gc()) and wait for discovered references to be enqueued before
continuing with direct memory reservation retries. An alternative to
what I have done in above webrev would be a maintenance of a single
enqueuePhase counter on the Java side with usage roughly as:

discoverAndEnqueueReferences() {
 int phase = Reference.getEnqueuePhase();
 System.gc();
 Reference.awaitEnqueuePhaseGreaterThan(phase);
}

But in that case, System.gc() would have to guarantee that after
discovery of no new references, blocked getPendingReferences() would
still return with an empty list of References (null) just to keep the
DBB allocating thread alive. I have tried to do this variant and
unfortunately it can't be reliably performed with current protocol as
getPendingReferences() can only be programmed to return non-empty
Reference lists without ambiguity. I created a DirectBufferAllocOOMETest
to exercise situations where no new Reference(s) are discovered in a GC
round.

So what do you think - which would be easier to support:
a) getPendingReferences() returns empty Reference list (null) after a GC
round that discovers no new pending references
b) getPendingReferences() returns when new Reference(s) are discovered
and there is an additional int awaitEnqueuePhaseStart() as defined in
above webrev.


I've prototyped the VM side. I've ignored the "await" issue for now as I 
first just wanted the basic structure up. I'm running out of time for 
today (and I'll be away the rest of the week) but let's continue the 
discussion next week and figure out the "await" details/alternatives.


Webrevs for jdk9/hs-rt:

http://cr.openjdk.java.net/~pliden/reference_pending_list/webrev.0-jdk
http://cr.openjdk.java.net/~pliden/reference_pending_list/webrev.0-hotspot

It passes jdk/test/java/lang/ref/* and our VM tests for reference 
processing.


cheers,
Per


Re: Analysis on JDK-8022321 java/lang/ref/OOMEInReferenceHandler.java fails intermittently

2016-03-23 Thread Peter Levart

Hi Per, Kim,

On 03/22/2016 10:24 AM, Per Liden wrote:

So, I imagine the ReferenceHandler could do something like this:

while (true) {
    // getPendingReferences() is a downcall to the VM which
    // blocks until the pending list becomes non-empty and
    // returns the whole list, transferring it from VM-land
    // to Java-land in a safe and robust way.
    Reference pending = getPendingReferences();

    // Enqueue the references
    while (pending != null) {
        Reference r = pending;
        pending = r.discovered;
        r.discovered = null;
        ReferenceQueue q = r.queue;
        if (q != ReferenceQueue.NULL) {
            q.enqueue(r);
        }
    }
}


...so I checked what would be needed if there were such a
getPendingReferences() native method. It turns out that a single native 
method would not be enough to support the precise direct ByteBuffer 
allocation. Here's a refactored webrev that introduces a 
getPendingReferences() method which could be turned into a native 
equivalent one day. There is another native method needed - int 
awaitEnqueuePhaseStart():


http://cr.openjdk.java.net/~plevart/jdk9-dev/removeInternalCleaner/webrev.09.part2/

The need for this additional method arises when one wants to combine 
reference discovery with enqueueing of discovered references into one 
synchronous operation (discoverAndEnqueueReferences()). A direct 
ByteBuffer allocating thread wants to trigger reference discovery 
(System.gc()) and wait for discovered references to be enqueued before 
continuing with direct memory reservation retries. An alternative to 
what I have done in above webrev would be a maintenance of a single 
enqueuePhase counter on the Java side with usage roughly as:


discoverAndEnqueueReferences() {
    int phase = Reference.getEnqueuePhase();
    System.gc();
    Reference.awaitEnqueuePhaseGreaterThan(phase);
}

But in that case, System.gc() would have to guarantee that after 
discovery of no new references, blocked getPendingReferences() would 
still return with an empty list of References (null) just to keep the 
DBB allocating thread alive. I have tried to do this variant and 
unfortunately it can't be reliably performed with current protocol as 
getPendingReferences() can only be programmed to return non-empty 
Reference lists without ambiguity. I created a DirectBufferAllocOOMETest 
to exercise situations where no new Reference(s) are discovered in a GC 
round.


So what do you think - which would be easier to support:
a) getPendingReferences() returns empty Reference list (null) after a GC 
round that discovers no new pending references
b) getPendingReferences() returns when new Reference(s) are discovered 
and there is an additional int awaitEnqueuePhaseStart() as defined in 
above webrev.


Regards, Peter



Re: Analysis on JDK-8022321 java/lang/ref/OOMEInReferenceHandler.java fails intermittently

2016-03-23 Thread Per Liden

Hi,

On 2016-03-23 08:13, Peter Levart wrote:



On 03/22/2016 10:28 PM, Kim Barrett wrote:

On Mar 22, 2016, at 5:24 AM, Per Liden  wrote:
One thing I like about this approach is that it's only the
ReferenceHandler thread that pops elements off the pending list
and enqueues them. That simplifies things a lot.

I like that too.  And hopefully we really can get rid of
sun.misc.Cleaner (under whatever name).


From a GC perspective I would however like to get away from the
shared pending list and the pending list lock entirely and instead
provide a VM downcall to get the pending list. The goal would of
course be to have a more robust way of transferring the pending list
to Java land, instead of today's secret handshake which is easy to
get wrong. Also, not requiring the pending list lock (which is a Java
monitor) to be held during a GC would also simplify things a lot on
the GC side. E.g. the ReferencePendingListLockerThread could be
removed completely.

I’ve been thinking along the same lines.  I think having the pending
list (and associated locking and notification) in Java is just making
life difficult for ourselves, and that things could be much simpler if
that whole protocol was owned by the VM.

Once the reference handler thread has obtained the latest list, if it
then wants to publish that list for other Java threads to help
process, that’s a policy choice that can be explored on the Java side,
with no impact on the VM (including the GC).



If the only blocking/waiting of ReferenceHandler thread was performed by
native code, could it simply ignore Java thread interrupts? If this is
possible, then the problems of InterruptedException allocation and
consequent OutOfMemoryError(s) just disappear.


Yes, blocking in the VM here would ignore thread interrupts and not 
throw InterruptedException.


cheers,
Per


Re: Analysis on JDK-8022321 java/lang/ref/OOMEInReferenceHandler.java fails intermittently

2016-03-23 Thread Peter Levart



On 03/22/2016 10:28 PM, Kim Barrett wrote:

On Mar 22, 2016, at 5:24 AM, Per Liden  wrote:
One thing I like about this approach is that it's only the ReferenceHandler 
thread that pops elements off the pending list and enqueues them. That
simplifies things a lot.

I like that too.  And hopefully we really can get rid of sun.misc.Cleaner 
(under whatever name).


From a GC perspective I would however like to get away from the shared pending
list and the pending list lock entirely and instead provide a VM downcall to
get the pending list. The goal would of course be to have a more robust way of 
transferring the pending list to Java land, instead of today's secret handshake 
which is easy to get wrong. Also, not requiring the pending list lock (which is 
a Java monitor) to be held during a GC would also simplify things a lot on the 
GC side. E.g. the ReferencePendingListLockerThread could be removed completely.

I’ve been thinking along the same lines.  I think having the pending list (and 
associated locking and notification) in Java is just making life difficult for 
ourselves, and that things could be much simpler if that whole protocol was 
owned by the VM.

Once the reference handler thread has obtained the latest list, if it then 
wants to publish that list for other Java threads to help process, that’s a 
policy choice that can be explored on the Java side, with no impact on the VM 
(including the GC).



If the only blocking/waiting of ReferenceHandler thread was performed by 
native code, could it simply ignore Java thread interrupts? If this is 
possible, then the problems of InterruptedException allocation and 
consequent OutOfMemoryError(s) just disappear.


Regards, Peter



Re: Analysis on JDK-8022321 java/lang/ref/OOMEInReferenceHandler.java fails intermittently

2016-03-22 Thread Kim Barrett
> On Mar 22, 2016, at 5:24 AM, Per Liden  wrote:
> One thing I like about this approach is that it's only the ReferenceHandler 
> thread that pops elements off the pending list and enqueues them. That
> simplifies things a lot.

I like that too.  And hopefully we really can get rid of sun.misc.Cleaner 
(under whatever name).

> From a GC perspective I would however like to get away from the shared 
> pending list and the pending list lock entirely and instead provide a VM
> downcall to get the pending list. The goal would of course be to have a more 
> robust way of transferring the pending list to Java land, instead of today's 
> secret handshake which is easy to get wrong. Also, not requiring the pending 
> list lock (which is a Java monitor) to be held during a GC would also 
> simplify things a lot on the GC side. E.g. the 
> ReferencePendingListLockerThread could be removed completely.

I’ve been thinking along the same lines.  I think having the pending list (and 
associated locking and notification) in Java is just making life difficult for 
ourselves, and that things could be much simpler if that whole protocol was 
owned by the VM.

Once the reference handler thread has obtained the latest list, if it then 
wants to publish that list for other Java threads to help process, that’s a 
policy choice that can be explored on the Java side, with no impact on the VM 
(including the GC).



Re: Analysis on JDK-8022321 java/lang/ref/OOMEInReferenceHandler.java fails intermittently

2016-03-22 Thread Per Liden

Hi Peter,

On 2016-03-21 16:30, Peter Levart wrote:

Hi Per,

May I point you to my proposed change in Reference(Handler) for JDK 9,
being discussed in the thread about JDK-8149925. It will hopefully
remove the special-casing of sun.misc.Cleaner, change the way how
pending references are being enqueued by ReferenceHandler thread and how
other thread(s) can synchronize with it. Since you seem to have a great
knowledge of VM part of things, I would very much like to hear what you
think of that change. Here's the latest webrev:

http://cr.openjdk.java.net/~plevart/jdk9-dev/removeInternalCleaner/webrev.08.part2/


(see Reference.java and Bits.java for an example of how this
synchronization with ReferenceHandler thread is to be used)


One thing I like about this approach is that it's only the 
ReferenceHandler thread that pops elements off the pending list and 
enqueues them. That simplifies things a lot.


From a GC perspective I would however like to get away from the shared 
pending list and the pending list lock entirely and instead provide a VM 
downcall to get the pending list. The goal would of course be to have a 
more robust way of transferring the pending list to Java land, instead 
of today's secret handshake which is easy to get wrong. Also, not 
requiring the pending list lock (which is a Java monitor) to be held 
during a GC would also simplify things a lot on the GC side. E.g. the 
ReferencePendingListLockerThread could be removed completely.


So, I imagine the ReferenceHandler could do something like this:

    while (true) {
        // getPendingReferences() is a downcall to the VM which
        // blocks until the pending list becomes non-empty and
        // returns the whole list, transferring it from VM-land
        // to Java-land in a safe and robust way.
        Reference pending = getPendingReferences();

        // Enqueue the references
        while (pending != null) {
            Reference r = pending;
            pending = r.discovered;
            r.discovered = null;
            ReferenceQueue q = r.queue;
            if (q != ReferenceQueue.NULL) {
                q.enqueue(r);
            }
        }
    }

I haven't thought through the details when it comes to having additional 
Java threads helping out with Cleaners. The ReferenceHandler would be 
free to use whatever lists/locks it wants to handle this and the GC 
wouldn't know anything about it. But, with the above approach at least 
the interface between the ReferenceHandler and the VM would be pretty 
clear and hard(er) to misuse.
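
For illustration only, here is a minimal self-contained sketch of the hand-off described above, simulated entirely in plain Java. The names PendingListSim, Ref, publish, takeAll and drain are hypothetical and exist only in this sketch; a real getPendingReferences() would be a native downcall that blocks inside the VM, not a Java monitor as used here.

```java
import java.util.ArrayList;
import java.util.List;

public class PendingListSim {
    /** Stand-in for java.lang.ref.Reference; only the link field matters here. */
    static final class Ref {
        final String name;
        Ref discovered;              // next element of the pending chain
        Ref(String name) { this.name = name; }
    }

    private Ref head;                // guarded by 'this'; plays the role of VM-owned state

    /** Simulated GC side: prepend a chain of newly discovered references. */
    synchronized void publish(Ref chain) {
        Ref tail = chain;
        while (tail.discovered != null) {
            tail = tail.discovered;
        }
        tail.discovered = head;
        head = chain;
        notifyAll();
    }

    /** Handler side: block until the list is non-empty, then take all of it at once. */
    synchronized Ref takeAll() throws InterruptedException {
        while (head == null) {
            wait();
        }
        Ref all = head;
        head = null;                 // the caller now owns the whole chain
        return all;
    }

    /** Unlinks every element of a privately owned chain; returns the names in order. */
    static List<String> drain(Ref pending) {
        List<String> out = new ArrayList<>();
        while (pending != null) {
            Ref r = pending;
            pending = r.discovered;
            r.discovered = null;
            out.add(r.name);
        }
        return out;
    }

    public static void main(String[] args) throws InterruptedException {
        PendingListSim vm = new PendingListSim();
        Ref a = new Ref("a");
        a.discovered = new Ref("b");
        vm.publish(a);
        System.out.println(drain(vm.takeAll()));   // prints [a, b]
    }
}
```

The point of the sketch is that once takeAll() returns, the chain is private to the caller, so unlinking elements needs no lock shared with the producer.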


cheers,
Per



Regards, Peter

On 03/21/2016 04:13 PM, Peter Levart wrote:

Hi Per, David,

As things stand, there is a very good chance that sun.misc.Cleaner
will go away in JDK9, so all this speculation about the source of
OOME(s) can be put to rest. But for JDK 8u, I agree that this should
be sorted out.

My feeling is that (instanceof Cleaner) can not result in allocation
and therefore can not trigger OOME if the Cleaner class is already
loaded at that time. I think that we were chasing the wrong rabbit. As
I have found later, there is a much more probable cause for
ReferenceHandler thread dying with OOME after the fix to catch OOME
from lock.wait(). It is triggered by the invocation of Cleaner.clean()
later down in the code. I even created a reproducer for it. See my
last two comments of the following issue:

https://bugs.openjdk.java.net/browse/JDK-8066859

(but don't look at the proposed fix since it is not very good)


I think that for JDK 8u we could revert the code and do (instanceof
Cleaner) checks outside the synchronized block and in addition, find a
way to handle the OOME thrown from Cleaner.clean().

What do you think?


Regards, Peter


On 03/21/2016 02:41 PM, Per Liden wrote:

Hi David,

On 2016-03-21 13:49, David Holmes wrote:

Hi Per,

On 21/03/2016 10:20 PM, Per Liden wrote:

Hi Peter & David,

(Resurrecting an old thread here...)

On 2014-01-22 03:19, David Holmes wrote:

Hi Peter,

On 22/01/2014 12:00 AM, Peter Levart wrote:

Hi, David, Kalyan,

Summing up the discussion, I propose the following patch for
ReferenceHandler:

http://cr.openjdk.java.net/~plevart/jdk9-dev/OOMEInReferenceHandler/webrev.01/






I can live with it, though it may be that once Cleaner has been preloaded
instanceof can no longer throw OOME. Can't be 100% sure. And there's
some duplication/verbosity in the commentary that could be trimmed
down :)


While investigating a Reference pending list issue on the GC side of
things I looked at the ReferenceHandler thread and noticed something
which made me uneasy. The fix for JDK-8022321 added pre-loading of the
Cleaner class to avoid OOME, but also moved the "instanceof Cleaner"
inside the try/catch with a comment that it "sometimes" can throw an
OOME. I understand this was done because we're not 100% sure if an OOME
can still happen here, despite the pre-loading.

However, if it can throw an OOME that means it's allocating, which in
turn means it can provoke a GC. If that happens, it looks to me like we
have a bug 

Re: Analysis on JDK-8022321 java/lang/ref/OOMEInReferenceHandler.java fails intermittently

2016-03-22 Thread Per Liden

On 2016-03-21 18:32, Kim Barrett wrote:

On Mar 21, 2016, at 8:20 AM, Per Liden  wrote:

Hi Peter & David,

(Resurrecting an old thread here...)

On 2014-01-22 03:19, David Holmes wrote:

Hi Peter,

On 22/01/2014 12:00 AM, Peter Levart wrote:

Hi, David, Kalyan,

Summing up the discussion, I propose the following patch for
ReferenceHandler:

http://cr.openjdk.java.net/~plevart/jdk9-dev/OOMEInReferenceHandler/webrev.01/



I can live with it, though it may be that once Cleaner has been preloaded
instanceof can no longer throw OOME. Can't be 100% sure. And there's
some duplication/verbosity in the commentary that could be trimmed down :)


While investigating a Reference pending list issue on the GC side of things I looked at the 
ReferenceHandler thread and noticed something which made me uneasy. The fix for JDK-8022321 added 
pre-loading of the Cleaner class to avoid OOME, but also moved the "instanceof Cleaner" 
inside the try/catch with a comment that it "sometimes" can throw an OOME. I understand 
this was done because we're not 100% sure if an OOME can still happen here, despite the pre-loading.

However, if it can throw an OOME that means it's allocating, which in turn means it can provoke a 
GC. If that happens, it looks to me like we have a bug here. The ReferenceHandler thread is not 
allowed to provoke a GC while it's holding on to the pending list lock, since the pending list 
might be updated during a GC and "pending = r.discovered" will then overwrite something 
other than "r", silently dropping any newly discovered References which will never be 
discovered by the GC again.

On the other hand, if an OOME can never happen (i.e. no GC) here then we're 
good and the comment is just incorrect. The instanceof check could be moved out of 
the try/catch block again, like it was prior to this change, just to make it 
obvious that we will not be able to cause new allocations inside the critical 
section. Or at a minimum, the comment saying OOME can still happen should be 
adjusted.

Thoughts?

thanks,
Per

Btw, to the best of my knowledge, the pre-loading of Cleaner should avoid any 
GC activity from instanceof, but I can't say that I am 100% sure either.


Per - I think you are raising the same issue as discussed in 
https://bugs.openjdk.java.net/browse/JDK-8055232.


Ah, thanks Kim for pointing that out.

cheers,
Per


Re: Analysis on JDK-8022321 java/lang/ref/OOMEInReferenceHandler.java fails intermittently

2016-03-22 Thread Per Liden

Hi Peter,

On 2016-03-21 16:13, Peter Levart wrote:

Hi Per, David,

As things stand, there is a very good chance that sun.misc.Cleaner will
go away in JDK9, so all this speculation about the source of OOME(s) can
be put to rest. But for JDK 8u, I agree that this should be sorted out.

My feeling is that (instanceof Cleaner) can not result in allocation and
therefore can not trigger OOME if the Cleaner class is already loaded at
that time. I think that we were chasing the wrong rabbit. As I have
found later, there is a much more probable cause for ReferenceHandler
thread dying with OOME after the fix to catch OOME from lock.wait(). It
is triggered by the invocation of Cleaner.clean() later down in the
code. I even created a reproducer for it. See my last two comments of
the following issue:

https://bugs.openjdk.java.net/browse/JDK-8066859

(but don't look at the proposed fix since it is not very good)


I think that for JDK 8u we could revert the code and do (instanceof
Cleaner) checks outside the synchronized block and in addition, find a
way to handle the OOME thrown from Cleaner.clean().

What do you think?


That sounds good to me. With the addition of the try/catch around 
Cleaner.clean() catching not just OOME, but all Throwables, right?
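
As a rough sketch of what such a guard could look like (this is not the JDK code): the class GuardedCleanup and the method runGuarded are hypothetical names, and plain Runnable actions stand in for Cleaner.clean(). The only idea shown is that a Throwable from one cleanup action is swallowed so the calling thread survives and continues with the next action.

```java
import java.util.Arrays;
import java.util.List;

public class GuardedCleanup {
    /**
     * Runs each cleanup action in turn. A failure in one action is counted
     * but does not escape, so the calling thread keeps running.
     */
    static int runGuarded(List<Runnable> actions) {
        int failures = 0;
        for (Runnable action : actions) {
            try {
                action.run();
            } catch (Throwable t) {
                // Deliberately allocate nothing here: under memory pressure,
                // logging or wrapping could itself throw OutOfMemoryError.
                failures++;
            }
        }
        return failures;
    }

    public static void main(String[] args) {
        int[] ran = {0};
        List<Runnable> actions = Arrays.<Runnable>asList(
            () -> ran[0]++,
            () -> { throw new OutOfMemoryError("simulated"); },
            () -> ran[0]++);
        int failures = runGuarded(actions);
        System.out.println(ran[0] + " ran, " + failures + " failed");   // prints "2 ran, 1 failed"
    }
}
```

Whether silently counting failures is the right policy is a separate question; the sketch only shows that the loop outlives a throwing action.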


cheers,
Per




Regards, Peter


On 03/21/2016 02:41 PM, Per Liden wrote:

Hi David,

On 2016-03-21 13:49, David Holmes wrote:

Hi Per,

On 21/03/2016 10:20 PM, Per Liden wrote:

Hi Peter & David,

(Resurrecting an old thread here...)

On 2014-01-22 03:19, David Holmes wrote:

Hi Peter,

On 22/01/2014 12:00 AM, Peter Levart wrote:

Hi, David, Kalyan,

Summing up the discussion, I propose the following patch for
ReferenceHandler:

http://cr.openjdk.java.net/~plevart/jdk9-dev/OOMEInReferenceHandler/webrev.01/






I can live with it, though it may be that once Cleaner has been preloaded
instanceof can no longer throw OOME. Can't be 100% sure. And there's
some duplication/verbosity in the commentary that could be trimmed
down :)


While investigating a Reference pending list issue on the GC side of
things I looked at the ReferenceHandler thread and noticed something
which made me uneasy. The fix for JDK-8022321 added pre-loading of the
Cleaner class to avoid OOME, but also moved the "instanceof Cleaner"
inside the try/catch with a comment that it "sometimes" can throw an
OOME. I understand this was done because we're not 100% sure if an OOME
can still happen here, despite the pre-loading.

However, if it can throw an OOME that means it's allocating, which in
turn means it can provoke a GC. If that happens, it looks to me like we
have a bug here. The ReferenceHandler thread is not allowed to provoke a
GC while it's holding on to the pending list lock, since the pending
list might be updated during a GC and "pending = r.discovered" will then
overwrite something other than "r", silently dropping any newly
discovered References which will never be discovered by the GC
again.


Then the code was completely broken because it was obviously capable of
allocating whilst holding the lock. There is nothing in the Java code to
indicate allocation should not happen and no way that Java code can
directly control that! We were only fixing the problem of the exception
killing the thread, not trying to address an undisclosed illegal
allocation problem!


JDK-8022321 did indeed fix a real issue. It might also have
unintentionally introduced a new one. Prior to JDK-8022321 we knew
that the ReferenceHandler couldn't provoke a GC while manipulating the
pending list, since the code was:

    synchronized (lock) {
        if (pending != null) {
            r = pending;
            pending = r.discovered;
            r.discovered = null;
        } else {

        }
    }

The manipulation of the pending list is built on some secret/ugly
rules and handshakes between the GC and the ReferenceHandler, which
only works because we control both.



How would a GC thread update pending if the ReferenceHandlerThread holds
the lock?


The pending list lock is grabbed by the Java thread issuing the VM
operation, on behalf of the GC to allow the GC to manipulate the
pending list. If the thread issuing the VM operation is the
ReferenceHandler, then the monitor is taken recursively, which is ok
as long as ReferenceHandler isn't in the middle of unlinking an element.




On the other hand, if an OOME can never happen (i.e. no GC) here then
we're good and the comment is just incorrect. The instanceof check could be
moved out of the try/catch block again, like it was prior to this
change, just to make it obvious that we will not be able to cause new
allocations inside the critical section. Or at a minimum, the comment
saying OOME can still happen should be adjusted.


I found it very difficult to determine with 100% certainty whether or
not the instanceof could ever trigger an allocation and hence
potentially an OOME.


I agree, it's not obvious.

cheers,
Per



With JVMCI it is now easier to imagine that compilation of 

Re: Analysis on JDK-8022321 java/lang/ref/OOMEInReferenceHandler.java fails intermittently

2016-03-21 Thread David Holmes

On 22/03/2016 3:32 AM, Kim Barrett wrote:

On Mar 21, 2016, at 8:20 AM, Per Liden  wrote:

Hi Peter & David,

(Resurrecting an old thread here...)

On 2014-01-22 03:19, David Holmes wrote:

Hi Peter,

On 22/01/2014 12:00 AM, Peter Levart wrote:

Hi, David, Kalyan,

Summing up the discussion, I propose the following patch for
ReferenceHandler:

http://cr.openjdk.java.net/~plevart/jdk9-dev/OOMEInReferenceHandler/webrev.01/



I can live with it, though it may be that once Cleaner has been preloaded
instanceof can no longer throw OOME. Can't be 100% sure. And there's
some duplication/verbosity in the commentary that could be trimmed down :)


While investigating a Reference pending list issue on the GC side of things I looked at the 
ReferenceHandler thread and noticed something which made me uneasy. The fix for JDK-8022321 added 
pre-loading of the Cleaner class to avoid OOME, but also moved the "instanceof Cleaner" 
inside the try/catch with a comment that it "sometimes" can throw an OOME. I understand 
this was done because we're not 100% sure if an OOME can still happen here, despite the pre-loading.

However, if it can throw an OOME that means it's allocating, which in turn means it can provoke a 
GC. If that happens, it looks to me like we have a bug here. The ReferenceHandler thread is not 
allowed to provoke a GC while it's holding on to the pending list lock, since the pending list 
might be updated during a GC and "pending = r.discovered" will then overwrite something 
other than "r", silently dropping any newly discovered References which will never be 
discovered by the GC again.

On the other hand, if an OOME can never happen (i.e. no GC) here then we're 
good and the comment is just incorrect. The instanceof check could be moved out of 
the try/catch block again, like it was prior to this change, just to make it 
obvious that we will not be able to cause new allocations inside the critical 
section. Or at a minimum, the comment saying OOME can still happen should be 
adjusted.

Thoughts?

thanks,
Per

Btw, to the best of my knowledge, the pre-loading of Cleaner should avoid any 
GC activity from instanceof, but I can't say that I am 100% sure either.


Per - I think you are raising the same issue as discussed in 
https://bugs.openjdk.java.net/browse/JDK-8055232.


That bug somehow escaped my notice as well. :(

Thanks,
David
-






Re: Analysis on JDK-8022321 java/lang/ref/OOMEInReferenceHandler.java fails intermittently

2016-03-21 Thread David Holmes



On 21/03/2016 11:41 PM, Per Liden wrote:

Hi David,

On 2016-03-21 13:49, David Holmes wrote:

Hi Per,

On 21/03/2016 10:20 PM, Per Liden wrote:

Hi Peter & David,

(Resurrecting an old thread here...)

On 2014-01-22 03:19, David Holmes wrote:

Hi Peter,

On 22/01/2014 12:00 AM, Peter Levart wrote:

Hi, David, Kalyan,

Summing up the discussion, I propose the following patch for
ReferenceHandler:

http://cr.openjdk.java.net/~plevart/jdk9-dev/OOMEInReferenceHandler/webrev.01/






I can live with it, though it may be that once Cleaner has been preloaded
instanceof can no longer throw OOME. Can't be 100% sure. And there's
some duplication/verbosity in the commentary that could be trimmed
down :)


While investigating a Reference pending list issue on the GC side of
things I looked at the ReferenceHandler thread and noticed something
which made me uneasy. The fix for JDK-8022321 added pre-loading of the
Cleaner class to avoid OOME, but also moved the "instanceof Cleaner"
inside the try/catch with a comment that it "sometimes" can throw an
OOME. I understand this was done because we're not 100% sure if an OOME
can still happen here, despite the pre-loading.

However, if it can throw an OOME that means it's allocating, which in
turn means it can provoke a GC. If that happens, it looks to me like we
have a bug here. The ReferenceHandler thread is not allowed to provoke a
GC while it's holding on to the pending list lock, since the pending
list might be updated during a GC and "pending = r.discovered" will then
overwrite something other than "r", silently dropping any newly
discovered References which will never be discovered by the GC
again.


Then the code was completely broken because it was obviously capable of
allocating whilst holding the lock. There is nothing in the Java code to
indicate allocation should not happen and no way that Java code can
directly control that! We were only fixing the problem of the exception
killing the thread, not trying to address an undisclosed illegal
allocation problem!


JDK-8022321 did indeed fix a real issue. It might also have
unintentionally introduced a new one. Prior to JDK-8022321 we knew that
the ReferenceHandler couldn't provoke a GC while manipulating the
pending list, since the code was:

    synchronized (lock) {
        if (pending != null) {
            r = pending;
            pending = r.discovered;
            r.discovered = null;
        } else {

        }
    }


Except that it actually could if the wait() in the else part was 
interrupted. But yes the move of instanceof did add another potential 
allocation point (as follow up bugs showed) but the pre-loading does 
seem to have addressed that (though perhaps not with 100% certainty).



The manipulation of the pending list is built on some secret/ugly rules
and handshakes between the GC and the ReferenceHandler, which only works
because we control both.


Unfortunately implicit allocation was not given enough consideration. 
Which really makes me concerned about the possibility of this code being 
JIT-compiled by a Java compiler under JVMCI!




How would a GC thread update pending if the ReferenceHandlerThread holds
the lock?


The pending list lock is grabbed by the Java thread issuing the VM
operation, on behalf of the GC to allow the GC to manipulate the
pending list. If the thread issuing the VM operation is the
ReferenceHandler, then the monitor is taken recursively, which is ok as
long as ReferenceHandler isn't in the middle of unlinking an element.


Ah I see.

Thanks,
David
-




On the other hand, if an OOME can never happen (i.e. no GC) here then
we're good and the comment is just incorrect. The instanceof check could be
moved out of the try/catch block again, like it was prior to this
change, just to make it obvious that we will not be able to cause new
allocations inside the critical section. Or at a minimum, the comment
saying OOME can still happen should be adjusted.


I found it very difficult to determine with 100% certainty whether or
not the instanceof could ever trigger an allocation and hence
potentially an OOME.


I agree, it's not obvious.

cheers,
Per



With JVMCI it is now easier to imagine that compilation of this code by
a JVMCI compiler might lead to allocation while the lock is held!

Cheers,
David


Thoughts?

thanks,
Per

Btw, to the best of my knowledge, the pre-loading of Cleaner should
avoid any GC activity from instanceof, but I can't say that I am 100%
sure either.



Any specific reason to use Unsafe to do the preload rather than
Class.forName ? Does this force Unsafe to be loaded earlier than it
otherwise would?

Thanks,
David



all 10 java/lang/ref tests pass on my PC (including
OOMEInReferenceHandler).

I kindly ask Kalyan to try to re-run the OOMEInReferenceHandler test
with this code and report any failure.


Thanks, Peter

On 01/21/2014 08:57 AM, David Holmes wrote:

On 21/01/2014 4:54 PM, Peter Levart wrote:


On 01/21/2014 03:22 AM, David Holmes wrote:

Hi 

Re: Analysis on JDK-8022321 java/lang/ref/OOMEInReferenceHandler.java fails intermittently

2016-03-21 Thread Kim Barrett
> On Mar 21, 2016, at 8:20 AM, Per Liden  wrote:
> 
> Hi Peter & David,
> 
> (Resurrecting an old thread here...)
> 
> On 2014-01-22 03:19, David Holmes wrote:
>> Hi Peter,
>> 
>> On 22/01/2014 12:00 AM, Peter Levart wrote:
>>> Hi, David, Kalyan,
>>> 
>>> Summing up the discussion, I propose the following patch for
>>> ReferenceHandler:
>>> 
>>> http://cr.openjdk.java.net/~plevart/jdk9-dev/OOMEInReferenceHandler/webrev.01/
>>> 
>> 
>> I can live with it, though it may be that once Cleaner has been preloaded
>> instanceof can no longer throw OOME. Can't be 100% sure. And there's
>> some duplication/verbosity in the commentary that could be trimmed down :)
> 
> While investigating a Reference pending list issue on the GC side of things I 
> looked at the ReferenceHandler thread and noticed something which made me 
> uneasy. The fix for JDK-8022321 added pre-loading of the Cleaner class to 
> avoid OOME, but also moved the "instanceof Cleaner" inside the try/catch with 
> a comment that it "sometimes" can throw an OOME. I understand this was done 
> because we're not 100% sure if an OOME can still happen here, despite the 
> pre-loading.
> 
> However, if it can throw an OOME that means it's allocating, which in turn 
> means it can provoke a GC. If that happens, it looks to me like we have a bug 
> here. The ReferenceHandler thread is not allowed to provoke a GC while it's 
> holding on to the pending list lock, since the pending list might be updated 
> during a GC and "pending = r.discovered" will then overwrite something other 
> than "r", silently dropping any newly discovered References which will never 
> be discovered by the GC again.
> 
> On the other hand, if an OOME can never happen (i.e. no GC) here then we're 
> good and the comment is just incorrect. The instanceof check could be moved out 
> of the try/catch block again, like it was prior to this change, just to make 
> it obvious that we will not be able to cause new allocations inside the 
> critical section. Or at a minimum, the comment saying OOME can still happen 
> should be adjusted.
> 
> Thoughts?
> 
> thanks,
> Per
> 
> Btw, to the best of my knowledge, the pre-loading of Cleaner should avoid any 
> GC activity from instanceof, but I can't say that I am 100% sure either.

Per - I think you are raising the same issue as discussed in 
https://bugs.openjdk.java.net/browse/JDK-8055232.




Re: Analysis on JDK-8022321 java/lang/ref/OOMEInReferenceHandler.java fails intermittently

2016-03-21 Thread Peter Levart



On 03/21/2016 04:13 PM, Peter Levart wrote:

Hi Per, David,

As things stand, there is a very good chance that sun.misc.Cleaner 
will go away in JDK9, so all this speculation about the source of 
OOME(s) can be put to rest. But for JDK 8u, I agree that this should 
be sorted out.


My feeling is that (instanceof Cleaner) can not result in allocation 
and therefore can not trigger OOME if the Cleaner class is already 
loaded at that time. I think that we were chasing the wrong rabbit. As 
I have found later, there is a much more probable cause for 
ReferenceHandler thread dying with OOME after the fix to catch OOME 
from lock.wait(). It is triggered by the invocation of Cleaner.clean() 
later down in the code. I even created a reproducer for it. See my 
last two comments of the following issue:


https://bugs.openjdk.java.net/browse/JDK-8066859

(but don't look at the proposed fix since it is not very good)


I think that for JDK 8u we could revert the code and do (instanceof 
Cleaner) checks outside the synchronized block and in addition, find a 
way to handle the OOME thrown from Cleaner.clean().


What do you think?


Regards, Peter


OTOH, if you are not 100% sure that instanceof never allocates, then a 
simple fix would be to re-check whether the 'pending' field still points 
to the same object as before the instanceof check:



    synchronized (lock) {
        while ((r = pending) != null) {
            // 'instanceof' might throw OutOfMemoryError sometimes
            // so do this before un-linking 'r' from the 'pending' chain...
            c = r instanceof Cleaner ? (Cleaner) r : null;
            // unlink 'r' from 'pending' chain if it is still the same as before
            // 'instanceof' check which might have triggered GC and GC might
            // have discovered some more references and hooked them on
            // the pending list...
            if (pending == r) {
                pending = r.discovered;
                r.discovered = null;
                break;
            }
        }
        if (r == null) {
            // The waiting on the lock may cause an OutOfMemoryError
            // because it may try to allocate exception objects.
            if (waitForNotify) {
                lock.wait();
            }
            // retry if waited
            return waitForNotify;
        }
    }
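
To make the difference concrete, here is a small single-threaded simulation of the re-check, offered as a sketch rather than as the actual Reference.java code. Everything in it is a hypothetical stand-in: RecheckDemo, Ref, simulatedGc, oneShotGc, unlinkNaive and unlinkWithRecheck. The "GC" is just a callback that runs at the point where an allocating step could sit and prepends one freshly discovered reference to the list.

```java
public class RecheckDemo {
    /** Stand-in for java.lang.ref.Reference; only the link field matters here. */
    static final class Ref {
        final String name;
        Ref discovered;
        Ref(String name) { this.name = name; }
    }

    static Ref pending;   // head of the simulated pending list

    /** Simulates a GC that discovers one more reference and prepends it. */
    static void simulatedGc(Ref fresh) {
        fresh.discovered = pending;
        pending = fresh;
    }

    /** Returns a callback that triggers the simulated GC on its first call only. */
    static Runnable oneShotGc(String name) {
        boolean[] fired = {false};
        return () -> {
            if (!fired[0]) {
                fired[0] = true;
                simulatedGc(new Ref(name));
            }
        };
    }

    /** Unlinks without re-checking: 'pending' is overwritten from a stale 'r'. */
    static Ref unlinkNaive(Runnable allocatingStep) {
        Ref r = pending;
        allocatingStep.run();            // the list may change here
        pending = r.discovered;          // drops anything prepended meanwhile
        r.discovered = null;
        return r;
    }

    /** Re-reads 'pending' and unlinks only if the head is still 'r'. */
    static Ref unlinkWithRecheck(Runnable allocatingStep) {
        Ref r;
        while ((r = pending) != null) {
            allocatingStep.run();
            if (pending == r) {
                pending = r.discovered;
                r.discovered = null;
                break;
            }
        }
        return r;
    }

    /** Number of elements currently on the simulated pending list. */
    static int remaining() {
        int n = 0;
        for (Ref p = pending; p != null; p = p.discovered) {
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        pending = new Ref("a");
        Ref r1 = unlinkNaive(oneShotGc("b"));
        System.out.println("naive:   took " + r1.name + ", remaining " + remaining());

        pending = new Ref("a");
        Ref r2 = unlinkWithRecheck(oneShotGc("b"));
        System.out.println("recheck: took " + r2.name + ", remaining " + remaining());
    }
}
```

In the naive variant the reference "b" is lost (nothing remains on the list after taking "a"); with the re-check the loop notices the changed head, takes "b" instead, and "a" stays on the list for the next round.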


Regards, Peter




On 03/21/2016 02:41 PM, Per Liden wrote:

Hi David,

On 2016-03-21 13:49, David Holmes wrote:

Hi Per,

On 21/03/2016 10:20 PM, Per Liden wrote:

Hi Peter & David,

(Resurrecting an old thread here...)

On 2014-01-22 03:19, David Holmes wrote:

Hi Peter,

On 22/01/2014 12:00 AM, Peter Levart wrote:

Hi, David, Kalyan,

Summing up the discussion, I propose the following patch for
ReferenceHandler:

http://cr.openjdk.java.net/~plevart/jdk9-dev/OOMEInReferenceHandler/webrev.01/ 







I can live with it, though it may be that once Cleaner has been preloaded
instanceof can no longer throw OOME. Can't be 100% sure. And there's
some duplication/verbosity in the commentary that could be trimmed
down :)


While investigating a Reference pending list issue on the GC side of
things I looked at the ReferenceHandler thread and noticed something
which made me uneasy. The fix for JDK-8022321 added pre-loading of the
Cleaner class to avoid OOME, but also moved the "instanceof Cleaner"
inside the try/catch with a comment that it "sometimes" can throw an
OOME. I understand this was done because we're not 100% sure if an OOME
can still happen here, despite the pre-loading.

However, if it can throw an OOME that means it's allocating, which in
turn means it can provoke a GC. If that happens, it looks to me like we
have a bug here. The ReferenceHandler thread is not allowed to provoke a
GC while it's holding on to the pending list lock, since the pending
list might be updated during a GC and "pending = r.discovered" will then
overwrite something other than "r", silently dropping any newly
discovered References which will never be discovered by the GC again.


Then the code was completely broken because it was obviously capable of
allocating whilst holding the lock. There is nothing in the Java code to
indicate allocation should not happen and no way that Java code can
directly control that! We were only fixing the problem of the exception
killing the thread, not trying to address an undisclosed illegal
allocation problem!


JDK-8022321 did indeed fix a real issue. It might also have 
unintentionally introduced a new one. Prior to JDK-8022321 we knew 
that the ReferenceHandler couldn't provoke a GC while manipulating 
the pending list, since the code was:


    synchronized (lock) {
        if (pending != null) {
            r = pending;
            pending = r.discovered;
            r.discovered 

Re: Analysis on JDK-8022321 java/lang/ref/OOMEInReferenceHandler.java fails intermittently

2016-03-21 Thread Peter Levart

Hi Per,

May I point you to my proposed change in Reference(Handler) for JDK 9, 
being discussed in the thread about JDK-8149925. It will hopefully 
remove the special-casing of sun.misc.Cleaner, change the way how 
pending references are being enqueued by ReferenceHandler thread and how 
other thread(s) can synchronize with it. Since you seem to have a great 
knowledge of VM part of things, I would very much like to hear what you 
think of that change. Here's the latest webrev:


http://cr.openjdk.java.net/~plevart/jdk9-dev/removeInternalCleaner/webrev.08.part2/

(see Reference.java and Bits.java for an example of how this 
synchronization with ReferenceHandler thread is to be used)


Regards, Peter

On 03/21/2016 04:13 PM, Peter Levart wrote:

Hi Per, David,

As things stand, there is a very good chance that sun.misc.Cleaner 
will go away in JDK9, so all this speculation about the source of 
OOME(s) can be put to rest. But for JDK 8u, I agree that this should 
be sorted out.


My feeling is that (instanceof Cleaner) can not result in allocation 
and therefore can not trigger OOME if the Cleaner class is already 
loaded at that time. I think that we were chasing the wrong rabbit. As 
I have found later, there is a much more probable cause for 
ReferenceHandler thread dying with OOME after the fix to catch OOME 
from lock.wait(). It is triggered by the invocation of Cleaner.clean() 
later down in the code. I even created a reproducer for it. See my 
last two comments of the following issue:


https://bugs.openjdk.java.net/browse/JDK-8066859

(but don't look at the proposed fix since it is not very good)


I think that for JDK 8u we could revert the code and do (instanceof 
Cleaner) checks outside the synchronized block and in addition, find a 
way to handle the OOME thrown from Cleaner.clean().


What do you think?


Regards, Peter


On 03/21/2016 02:41 PM, Per Liden wrote:

Hi David,

On 2016-03-21 13:49, David Holmes wrote:

Hi Per,

On 21/03/2016 10:20 PM, Per Liden wrote:

Hi Peter & David,

(Resurrecting an old thread here...)

On 2014-01-22 03:19, David Holmes wrote:

Hi Peter,

On 22/01/2014 12:00 AM, Peter Levart wrote:

Hi, David, Kalyan,

Summing up the discussion, I propose the following patch for
ReferenceHandler:

http://cr.openjdk.java.net/~plevart/jdk9-dev/OOMEInReferenceHandler/webrev.01/ 







I can live with it, though it may be that once Cleaner has been preloaded
instanceof can no longer throw OOME. Can't be 100% sure. And there's
some duplication/verbosity in the commentary that could be trimmed
down :)


While investigating a Reference pending list issue on the GC side of
things I looked at the ReferenceHandler thread and noticed something
which made me uneasy. The fix for JDK-8022321 added pre-loading of the
Cleaner class to avoid OOME, but also moved the "instanceof Cleaner"
inside the try/catch with a comment that it "sometimes" can throw an
OOME. I understand this was done because we're not 100% sure if an OOME
can still happen here, despite the pre-loading.

However, if it can throw an OOME that means it's allocating, which in
turn means it can provoke a GC. If that happens, it looks to me like we
have a bug here. The ReferenceHandler thread is not allowed to provoke a
GC while it's holding on to the pending list lock, since the pending
list might be updated during a GC and "pending = r.discovered" will then
overwrite something other than "r", silently dropping any newly
discovered References which will never be discovered by the GC again.


Then the code was completely broken because it was obviously capable of
allocating whilst holding the lock. There is nothing in the Java code to
indicate allocation should not happen and no way that Java code can
directly control that! We were only fixing the problem of the exception
killing the thread, not trying to address an undisclosed illegal
allocation problem!


JDK-8022321 did indeed fix a real issue. It might also have 
unintentionally introduced a new one. Prior to JDK-8022321 we knew 
that the ReferenceHandler couldn't provoke a GC while manipulating 
the pending list, since the code was:


synchronized (lock) {
    if (pending != null) {
        r = pending;
        pending = r.discovered;
        r.discovered = null;
    } else {

    }
}

The manipulation of the pending list is built on some secret/ugly 
rules and handshakes between the GC and the ReferenceHandler, which 
only works because we control both.




How would a GC thread update pending if the ReferenceHandlerThread holds
the lock?


The pending list lock is grabbed by the Java thread issuing the VM 
operation, on behalf of the GC, to allow the GC to manipulate the 
pending list. If the thread issuing the VM operation is the 
ReferenceHandler, then the monitor is taken recursively, which is ok 
as long as ReferenceHandler isn't in the middle of unlinking an element.





On the other hand, if an OOME can never happen (i.e. no GC) here 

Re: Analysis on JDK-8022321 java/lang/ref/OOMEInReferenceHandler.java fails intermittently

2016-03-21 Thread Peter Levart

Hi Per, David,

As things stand, there is a very good chance that sun.misc.Cleaner will 
go away in JDK9, so all this speculation about the source of OOME(s) can 
be put to rest. But for JDK 8u, I agree that this should be sorted out.


My feeling is that (instanceof Cleaner) can not result in allocation and 
therefore can not trigger OOME if the Cleaner class is already loaded at 
that time. I think that we were chasing the wrong rabbit. As I 
found out later, there is a much more probable cause for the ReferenceHandler 
thread dying with OOME after the fix to catch OOME from lock.wait(). It 
is triggered by the invocation of Cleaner.clean() later down in the 
code. I even created a reproducer for it. See my last two comments of 
the following issue:


https://bugs.openjdk.java.net/browse/JDK-8066859

(but don't look at the proposed fix since it is not very good)


I think that for JDK 8u we could revert the code and do (instanceof 
Cleaner) checks outside the synchronized block and in addition, find a 
way to handle the OOME thrown from Cleaner.clean().
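
A minimal sketch of the structure proposed above, not the actual JDK code. The classes Ref and CleanerRef are hypothetical stand-ins for Reference and sun.misc.Cleaner, and the reaction to an OOME from clean() (yield and carry on) is only an assumption, since how to handle it is left open here. The real handler also waits on the lock when the list is empty, which this sketch omits.

```java
import java.util.ArrayList;
import java.util.List;

public class HandlerSketch {
    // Hypothetical stand-in for java.lang.ref.Reference; not the JDK class.
    static class Ref {
        Ref discovered; // next element of the pending list
    }

    // Hypothetical stand-in for sun.misc.Cleaner.
    static class CleanerRef extends Ref {
        final Runnable thunk;
        CleanerRef(Runnable thunk) { this.thunk = thunk; }
        void clean() { thunk.run(); }
    }

    static final Object lock = new Object();
    static Ref pending; // head of the pending list
    static final List<Ref> enqueued = new ArrayList<>();

    // Handles one pending element; returns false if the list was empty.
    static boolean handleOne() {
        Ref r;
        synchronized (lock) {
            // Only field reads and writes in here.
            if (pending == null) {
                return false;
            }
            r = pending;
            pending = r.discovered;
            r.discovered = null;
        }
        // The type check runs after the lock has been released.
        if (r instanceof CleanerRef) {
            try {
                ((CleanerRef) r).clean();
            } catch (OutOfMemoryError oome) {
                // Assumption: keep the handler thread alive and let other threads run.
                Thread.yield();
            }
            return true;
        }
        enqueued.add(r);
        return true;
    }

    public static void main(String[] args) {
        Ref plain = new Ref();
        CleanerRef cleaner = new CleanerRef(() -> { throw new OutOfMemoryError("simulated"); });
        cleaner.discovered = plain;
        synchronized (lock) { pending = cleaner; }
        while (handleOne()) { }
        System.out.println("enqueued: " + enqueued.size());
    }
}
```

Note that in this sketch a cleaner whose clean() fails with OOME is simply dropped; whether it should be retried instead is a separate design question.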


What do you think?


Regards, Peter


On 03/21/2016 02:41 PM, Per Liden wrote:

Hi David,

On 2016-03-21 13:49, David Holmes wrote:

Hi Per,

On 21/03/2016 10:20 PM, Per Liden wrote:

Hi Peter & David,

(Resurrecting an old thread here...)

On 2014-01-22 03:19, David Holmes wrote:

Hi Peter,

On 22/01/2014 12:00 AM, Peter Levart wrote:

Hi, David, Kalyan,

Summing up the discussion, I propose the following patch for
ReferenceHandler:

http://cr.openjdk.java.net/~plevart/jdk9-dev/OOMEInReferenceHandler/webrev.01/ 







I can live with it, though it may be that once Cleaner has been preloaded
instanceof can no longer throw OOME. Can't be 100% sure. And there's
some duplication/verbosity in the commentary that could be trimmed
down :)


While investigating a Reference pending list issue on the GC side of
things I looked at the ReferenceHandler thread and noticed something
which made me uneasy. The fix for JDK-8022321 added pre-loading of the
Cleaner class to avoid OOME, but also moved the "instanceof Cleaner"
inside the try/catch with a comment that it "sometimes" can throw an
OOME. I understand this was done because we're not 100% sure if an OOME
can still happen here, despite the pre-loading.

However, if it can throw an OOME that means it's allocating, which in
turn means it can provoke a GC. If that happens, it looks to me like we
have a bug here. The ReferenceHandler thread is not allowed to provoke a
GC while it's holding on to the pending list lock, since the pending
list might be updated during a GC and "pending = r.discovered" will then
overwrite something other than "r", silently dropping any newly
discovered References which will never be discovered by the GC again.


Then the code was completely broken because it was obviously capable of
allocating whilst holding the lock. There is nothing in the Java code to
indicate allocation should not happen and no way that Java code can
directly control that! We were only fixing the problem of the exception
killing the thread, not trying to address an undisclosed illegal
allocation problem!


JDK-8022321 did indeed fix a real issue. It might also have 
unintentionally introduced a new one. Prior to JDK-8022321 we knew 
that the ReferenceHandler couldn't provoke a GC while manipulating the 
pending list, since the code was:


synchronized (lock) {
    if (pending != null) {
        r = pending;
        pending = r.discovered;
        r.discovered = null;
    } else {

    }
}

The manipulation of the pending list is built on some secret/ugly 
rules and handshakes between the GC and the ReferenceHandler, which 
only works because we control both.




How would a GC thread update pending if the ReferenceHandlerThread holds
the lock?


The pending list lock is grabbed by the Java thread issuing the VM 
operation, on behalf of the GC, to allow the GC to manipulate the 
pending list. If the thread issuing the VM operation is the 
ReferenceHandler, then the monitor is taken recursively, which is ok 
as long as ReferenceHandler isn't in the middle of unlinking an element.





On the other hand, if an OOME can never happen (i.e. no GC) here then
we're good and the comment is just incorrect. The instanceof check could be
moved out of the try/catch block again, like it was prior to this
change, just to make it obvious that we will not be able to cause new
allocations inside the critical section. Or at a minimum, the comment
saying OOME can still happen should be adjusted.


I found it very difficult to determine with 100% certainty whether or
not the instanceof could ever trigger an allocation and hence
potentially an OOME.


I agree, it's not obvious.

cheers,
Per



With JVMCI it is now easier to imagine that compilation of this code by
a JVMCI compiler might lead to allocation while the lock is held!

Cheers,
David


Thoughts?

thanks,
Per

Btw, to the best of my knowledge, the pre-loading of 

Re: Analysis on JDK-8022321 java/lang/ref/OOMEInReferenceHandler.java fails intermittently

2016-03-21 Thread Per Liden

Hi David,

On 2016-03-21 13:49, David Holmes wrote:

Hi Per,

On 21/03/2016 10:20 PM, Per Liden wrote:

Hi Peter & David,

(Resurrecting an old thread here...)

On 2014-01-22 03:19, David Holmes wrote:

Hi Peter,

On 22/01/2014 12:00 AM, Peter Levart wrote:

Hi, David, Kalyan,

Summing up the discussion, I propose the following patch for
ReferenceHandler:

http://cr.openjdk.java.net/~plevart/jdk9-dev/OOMEInReferenceHandler/webrev.01/





I can live with it, though it may be that once Cleaner has been preloaded
instanceof can no longer throw OOME. Can't be 100% sure. And there's
some duplication/verbosity in the commentary that could be trimmed
down :)


While investigating a Reference pending list issue on the GC side of
things I looked at the ReferenceHandler thread and noticed something
which made me uneasy. The fix for JDK-8022321 added pre-loading of the
Cleaner class to avoid OOME, but also moved the "instanceof Cleaner"
inside the try/catch with a comment that it "sometimes" can throw an
OOME. I understand this was done because we're not 100% sure if an OOME
can still happen here, despite the pre-loading.

However, if it can throw an OOME that means it's allocating, which in
turn means it can provoke a GC. If that happens, it looks to me like we
have a bug here. The ReferenceHandler thread is not allowed to provoke a
GC while it's holding on to the pending list lock, since the pending
list might be updated during a GC and "pending = r.discovered" will then
overwrite something other than "r", silently dropping any newly
discovered References which will never be discovered by the GC again.


Then the code was completely broken because it was obviously capable of
allocating whilst holding the lock. There is nothing in the Java code to
indicate allocation should not happen and no way that Java code can
directly control that! We were only fixing the problem of the exception
killing the thread, not trying to address an undisclosed illegal
allocation problem!


JDK-8022321 did indeed fix a real issue. It might also have 
unintentionally introduced a new one. Prior to JDK-8022321 we knew that 
the ReferenceHandler couldn't provoke a GC while manipulating the 
pending list, since the code was:


synchronized (lock) {
    if (pending != null) {
        r = pending;
        pending = r.discovered;
        r.discovered = null;
    } else {

    }
}

The manipulation of the pending list is built on some secret/ugly rules 
and handshakes between the GC and the ReferenceHandler, which only works 
because we control both.




How would a GC thread update pending if the ReferenceHandlerThread holds
the lock?


The pending list lock is grabbed by the Java thread issuing the VM 
operation, on behalf of the GC, to allow the GC to manipulate the 
pending list. If the thread issuing the VM operation is the 
ReferenceHandler, then the monitor is taken recursively, which is ok as 
long as ReferenceHandler isn't in the middle of unlinking an element.
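
The hazard described above can be illustrated with a small single-threaded simulation. This is only a sketch under stated assumptions: Ref is a hypothetical stand-in for Reference, gcPrepend models the GC linking newly discovered references in front of the current head, and the midUnlink callback models an allocation-triggered GC running between reading the head and writing it back. None of these names exist in the JDK.

```java
public class PendingListRace {
    // Hypothetical stand-in for Reference with its 'discovered' link.
    static class Ref {
        final String name;
        Ref discovered;
        Ref(String name) { this.name = name; }
    }

    static Ref pending; // head of the pending list

    // Models the GC prepending newly discovered references to the pending list.
    static void gcPrepend(Ref... found) {
        for (int i = found.length - 1; i >= 0; i--) {
            found[i].discovered = pending;
            pending = found[i];
        }
    }

    // Counts the elements reachable from the head.
    static int length() {
        int n = 0;
        for (Ref r = pending; r != null; r = r.discovered) {
            n++;
        }
        return n;
    }

    // Unlinks the head; 'midUnlink' runs between reading the head and updating it.
    static Ref unlinkHead(Runnable midUnlink) {
        Ref r = pending;
        midUnlink.run();          // a GC here may change 'pending'
        pending = r.discovered;   // overwrites whatever the GC installed
        r.discovered = null;
        return r;
    }

    public static void main(String[] args) {
        pending = null;
        gcPrepend(new Ref("a"), new Ref("b"));      // list: a -> b
        unlinkHead(() -> gcPrepend(new Ref("c")));  // GC adds c while a is being unlinked
        // Only b remains reachable from 'pending'; c has been silently dropped.
        System.out.println("remaining: " + length());
    }
}
```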





On the other hand, if an OOME can never happen (i.e. no GC) here then
we're good and the comment is just incorrect. The instanceof check could be
moved out of the try/catch block again, like it was prior to this
change, just to make it obvious that we will not be able to cause new
allocations inside the critical section. Or at a minimum, the comment
saying OOME can still happen should be adjusted.


I found it very difficult to determine with 100% certainty whether or
not the instanceof could ever trigger an allocation and hence
potentially an OOME.


I agree, it's not obvious.

cheers,
Per



With JVMCI it is now easier to imagine that compilation of this code by
a JVMCI compiler might lead to allocation while the lock is held!

Cheers,
David


Thoughts?

thanks,
Per

Btw, to the best of my knowledge, the pre-loading of Cleaner should
avoid any GC activity from instanceof, but I can't say that I am 100%
sure either.



Any specific reason to use Unsafe to do the preload rather than
Class.forName ? Does this force Unsafe to be loaded earlier than it
otherwise would?

Thanks,
David



all 10 java/lang/ref tests pass on my PC (including
OOMEInReferenceHandler).

I kindly ask Kalyan to try to re-run the OOMEInReferenceHandler test
with this code and report any failure.


Thanks, Peter

On 01/21/2014 08:57 AM, David Holmes wrote:

On 21/01/2014 4:54 PM, Peter Levart wrote:


On 01/21/2014 03:22 AM, David Holmes wrote:

Hi Peter,

I do not see Cleaner being loaded prior to the main class on either
Windows or Linux. Which platform are you on? Did you see it loaded
before the main class or as part of executing it?


Before. The main class is empty:

public class Test { public static void main(String... a) {} }

Here's last few lines of -verbose:class:

[Loaded java.util.TimeZone from
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]
[Loaded sun.util.calendar.ZoneInfo from
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]
[Loaded sun.util.calendar.ZoneInfoFile 

Re: Analysis on JDK-8022321 java/lang/ref/OOMEInReferenceHandler.java fails intermittently

2016-03-21 Thread David Holmes

Hi Per,

On 21/03/2016 10:20 PM, Per Liden wrote:

Hi Peter & David,

(Resurrecting an old thread here...)

On 2014-01-22 03:19, David Holmes wrote:

Hi Peter,

On 22/01/2014 12:00 AM, Peter Levart wrote:

Hi, David, Kalyan,

Summing up the discussion, I propose the following patch for
ReferenceHandler:

http://cr.openjdk.java.net/~plevart/jdk9-dev/OOMEInReferenceHandler/webrev.01/




I can live with it, though it may be that once Cleaner has been preloaded
instanceof can no longer throw OOME. Can't be 100% sure. And there's
some duplication/verbosity in the commentary that could be trimmed
down :)


While investigating a Reference pending list issue on the GC side of
things I looked at the ReferenceHandler thread and noticed something
which made me uneasy. The fix for JDK-8022321 added pre-loading of the
Cleaner class to avoid OOME, but also moved the "instanceof Cleaner"
inside the try/catch with a comment that it "sometimes" can throw an
OOME. I understand this was done because we're not 100% sure if an OOME
can still happen here, despite the pre-loading.

However, if it can throw an OOME that means it's allocating, which in
turn means it can provoke a GC. If that happens, it looks to me like we
have a bug here. The ReferenceHandler thread is not allowed to provoke a
GC while it's holding on to the pending list lock, since the pending
list might be updated during a GC and "pending = r.discovered" will then
overwrite something other than "r", silently dropping any newly
discovered References which will never be discovered by the GC again.


Then the code was completely broken because it was obviously capable of 
allocating whilst holding the lock. There is nothing in the Java code to 
indicate allocation should not happen and no way that Java code can 
directly control that! We were only fixing the problem of the exception 
killing the thread, not trying to address an undisclosed illegal 
allocation problem!


How would a GC thread update pending if the ReferenceHandlerThread holds 
the lock?



On the other hand, if an OOME can never happen (i.e. no GC) here then
we're good and the comment is just incorrect. The instanceof check could be
moved out of the try/catch block again, like it was prior to this
change, just to make it obvious that we will not be able to cause new
allocations inside the critical section. Or at a minimum, the comment
saying OOME can still happen should be adjusted.


I found it very difficult to determine with 100% certainty whether or 
not the instanceof could ever trigger an allocation and hence 
potentially an OOME.


With JVMCI it is now easier to imagine that compilation of this code by 
a JVMCI compiler might lead to allocation while the lock is held!


Cheers,
David


Thoughts?

thanks,
Per

Btw, to the best of my knowledge, the pre-loading of Cleaner should
avoid any GC activity from instanceof, but I can't say that I am 100%
sure either.



Any specific reason to use Unsafe to do the preload rather than
Class.forName ? Does this force Unsafe to be loaded earlier than it
otherwise would?

Thanks,
David



all 10 java/lang/ref tests pass on my PC (including
OOMEInReferenceHandler).

I kindly ask Kalyan to try to re-run the OOMEInReferenceHandler test
with this code and report any failure.


Thanks, Peter

On 01/21/2014 08:57 AM, David Holmes wrote:

On 21/01/2014 4:54 PM, Peter Levart wrote:


On 01/21/2014 03:22 AM, David Holmes wrote:

Hi Peter,

I do not see Cleaner being loaded prior to the main class on either
Windows or Linux. Which platform are you on? Did you see it loaded
before the main class or as part of executing it?


Before. The main class is empty:

public class Test { public static void main(String... a) {} }

Here's last few lines of -verbose:class:

[Loaded java.util.TimeZone from
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]
[Loaded sun.util.calendar.ZoneInfo from
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]
[Loaded sun.util.calendar.ZoneInfoFile from
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]
[Loaded sun.util.calendar.ZoneInfoFile$1 from
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]
[Loaded java.io.DataInput from
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]
[Loaded java.io.DataInputStream from
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]
*[Loaded sun.misc.Cleaner from
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]*


Curious. I wonder what the controlling factor is ??



I'm on linux, 64bit and using official EA build 121 of JDK 8...

But if I try with JDK 7u45, I don't see it. So perhaps it would be good
to trigger Cleaner loading and initialization as part of
ReferenceHandler initialization to play things safe.



If we do that for Cleaner we may as well do it for
InterruptedException too.


Also, it is not that I think ReferenceHandler is responsible for
reporting OOME, but that it is responsible for reporting that it was
unable to perform a clean or enqueue because of OOME.


This would be necessary if 

Re: Analysis on JDK-8022321 java/lang/ref/OOMEInReferenceHandler.java fails intermittently

2016-03-21 Thread Per Liden

Hi Peter & David,

(Resurrecting an old thread here...)

On 2014-01-22 03:19, David Holmes wrote:

Hi Peter,

On 22/01/2014 12:00 AM, Peter Levart wrote:

Hi, David, Kalyan,

Summing up the discussion, I propose the following patch for
ReferenceHandler:

http://cr.openjdk.java.net/~plevart/jdk9-dev/OOMEInReferenceHandler/webrev.01/



I can live with it, though it may be that once Cleaner has been preloaded
instanceof can no longer throw OOME. Can't be 100% sure. And there's
some duplication/verbosity in the commentary that could be trimmed down :)


While investigating a Reference pending list issue on the GC side of 
things I looked at the ReferenceHandler thread and noticed something 
which made me uneasy. The fix for JDK-8022321 added pre-loading of the 
Cleaner class to avoid OOME, but also moved the "instanceof Cleaner" 
inside the try/catch with a comment that it "sometimes" can throw an 
OOME. I understand this was done because we're not 100% sure if an OOME 
can still happen here, despite the pre-loading.


However, if it can throw an OOME that means it's allocating, which in 
turn means it can provoke a GC. If that happens, it looks to me like we 
have a bug here. The ReferenceHandler thread is not allowed to provoke a 
GC while it's holding on to the pending list lock, since the pending 
list might be updated during a GC and "pending = r.discovered" will then 
overwrite something other than "r", silently dropping any newly 
discovered References which will never be discovered by the GC again.


On the other hand, if an OOME can never happen (i.e. no GC) here then 
we're good and the comment is just incorrect. The instanceof check could be 
moved out of the try/catch block again, like it was prior to this 
change, just to make it obvious that we will not be able to cause new 
allocations inside the critical section. Or at a minimum, the comment 
saying OOME can still happen should be adjusted.


Thoughts?

thanks,
Per

Btw, to the best of my knowledge, the pre-loading of Cleaner should 
avoid any GC activity from instanceof, but I can't say that I am 100% 
sure either.




Any specific reason to use Unsafe to do the preload rather than
Class.forName ? Does this force Unsafe to be loaded earlier than it
otherwise would?

Thanks,
David



all 10 java/lang/ref tests pass on my PC (including
OOMEInReferenceHandler).

I kindly ask Kalyan to try to re-run the OOMEInReferenceHandler test
with this code and report any failure.


Thanks, Peter

On 01/21/2014 08:57 AM, David Holmes wrote:

On 21/01/2014 4:54 PM, Peter Levart wrote:


On 01/21/2014 03:22 AM, David Holmes wrote:

Hi Peter,

I do not see Cleaner being loaded prior to the main class on either
Windows or Linux. Which platform are you on? Did you see it loaded
before the main class or as part of executing it?


Before. The main class is empty:

public class Test { public static void main(String... a) {} }

Here's last few lines of -verbose:class:

[Loaded java.util.TimeZone from
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]
[Loaded sun.util.calendar.ZoneInfo from
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]
[Loaded sun.util.calendar.ZoneInfoFile from
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]
[Loaded sun.util.calendar.ZoneInfoFile$1 from
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]
[Loaded java.io.DataInput from
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]
[Loaded java.io.DataInputStream from
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]
*[Loaded sun.misc.Cleaner from
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]*


Curious. I wonder what the controlling factor is ??



I'm on linux, 64bit and using official EA build 121 of JDK 8...

But if I try with JDK 7u45, I don't see it. So perhaps it would be good
to trigger Cleaner loading and initialization as part of
ReferenceHandler initialization to play things safe.



If we do that for Cleaner we may as well do it for
InterruptedException too.
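
A rough sketch of what such preloading could look like, using Class.forName() as the thread later settles on. FakeCleaner is a hypothetical stand-in for sun.misc.Cleaner (which is not accessible from ordinary code), and the helper name ensureClassInitialized is an assumption made for illustration.

```java
public class PreloadSketch {
    // Hypothetical stand-in for sun.misc.Cleaner; records when its static initializer runs.
    static class FakeCleaner {
        static { PreloadSketch.initialized = true; }
    }

    static boolean initialized;

    // Loads and initializes the given class eagerly, so that later uses of it
    // do not have to load or initialize it at an inconvenient moment.
    static void ensureClassInitialized(Class<?> clazz) {
        try {
            Class.forName(clazz.getName(), true, clazz.getClassLoader());
        } catch (ClassNotFoundException e) {
            throw (Error) new NoClassDefFoundError(e.getMessage()).initCause(e);
        }
    }

    public static void main(String[] args) {
        // A class literal alone does not trigger initialization; Class.forName with 'true' does.
        ensureClassInitialized(InterruptedException.class);
        ensureClassInitialized(FakeCleaner.class);
        System.out.println("FakeCleaner initialized: " + initialized);
    }
}
```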


Also, it is not that I think ReferenceHandler is responsible for
reporting OOME, but that it is responsible for reporting that it was
unable to perform a clean or enqueue because of OOME.


This would be necessary if we skipped a Reference because of OOME, but
if we just re-try until we eventually succeed, nothing is lost, nothing
to report (but a slow response)...


Agreed - just trying to clarify things.



Your suggested approach seems okay though I'm not sure why we
shouldn't help things along by calling System.gc() ourselves rather
than just yielding and hoping things will get cleaned up elsewhere.
But for the present purposes your approach will suffice I think.


Maybe my understanding is wrong but isn't the fact that OOME is raised a
consequence of the VM having already attempted to clear things up
(executing a GC round synchronously) but didn't succeed to make enough
free space to satisfy the allocation request? If this is only how some
collectors/allocators are implemented and not a general rule, then we

Re: Analysis on JDK-8022321 java/lang/ref/OOMEInReferenceHandler.java fails intermittently

2014-01-30 Thread Alan Bateman

On 29/01/2014 19:10, Mandy Chung wrote:


On 1/29/2014 5:09 AM, Peter Levart wrote:


Since I don't know what should be the correct behaviour of javac, I 
can leave the Reference.java changes as proposed since it compiles in 
both cases. Or should I revert the change to declaration of local 
variable 'q' ? 


I slightly prefer to revert the change to ReferenceQueue<? super Object> 
for now as there is no supertype for Object and this looks a 
little odd.  We can clean this up as a separate fix after we get 
clarification from compiler-dev.

I see Peter has posted a question to compiler-dev on this and it can 
always be re-visited once it is clear why it compiles when both Reference 
and ReferenceQueue are in the same compilation unit.


-Alan


Re: Analysis on JDK-8022321 java/lang/ref/OOMEInReferenceHandler.java fails intermittently

2014-01-30 Thread Peter Levart

On 01/30/2014 03:46 PM, Alan Bateman wrote:

On 29/01/2014 19:10, Mandy Chung wrote:


On 1/29/2014 5:09 AM, Peter Levart wrote:


Since I don't know what should be the correct behaviour of javac, I 
can leave the Reference.java changes as proposed since it compiles 
in both cases. Or should I revert the change to declaration of local 
variable 'q' ? 


I slightly prefer to revert the change to ReferenceQueue<? super Object> 
for now as there is no supertype for Object and this looks a 
little odd.  We can clean this up as a separate fix after we get 
clarification from compiler-dev.

I see Peter has posted a question to compiler-dev on this and it can 
always be re-visited once it is clear why it compiles when both Reference 
and ReferenceQueue are in the same compilation unit.


-Alan


I just committed the version with no change to the ReferenceQueue<Object> 
line to jdk9/dev. If there is a bug in javac and the code would not 
compile as is, the change to this line should be committed as part of 
the javac fix, right?


Regards, Peter



Re: Analysis on JDK-8022321 java/lang/ref/OOMEInReferenceHandler.java fails intermittently

2014-01-30 Thread Alan Bateman

On 30/01/2014 14:51, Peter Levart wrote:


I just committed the version with no change to the ReferenceQueue<Object> 
line to jdk9/dev. If there is a bug in javac and the code would not 
compile as is, the change to this line should be committed as part of 
the javac fix, right?


It's good to get this change in. If javac were to be changed to reject 
this code then it would need to be changed at the same time (but I guess we 
wait to see if this is the case as it's just not obvious yet).


-Alan



Re: Analysis on JDK-8022321 java/lang/ref/OOMEInReferenceHandler.java fails intermittently

2014-01-29 Thread Peter Levart

On 01/28/2014 04:46 PM, Alan Bateman wrote:

On 28/01/2014 08:44, Peter Levart wrote:


Yes, I tried that too and it results in even more unsafe casts.

It's odd yes, since the compile-time error is not present when 
building via OpenJDK build system make files (using make images in 
top directory for example) but only if I compile the class from 
command line (using javac directly) or from IDEA. I use JDK 8 ea-b121 
in all cases as a build JDK. Are there any special options passed to 
javac for compiling those classes in JDK build system that allow such 
code?


jdk/make/Setup.gmk has the -Xlint options that are used in the build 
but I suspect it's more that all the classes in java/lang/ref are 
compiled together.


-Alan


That's right. If I add the source for ReferenceQueue.java into a 
directory where Reference.java resides and then compile with:


javac -d /tmp Reference.java

...then Reference as well as ReferenceQueue gets compiled and there's no 
error. If Reference.java is the only source file in the directory, a compile time 
error is emitted. I checked the source of ReferenceQueue.java in JDK 8 
ea-b121 (the JDK used for compiling) and it only differs in copyright 
year from the source in jdk9-dev. So there seems to be inconsistency in 
javac's handling of types that are read from .class vs. .java files.


I'll try to create a reproducer example and post it to compiler-dev.

Since I don't know what should be the correct behaviour of javac, I can 
leave the Reference.java changes as proposed since it compiles in both 
cases. Or should I revert the change to declaration of local variable 'q' ?


Regards, Peter



Re: Analysis on JDK-8022321 java/lang/ref/OOMEInReferenceHandler.java fails intermittently

2014-01-29 Thread Mandy Chung


On 1/29/2014 5:09 AM, Peter Levart wrote:


Since I don't know what should be the correct behaviour of javac, I 
can leave the Reference.java changes as proposed since it compiles in 
both cases. Or should I revert the change to declaration of local 
variable 'q' ? 


I slightly prefer to revert the change to ReferenceQueue<? super Object> 
for now as there is no supertype for Object and this looks a little 
odd.  We can clean this up as a separate fix after we get clarification 
from compiler-dev.


Mandy


Re: Analysis on JDK-8022321 java/lang/ref/OOMEInReferenceHandler.java fails intermittently

2014-01-28 Thread Peter Levart

On 01/28/2014 03:17 AM, David Holmes wrote:

On 27/01/2014 5:07 AM, Peter Levart wrote:


On 01/25/2014 05:35 AM, srikalyan chandrashekar wrote:

Hi Peter, if you are a committer would you like to take this further
(OR) perhaps david could sponsor this change.


Hi,

Here's a new webrev that takes into account Kalyan's and David's review
comments:

cr.openjdk.java.net/~plevart/jdk9-dev/OOMEInReferenceHandler/webrev.02/


I changed into using Class.forName() instead of Unsafe for class
preloading and initialization just to be on the safe side regarding
unwanted premature initialization of Unsafe class. I also took the
liberty of removing an unneeded semicolon (line 114) and fixing a JDK 8
compile time error in generics (line 189):

 incompatible types: java.lang.ref.ReferenceQueue<capture#1 of ?
super java.lang.Object> cannot be converted to
java.lang.ref.ReferenceQueue<java.lang.Object>


Seems somewhat odd given there is no supertype for Object but it is 
consistent with the field declaration:


ReferenceQueue<? super T> queue;

The generics here is a little odd as we don't really know the type of 
T we just play fast-and-loose by declaring:


Reference<Object> r;

Which only works because of erasure. I guess it wouldn't work to try 
and use a simple wildcard '?' for both 'r' and 'q' as they would be 
different captures to javac.
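
The typing issue discussed above can be reproduced in isolation. The following is a sketch only: Ref is a hypothetical stand-in for Reference<T> with the same shape of queue field, and the commented-out line shows the assignment that, per the error message quoted earlier, is expected to be rejected as incompatible types.

```java
import java.lang.ref.ReferenceQueue;

public class QueueCaptureSketch {
    // Hypothetical stand-in for Reference<T>; only the queue field matters here.
    static class Ref<T> {
        final ReferenceQueue<? super T> queue;
        Ref(ReferenceQueue<? super T> queue) { this.queue = queue; }
    }

    static ReferenceQueue<? super Object> queueOf(Ref<Object> r) {
        // r.queue has type ReferenceQueue<capture of ? super Object>, which
        // converts to the wildcard type below but not to ReferenceQueue<Object>.
        ReferenceQueue<? super Object> q = r.queue;
        // ReferenceQueue<Object> bad = r.queue; // expected to be rejected: incompatible types
        return q;
    }

    public static void main(String[] args) {
        ReferenceQueue<Object> rq = new ReferenceQueue<>();
        Ref<Object> r = new Ref<>(rq);
        System.out.println(queueOf(r) == rq);
    }
}
```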


Yes, I tried that too and it results in even more unsafe casts.

It's odd yes, since the compile-time error is not present when building 
via OpenJDK build system make files (using make images in top 
directory for example) but only if I compile the class from command line 
(using javac directly) or from IDEA. I use JDK 8 ea-b121 in all cases as 
a build JDK. Are there any special options passed to javac for compiling 
those classes in JDK build system that allow such code?


Regards, Peter




I re-ran the java/lang/ref tests and they pass.

Can I count you as a reviewer, Kalyan? If I get a go also from David,
I'll commit this to jdk9/dev...


I can be counted as the Reviewer. Kalyan can be listed as a reviewer.

Thanks Peter.

David
-


Regards, Peter


--
Thanks
kalyan
On 1/24/14 4:05 PM, Peter Levart wrote:


On 01/24/2014 02:53 AM, srikalyan chandrashekar wrote:

Hi David, yes thats right, only benefit i see is we can avoid
assignment to 'r' if pending is null.


Hi Kalyan,

Good to hear that test runs without failures so far.

Regarding assignment of 'r'. What I tried to accomplish with the
change was eliminate double reading of 'pending' field. I have a
mental model of local variable being a register and field being a
memory location. This may be important if the field is volatile, but
for normal fields, I guess the optimizer knows how to compile such
code most optimally in either case. The old (your) version is better
from logical perspective, since it guarantees that dereferencing the
'r', wherever it is possible, will never throw NPE (dereferencing
where 'r' is not assigned is not possible because of definitive
assignment rules). So I support going back to your version...

Regards, Peter



--
Thanks
kalyan

On 1/23/14 4:33 PM, David Holmes wrote:

On 24/01/2014 6:10 AM, srikalyan wrote:

Hi Peter, i have modified your code from

r = pending;
if (r != null) {
  ..
   TO
if (pending != null) {
  r = pending;

This is because the r is used later in the code and must not be assigned
pending unless it is not null (this was as is earlier).


If r is null, because pending is null then you perform the wait()
and then continue - back to the top of the loop. There is no bug in
Peter's code.

The new webrev is posted here:
http://cr.openjdk.java.net/~srikchan/Regression/JDK-8022321_OOMEInReferenceHandler-webrev-V2/

I ran a 1000 run and no failures so far, however i would like to run a
couple more 1000 runs to assert the fix.

PS: The description section of JEP-122
(http://openjdk.java.net/jeps/122) says meta-data would be in native
memory (not heap).


The class_mirror is a Java object not meta-data.

David


--
Thanks
kalyan
Ph: (408)-585-8040


On 1/21/14, 2:31 PM, Peter Levart wrote:


On 01/21/2014 07:17 PM, srikalyan wrote:

Hi Peter/David, catching up after long weekend. Why would there be an
OOME in object heap due to class loading in perm gen space ?


The perm gen is not a problem here (JDK 8 does not have it and we see
OOME on JDK8 too). Each time a class is loaded, a new java.lang.Class
object is allocated on heap.

Regards, Peter


Please correct if i am missing something here. Meanwhile i will give
the version of Reference Handler you both agreed on a try.
--
Thanks
kalyan
Ph: (408)-585-8040

On 1/21/14, 7:24 AM, Peter Levart wrote:

On 01/21/2014 07:54 AM, Peter Levart wrote:

*[Loaded sun.misc.Cleaner from
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]*
[Loaded java.io.ByteArrayInputStream from
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]
[Loaded sun.util.calendar.ZoneInfoFile$ZoneOffsetTransitionRule
from 

Re: Analysis on JDK-8022321 java/lang/ref/OOMEInReferenceHandler.java fails intermittently

2014-01-28 Thread Alan Bateman

On 28/01/2014 08:44, Peter Levart wrote:


Yes, I tried that too and it results in even more unsafe casts.

It's odd yes, since the compile-time error is not present when 
building via OpenJDK build system make files (using make images in 
top directory for example) but only if I compile the class from 
command line (using javac directly) or from IDEA. I use JDK 8 ea-b121 
in all cases as a build JDK. Are there any special options passed to 
javac for compiling those classes in JDK build system that allow such 
code?


jdk/make/Setup.gmk has the -Xlint options that are used in the build, but 
I suspect it is more that all the classes in java/lang/ref are 
compiled together.


-Alan


Re: Analysis on JDK-8022321 java/lang/ref/OOMEInReferenceHandler.java fails intermittently

2014-01-27 Thread David Holmes

On 27/01/2014 5:07 AM, Peter Levart wrote:


On 01/25/2014 05:35 AM, srikalyan chandrashekar wrote:

Hi Peter, if you are a committer would you like to take this further
(OR) perhaps david could sponsor this change.


Hi,

Here's a new webrev that takes into account Kalyan's and David's review
comments:

cr.openjdk.java.net/~plevart/jdk9-dev/OOMEInReferenceHandler/webrev.02/


I changed into using Class.forName() instead of Unsafe for class
preloading and initialization just to be on the safe side regarding
unwanted premature initialization of Unsafe class. I also took the
liberty of removing an unneeded semicolon (line 114) and fixing a JDK 8
compile time error in generics (line 189):

 incompatible types: java.lang.ref.ReferenceQueue<capture#1 of ?
super java.lang.Object> cannot be converted to
java.lang.ref.ReferenceQueue<java.lang.Object>


Seems somewhat odd given there is no supertype for Object but it is 
consistent with the field declaration:


ReferenceQueue<? super T> queue;

The generics here is a little odd as we don't really know the type of T 
we just play fast-and-loose by declaring:


Reference<Object> r;

Which only works because of erasure. I guess it wouldn't work to try and 
use a simple wildcard '?' for both 'r' and 'q' as they would be 
different captures to javac.
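
The capture problem described above can be reproduced outside the JDK. This is a minimal sketch, not the code from the webrev: the classes CaptureSketch and Ref are hypothetical stand-ins that only copy the shape of the field declaration, and declaring the local with a bounded wildcard is just one way to satisfy javac.

```java
import java.lang.ref.ReferenceQueue;

public class CaptureSketch {
    // Hypothetical class whose field has the same shape as the
    // "ReferenceQueue<? super T> queue" declaration discussed above.
    public static final class Ref<T> {
        final ReferenceQueue<? super T> queue;

        public Ref(ReferenceQueue<? super T> queue) {
            this.queue = queue;
        }
    }

    public static ReferenceQueue<? super Object> queueOf(Ref<Object> r) {
        // The following line is rejected by javac with an
        // "incompatible types" error mentioning a capture of "? super Object":
        //   ReferenceQueue<Object> q = r.queue;
        // Declaring the local with the same wildcard bound is accepted.
        ReferenceQueue<? super Object> q = r.queue;
        return q;
    }

    public static void main(String[] args) {
        ReferenceQueue<Object> rq = new ReferenceQueue<>();
        Ref<Object> r = new Ref<>(rq);
        // The queue read through the wildcard-typed local is the same object.
        System.out.println(queueOf(r) == rq);
    }
}
```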



I re-ran the java/lang/ref tests and they pass.

Can I count you as a reviewer, Kalyan? If I get a go also from David,
I'll commit this to jdk9/dev...


I can be counted as the Reviewer. Kalyan can be listed as a reviewer.

Thanks Peter.

David
-


Regards, Peter


--
Thanks
kalyan
On 1/24/14 4:05 PM, Peter Levart wrote:


On 01/24/2014 02:53 AM, srikalyan chandrashekar wrote:

Hi David, yes thats right, only benefit i see is we can avoid
assignment to 'r' if pending is null.


Hi Kalyan,

Good to hear that test runs without failures so far.

Regarding assignment of 'r'. What I tried to accomplish with the
change was eliminate double reading of 'pending' field. I have a
mental model of local variable being a register and field being a
memory location. This may be important if the field is volatile, but
for normal fields, I guess the optimizer knows how to compile such
code most optimally in either case. The old (your) version is better
from logical perspective, since it guarantees that dereferencing the
'r', wherever it is possible, will never throw NPE (dereferencing
where 'r' is not assigned is not possible because of definitive
assignment rules). So I support going back to your version...

Regards, Peter



--
Thanks
kalyan

On 1/23/14 4:33 PM, David Holmes wrote:

On 24/01/2014 6:10 AM, srikalyan wrote:

Hi Peter, i have modified your code from

r = pending;
if (r != null) {
  ..
   TO
if (pending != null) {
  r = pending;

This is because the r is used later in the code and must not be
assigned
pending unless it is not null(this was as is earlier).


If r is null, because pending is null then you perform the wait()
and then continue - back to the top of the loop. There is no bug in
Peter's code.

The new webrev is

posted here
http://cr.openjdk.java.net/~srikchan/Regression/JDK-8022321_OOMEInReferenceHandler-webrev-V2/

. I ran a 1000 run and no failures so far, however i would like to
run a
couple more 1000 runs to assert the fix.

PS: The description section of JEP-122
(http://openjdk.java.net/jeps/122) says meta-data would be in native
memory(not heap).


The class_mirror is a Java object not meta-data.

David


--
Thanks
kalyan
Ph: (408)-585-8040


On 1/21/14, 2:31 PM, Peter Levart wrote:


On 01/21/2014 07:17 PM, srikalyan wrote:

Hi Peter/David, catching up after long weekend. Why would there
be an
OOME in object heap due to class loading in perm gen space ?


The perm gen is not a problem here (JDK 8 does not have it and we see
OOME on JDK8 too). Each time a class is loaded, new java.lang.Class
object is allocated on heap.

Regards, Peter


Please correct if i am missing something here. Meanwhile i will
give
the version of Reference Handler you both agreed on a try.
--
Thanks
kalyan
Ph: (408)-585-8040

On 1/21/14, 7:24 AM, Peter Levart wrote:

On 01/21/2014 07:54 AM, Peter Levart wrote:

*[Loaded sun.misc.Cleaner from
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]*
[Loaded java.io.ByteArrayInputStream from
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]
[Loaded sun.util.calendar.ZoneInfoFile$ZoneOffsetTransitionRule
from /home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]
...


I'm on linux, 64bit and using official EA build 121 of JDK 8...

But if I try with JDK 7u45, I don't see it.


So what changed between JDK 7 and JDK 8?

I suspect the following: 8007572: Replace existing jdk timezone data
at java.home/lib/zi with JSR310's tzdb


Regards, Peter













Re: Analysis on JDK-8022321 java/lang/ref/OOMEInReferenceHandler.java fails intermittently

2014-01-27 Thread Mandy Chung


On 1/26/2014 11:07 AM, Peter Levart wrote:


On 01/25/2014 05:35 AM, srikalyan chandrashekar wrote:
Hi Peter, if you are a committer would you like to take this further 
(OR) perhaps david could sponsor this change.


Hi,

Here's a new webrev that takes into account Kalyan's and David's review 
comments:


cr.openjdk.java.net/~plevart/jdk9-dev/OOMEInReferenceHandler/webrev.02/



This looks good to me.  Sorry I have been behind in following the 
discussion of this thread.  It's good to see this problem be diagnosed 
and fixed (thank you all).


I also prefer using Class.forName to do the preloading and initialization.
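
For readers following along, here is a minimal sketch of what preloading and initializing a class through Class.forName could look like. The class PreloadSketch and the helper ensureInitialized are assumptions made for illustration and are not taken from the webrev; InterruptedException is used only because it comes up elsewhere in this thread as a candidate for preloading.

```java
public class PreloadSketch {
    // Hypothetical helper: load AND initialize the named class up front
    // (the second argument "true" requests initialization), so that a
    // later first use does not have to trigger class loading.
    public static Class<?> ensureInitialized(String name) {
        try {
            return Class.forName(name, true, PreloadSketch.class.getClassLoader());
        } catch (ClassNotFoundException e) {
            // Surface the failure as an Error, keeping the original cause.
            throw (Error) new NoClassDefFoundError(e.getMessage()).initCause(e);
        }
    }

    public static void main(String[] args) {
        Class<?> c = ensureInitialized("java.lang.InterruptedException");
        System.out.println(c.getName());
    }
}
```

The point of doing this early is that the work happens during startup rather than at the moment the handler thread first needs the class.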

Mandy


Re: Analysis on JDK-8022321 java/lang/ref/OOMEInReferenceHandler.java fails intermittently

2014-01-26 Thread Peter Levart


On 01/25/2014 05:35 AM, srikalyan chandrashekar wrote:
Hi Peter, if you are a committer would you like to take this further 
(OR) perhaps david could sponsor this change.


Hi,

Here's a new webrev that takes into account Kalyan's and David's review 
comments:


cr.openjdk.java.net/~plevart/jdk9-dev/OOMEInReferenceHandler/webrev.02/


I changed into using Class.forName() instead of Unsafe for class 
preloading and initialization just to be on the safe side regarding 
unwanted premature initialization of Unsafe class. I also took the 
liberty of removing an unneeded semicolon (line 114) and fixing a JDK 8 
compile time error in generics (line 189):


incompatible types: java.lang.ref.ReferenceQueue<capture#1 of ? 
super java.lang.Object> cannot be converted to 
java.lang.ref.ReferenceQueue<java.lang.Object>


I re-ran the java/lang/ref tests and they pass.

Can I count you as a reviewer, Kalyan? If I get a go also from David, 
I'll commit this to jdk9/dev...


Regards, Peter


--
Thanks
kalyan
On 1/24/14 4:05 PM, Peter Levart wrote:


On 01/24/2014 02:53 AM, srikalyan chandrashekar wrote:
Hi David, yes thats right, only benefit i see is we can avoid 
assignment to 'r' if pending is null.


Hi Kalyan,

Good to hear that test runs without failures so far.

Regarding assignment of 'r'. What I tried to accomplish with the 
change was eliminate double reading of 'pending' field. I have a 
mental model of local variable being a register and field being a 
memory location. This may be important if the field is volatile, but 
for normal fields, I guess the optimizer knows how to compile such 
code most optimally in either case. The old (your) version is better 
from logical perspective, since it guarantees that dereferencing the 
'r', wherever it is possible, will never throw NPE (dereferencing 
where 'r' is not assigned is not possible because of definitive 
assignment rules). So I support going back to your version...


Regards, Peter



--
Thanks
kalyan

On 1/23/14 4:33 PM, David Holmes wrote:

On 24/01/2014 6:10 AM, srikalyan wrote:

Hi Peter, i have modified your code from

r = pending;
if (r != null) {
  ..
   TO
if (pending != null) {
  r = pending;

This is because the r is used later in the code and must not be 
assigned

pending unless it is not null(this was as is earlier).


If r is null, because pending is null then you perform the wait() 
and then continue - back to the top of the loop. There is no bug in 
Peter's code.


The new webrev is

posted here
http://cr.openjdk.java.net/~srikchan/Regression/JDK-8022321_OOMEInReferenceHandler-webrev-V2/ 

. I ran a 1000 run and no failures so far, however i would like to 
run a

couple more 1000 runs to assert the fix.

PS: The description section of JEP-122
(http://openjdk.java.net/jeps/122) says meta-data would be in native
memory(not heap).


The class_mirror is a Java object not meta-data.

David


--
Thanks
kalyan
Ph: (408)-585-8040


On 1/21/14, 2:31 PM, Peter Levart wrote:


On 01/21/2014 07:17 PM, srikalyan wrote:
Hi Peter/David, catching up after long weekend. Why would there 
be an

OOME in object heap due to class loading in perm gen space ?


The perm gen is not a problem here (JDK 8 does not have it and we see
OOME on JDK8 too). Each time a class is loaded, new java.lang.Class
object is allocated on heap.

Regards, Peter

Please correct if i am missing something here. Meanwhile i will 
give

the version of Reference Handler you both agreed on a try.
--
Thanks
kalyan
Ph: (408)-585-8040

On 1/21/14, 7:24 AM, Peter Levart wrote:

On 01/21/2014 07:54 AM, Peter Levart wrote:

*[Loaded sun.misc.Cleaner from
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]*
[Loaded java.io.ByteArrayInputStream from
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]
[Loaded sun.util.calendar.ZoneInfoFile$ZoneOffsetTransitionRule
from /home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]
...


I'm on linux, 64bit and using official EA build 121 of JDK 8...

But if I try with JDK 7u45, I don't see it.


So what changed between JDK 7 and JDK 8?

I suspect the following: 8007572: Replace existing jdk timezone data
at java.home/lib/zi with JSR310's tzdb


Regards, Peter













Re: Analysis on JDK-8022321 java/lang/ref/OOMEInReferenceHandler.java fails intermittently

2014-01-26 Thread srikalyan chandrashekar

On 1/26/14 11:07 AM, Peter Levart wrote:


On 01/25/2014 05:35 AM, srikalyan chandrashekar wrote:
Hi Peter, if you are a committer would you like to take this further 
(OR) perhaps david could sponsor this change.


Hi,

Here's a new webrev that takes into account Kalyan's and David's review 
comments:


cr.openjdk.java.net/~plevart/jdk9-dev/OOMEInReferenceHandler/webrev.02/


I changed into using Class.forName() instead of Unsafe for class 
preloading and initialization just to be on the safe side regarding 
unwanted premature initialization of Unsafe class. I also took the 
liberty of removing an unneeded semicolon (line 114) and fixing a JDK 
8 compile time error in generics (line 189):


incompatible types: java.lang.ref.ReferenceQueue<capture#1 of ? 
super java.lang.Object> cannot be converted to 
java.lang.ref.ReferenceQueue<java.lang.Object>


I re-ran the java/lang/ref tests and they pass.

Can I count you as a reviewer, Kalyan? If I get a go also from 
David, I'll commit this to jdk9/dev...
Hi Peter, I do not have review rights. So it has to be someone else from 
core-libs-dev.


Regards, Peter


--
Thanks
kalyan




--
Thanks
kalyan
On 1/24/14 4:05 PM, Peter Levart wrote:


On 01/24/2014 02:53 AM, srikalyan chandrashekar wrote:
Hi David, yes thats right, only benefit i see is we can avoid 
assignment to 'r' if pending is null.


Hi Kalyan,

Good to hear that test runs without failures so far.

Regarding assignment of 'r'. What I tried to accomplish with the 
change was eliminate double reading of 'pending' field. I have a 
mental model of local variable being a register and field being a 
memory location. This may be important if the field is volatile, but 
for normal fields, I guess the optimizer knows how to compile such 
code most optimally in either case. The old (your) version is better 
from logical perspective, since it guarantees that dereferencing the 
'r', wherever it is possible, will never throw NPE (dereferencing 
where 'r' is not assigned is not possible because of definitive 
assignment rules). So I support going back to your version...


Regards, Peter



--
Thanks
kalyan

On 1/23/14 4:33 PM, David Holmes wrote:

On 24/01/2014 6:10 AM, srikalyan wrote:

Hi Peter, i have modified your code from

r = pending;
if (r != null) {
  ..
   TO
if (pending != null) {
  r = pending;

This is because the r is used later in the code and must not be 
assigned

pending unless it is not null(this was as is earlier).


If r is null, because pending is null then you perform the wait() 
and then continue - back to the top of the loop. There is no bug 
in Peter's code.


The new webrev is

posted here
http://cr.openjdk.java.net/~srikchan/Regression/JDK-8022321_OOMEInReferenceHandler-webrev-V2/ 

. I ran a 1000 run and no failures so far, however i would like 
to run a

couple more 1000 runs to assert the fix.

PS: The description section of JEP-122
(http://openjdk.java.net/jeps/122) says meta-data would be in native
memory(not heap).


The class_mirror is a Java object not meta-data.

David


--
Thanks
kalyan
Ph: (408)-585-8040


On 1/21/14, 2:31 PM, Peter Levart wrote:


On 01/21/2014 07:17 PM, srikalyan wrote:
Hi Peter/David, catching up after long weekend. Why would there 
be an

OOME in object heap due to class loading in perm gen space ?


The perm gen is not a problem here (JDK 8 does not have it and we see
OOME on JDK8 too). Each time a class is loaded, new java.lang.Class
object is allocated on heap.

Regards, Peter

Please correct if i am missing something here. Meanwhile i will 
give

the version of Reference Handler you both agreed on a try.
--
Thanks
kalyan
Ph: (408)-585-8040

On 1/21/14, 7:24 AM, Peter Levart wrote:

On 01/21/2014 07:54 AM, Peter Levart wrote:

*[Loaded sun.misc.Cleaner from
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]*
[Loaded java.io.ByteArrayInputStream from
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]
[Loaded sun.util.calendar.ZoneInfoFile$ZoneOffsetTransitionRule
from /home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]
...


I'm on linux, 64bit and using official EA build 121 of JDK 8...

But if I try with JDK 7u45, I don't see it.


So what changed between JDK 7 and JDK 8?

I suspect the following: 8007572: Replace existing jdk timezone data
at java.home/lib/zi with JSR310's tzdb


Regards, Peter















Re: Analysis on JDK-8022321 java/lang/ref/OOMEInReferenceHandler.java fails intermittently

2014-01-24 Thread Peter Levart


On 01/24/2014 02:53 AM, srikalyan chandrashekar wrote:
Hi David, yes thats right, only benefit i see is we can avoid 
assignment to 'r' if pending is null.


Hi Kalyan,

Good to hear that test runs without failures so far.

Regarding assignment of 'r'. What I tried to accomplish with the change 
was to eliminate the double read of the 'pending' field. I have a mental 
model of a local variable being a register and a field being a memory 
location. This may be important if the field is volatile, but for normal 
fields, I guess the optimizer knows how to compile such code optimally in 
either case. The old (your) version is better from a logical perspective, 
since it guarantees that dereferencing 'r', wherever that is possible, 
will never throw NPE (dereferencing where 'r' is not assigned is not 
possible because of the definite assignment rules). So I support going 
back to your version...
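
The two shapes being compared can be sketched as follows. This is an illustration only, not the Reference.java code: the class PendingReadSketch, the field and the method names are hypothetical, and the real loop would wait() on the lock and continue instead of returning null.

```java
public class PendingReadSketch {
    private static final Object lock = new Object();
    // Hypothetical stand-in for the 'pending' field discussed above.
    private static Object pending;

    public static void publish(Object o) {
        synchronized (lock) {
            pending = o;
        }
    }

    // Shape 1: read the field once into a local, then test the local.
    public static Object takeSingleRead() {
        synchronized (lock) {
            Object r = pending;
            if (r != null) {
                pending = null;
                return r;
            }
            return null; // the real loop would wait() and continue here
        }
    }

    // Shape 2: test the field, then read it again. 'r' is only assigned
    // on the non-null branch, so definite assignment rules prevent any
    // use of 'r' on a path where it was never set.
    public static Object takeDoubleRead() {
        Object r;
        synchronized (lock) {
            if (pending != null) {
                r = pending;
                pending = null;
            } else {
                return null; // the real loop would wait() and continue here
            }
        }
        return r; // definitely assigned on this path
    }

    public static void main(String[] args) {
        publish("a");
        System.out.println(takeSingleRead());
        publish("b");
        System.out.println(takeDoubleRead());
    }
}
```

Both shapes behave the same under the lock; the difference is only whether the compiler, rather than the reader, proves that 'r' is non-null where it is used.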


Regards, Peter



--
Thanks
kalyan

On 1/23/14 4:33 PM, David Holmes wrote:

On 24/01/2014 6:10 AM, srikalyan wrote:

Hi Peter, i have modified your code from

r = pending;
if (r != null) {
  ..
   TO
if (pending != null) {
  r = pending;

This is because the r is used later in the code and must not be 
assigned

pending unless it is not null(this was as is earlier).


If r is null, because pending is null then you perform the wait() and 
then continue - back to the top of the loop. There is no bug in 
Peter's code.


The new webrev is

posted here
http://cr.openjdk.java.net/~srikchan/Regression/JDK-8022321_OOMEInReferenceHandler-webrev-V2/ 

. I ran a 1000 run and no failures so far, however i would like to 
run a

couple more 1000 runs to assert the fix.

PS: The description section of JEP-122
(http://openjdk.java.net/jeps/122) says meta-data would be in native
memory(not heap).


The class_mirror is a Java object not meta-data.

David


--
Thanks
kalyan
Ph: (408)-585-8040


On 1/21/14, 2:31 PM, Peter Levart wrote:


On 01/21/2014 07:17 PM, srikalyan wrote:

Hi Peter/David, catching up after long weekend. Why would there be an
OOME in object heap due to class loading in perm gen space ?


The perm gen is not a problem here (JDK 8 does not have it and we see
OOME on JDK8 too). Each time a class is loaded, new java.lang.Class
object is allocated on heap.

Regards, Peter


Please correct if i am missing something here. Meanwhile i will give
the version of Reference Handler you both agreed on a try.
--
Thanks
kalyan
Ph: (408)-585-8040

On 1/21/14, 7:24 AM, Peter Levart wrote:

On 01/21/2014 07:54 AM, Peter Levart wrote:

*[Loaded sun.misc.Cleaner from
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]*
[Loaded java.io.ByteArrayInputStream from
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]
[Loaded sun.util.calendar.ZoneInfoFile$ZoneOffsetTransitionRule
from /home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]
...


I'm on linux, 64bit and using official EA build 121 of JDK 8...

But if I try with JDK 7u45, I don't see it.


So what changed between JDK 7 and JDK 8?

I suspect the following: 8007572: Replace existing jdk timezone data
at java.home/lib/zi with JSR310's tzdb


Regards, Peter









Re: Analysis on JDK-8022321 java/lang/ref/OOMEInReferenceHandler.java fails intermittently

2014-01-24 Thread Peter Levart


On 01/22/2014 03:19 AM, David Holmes wrote:

Hi Peter,

On 22/01/2014 12:00 AM, Peter Levart wrote:

Hi, David, Kalyan,

Summing up the discussion, I propose the following patch for
ReferenceHandler:

http://cr.openjdk.java.net/~plevart/jdk9-dev/OOMEInReferenceHandler/webrev.01/ 



I can live with it, though it may be that once Cleaner has been 
preloaded, instanceof can no longer throw OOME. Can't be 100% sure. And 
there's some duplication/verbosity in the commentary that could be 
trimmed down :)


Any specific reason to use Unsafe to do the preload rather than 
Class.forName ? Does this force Unsafe to be loaded earlier than it 
otherwise would?


Good question. In systemDictionary.hpp they are both on the preloaded 
list in this order:


  do_klass(Reference_klass,   java_lang_ref_Reference, Pre ) \
  ...
  do_klass(misc_Unsafe_klass, sun_misc_Unsafe,         Pre ) \



So when Reference is initialized, the Unsafe is already loaded. But I 
don't know if it is already initialized. This should be studied.


I'll try to find out what is the case and get back to you.

Regards, Peter




Thanks,
David



all 10 java/lang/ref tests pass on my PC (including
OOMEInReferenceHandler).

I kindly ask Kalyan to try to re-run the OOMEInReferenceHandler test
with this code and report any failure.


Thanks, Peter

On 01/21/2014 08:57 AM, David Holmes wrote:

On 21/01/2014 4:54 PM, Peter Levart wrote:


On 01/21/2014 03:22 AM, David Holmes wrote:

Hi Peter,

I do not see Cleaner being loaded prior to the main class on either
Windows or Linux. Which platform are you on? Did you see it loaded
before the main class or as part of executing it?


Before. The main class is empty:

public class Test { public static void main(String... a) {} }

Here's last few lines of -verbose:class:

[Loaded java.util.TimeZone from
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]
[Loaded sun.util.calendar.ZoneInfo from
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]
[Loaded sun.util.calendar.ZoneInfoFile from
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]
[Loaded sun.util.calendar.ZoneInfoFile$1 from
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]
[Loaded java.io.DataInput from
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]
[Loaded java.io.DataInputStream from
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]
*[Loaded sun.misc.Cleaner from
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]*


Curious. I wonder what the controlling factor is ??



I'm on linux, 64bit and using official EA build 121 of JDK 8...

But if I try with JDK 7u45, I don't see it. So perhaps it would be good
to trigger Cleaner loading and initialization as part of
ReferenceHandler initialization to play things safe.



If we do that for Cleaner we may as well do it for
InterruptedException too.


Also, it is not that I think ReferenceHandler is responsible for
reporting OOME, but that it is responsible for reporting that it was
unable to perform a clean or enqueue because of OOME.


This would be necessary if we skipped a Reference because of OOME, but
if we just re-try until we eventually succeed, nothing is lost, nothing
to report (but a slow response)...


Agreed - just trying to clarify things.



Your suggested approach seems okay though I'm not sure why we
shouldn't help things along by calling System.gc() ourselves rather
than just yielding and hoping things will get cleaned up elsewhere.
But for the present purposes your approach will suffice I think.


Maybe my understanding is wrong, but isn't the fact that OOME is raised a
consequence of the VM having already attempted to clear things up
(executing a GC round synchronously) without managing to free enough
space to satisfy the allocation request? If this is only how some
collectors/allocators are implemented and not a general rule, then we
should put a System.gc() in place of Thread.yield(). Should we also
combine that with Thread.yield()? I'm concerned about the possibility that
we spin and consume too much CPU (the ReferenceHandler thread has MAX
priority), so that other threads don't get enough CPU time to proceed and
clean things up (we hope other threads will also get OOME and release
things as their stacks unwind...).


You are probably right about the System.gc() - OOME should be thrown
after GC fails to create space, so it really needs some other thread
to drop live references to allow further space to be reclaimed.

But note that Thread.yield() can behave badly on some linux systems
too, so spinning is still a possibility - but either way this would
only be really bad on a uniprocessor system where yield() is
unlikely to misbehave.
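
The retry-and-yield idea under discussion can be sketched like this. It is a simplified illustration under stated assumptions, not the ReferenceHandler code: the class RetryOnOomeSketch and the method runWithRetry are hypothetical, and the OutOfMemoryError in the usage example is simulated rather than caused by a real allocation failure.

```java
public class RetryOnOomeSketch {
    // Hypothetical helper: run the step until it completes without OOME.
    // Returns the number of attempts that were needed.
    public static int runWithRetry(Runnable step) {
        int attempts = 0;
        for (;;) {
            attempts++;
            try {
                step.run();
                return attempts;
            } catch (OutOfMemoryError x) {
                // Nothing is skipped; give other threads CPU time in the
                // hope that they drop live references, then retry.
                Thread.yield();
            }
        }
    }

    public static void main(String[] args) {
        int[] failures = {2};
        // Simulate a step that fails twice before succeeding.
        int attempts = runWithRetry(() -> {
            if (failures[0]-- > 0) {
                throw new OutOfMemoryError("simulated");
            }
        });
        System.out.println(attempts);
    }
}
```

As noted in the discussion, a loop like this can still spin if yield() does not actually let other threads run.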

David
-



Regards, Peter



Thanks,
David

On 20/01/2014 6:42 PM, Peter Levart wrote:

On 01/20/2014 09:00 AM, Peter Levart wrote:

On 01/20/2014 02:51 AM, David Holmes wrote:

Hi Peter,

On 17/01/2014 11:24 PM, Peter Levart wrote:

On 01/17/2014 02:13 PM, Peter Levart 

Re: Analysis on JDK-8022321 java/lang/ref/OOMEInReferenceHandler.java fails intermittently

2014-01-24 Thread srikalyan chandrashekar
Hi Peter, if you are a committer would you like to take this further 
(OR) perhaps david could sponsor this change.


--
Thanks
kalyan

On 1/24/14 4:05 PM, Peter Levart wrote:


On 01/24/2014 02:53 AM, srikalyan chandrashekar wrote:
Hi David, yes thats right, only benefit i see is we can avoid 
assignment to 'r' if pending is null.


Hi Kalyan,

Good to hear that test runs without failures so far.

Regarding assignment of 'r'. What I tried to accomplish with the 
change was eliminate double reading of 'pending' field. I have a 
mental model of local variable being a register and field being a 
memory location. This may be important if the field is volatile, but 
for normal fields, I guess the optimizer knows how to compile such 
code most optimally in either case. The old (your) version is better 
from logical perspective, since it guarantees that dereferencing the 
'r', wherever it is possible, will never throw NPE (dereferencing 
where 'r' is not assigned is not possible because of definitive 
assignment rules). So I support going back to your version...


Regards, Peter



--
Thanks
kalyan

On 1/23/14 4:33 PM, David Holmes wrote:

On 24/01/2014 6:10 AM, srikalyan wrote:

Hi Peter, i have modified your code from

r = pending;
if (r != null) {
  ..
   TO
if (pending != null) {
  r = pending;

This is because the r is used later in the code and must not be 
assigned

pending unless it is not null(this was as is earlier).


If r is null, because pending is null then you perform the wait() 
and then continue - back to the top of the loop. There is no bug in 
Peter's code.


The new webrev is

posted here
http://cr.openjdk.java.net/~srikchan/Regression/JDK-8022321_OOMEInReferenceHandler-webrev-V2/ 

. I ran a 1000 run and no failures so far, however i would like to 
run a

couple more 1000 runs to assert the fix.

PS: The description section of JEP-122
(http://openjdk.java.net/jeps/122) says meta-data would be in native
memory(not heap).


The class_mirror is a Java object not meta-data.

David


--
Thanks
kalyan
Ph: (408)-585-8040


On 1/21/14, 2:31 PM, Peter Levart wrote:


On 01/21/2014 07:17 PM, srikalyan wrote:
Hi Peter/David, catching up after long weekend. Why would there 
be an

OOME in object heap due to class loading in perm gen space ?


The perm gen is not a problem here (JDK 8 does not have it and we see
OOME on JDK8 too). Each time a class is loaded, new java.lang.Class
object is allocated on heap.

Regards, Peter


Please correct if i am missing something here. Meanwhile i will give
the version of Reference Handler you both agreed on a try.
--
Thanks
kalyan
Ph: (408)-585-8040

On 1/21/14, 7:24 AM, Peter Levart wrote:

On 01/21/2014 07:54 AM, Peter Levart wrote:

*[Loaded sun.misc.Cleaner from
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]*
[Loaded java.io.ByteArrayInputStream from
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]
[Loaded sun.util.calendar.ZoneInfoFile$ZoneOffsetTransitionRule
from /home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]
...


I'm on linux, 64bit and using official EA build 121 of JDK 8...

But if I try with JDK 7u45, I don't see it.


So what changed between JDK 7 and JDK 8?

I suspect the following: 8007572: Replace existing jdk timezone data
at java.home/lib/zi with JSR310's tzdb


Regards, Peter











Re: Analysis on JDK-8022321 java/lang/ref/OOMEInReferenceHandler.java fails intermittently

2014-01-23 Thread srikalyan

Hi Peter, i have modified your code from

r = pending;
if (r != null) {
 ..


  TO


if (pending != null) {
 r = pending;

This is because r is used later in the code and must not be assigned 
from pending unless pending is non-null (this is how it was earlier). The 
new webrev is posted here 
http://cr.openjdk.java.net/~srikchan/Regression/JDK-8022321_OOMEInReferenceHandler-webrev-V2/ 
. I ran 1000 runs with no failures so far; however, I would like to run a 
couple more sets of 1000 runs to confirm the fix.


PS: The description section of JEP-122 
(http://openjdk.java.net/jeps/122) says meta-data would be in native 
memory(not heap).


--
Thanks
kalyan
Ph: (408)-585-8040


On 1/21/14, 2:31 PM, Peter Levart wrote:


On 01/21/2014 07:17 PM, srikalyan wrote:
Hi Peter/David, catching up after long weekend. Why would there be an 
OOME in object heap due to class loading in perm gen space ?


The perm gen is not a problem here (JDK 8 does not have it and we see 
OOME on JDK8 too). Each time a class is loaded, new java.lang.Class 
object is allocated on heap.


Regards, Peter

Please correct if i am missing something here. Meanwhile i will give 
the version of Reference Handler you both agreed on a try.

--
Thanks
kalyan
Ph: (408)-585-8040

On 1/21/14, 7:24 AM, Peter Levart wrote:

On 01/21/2014 07:54 AM, Peter Levart wrote:
*[Loaded sun.misc.Cleaner from 
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]*
[Loaded java.io.ByteArrayInputStream from 
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]
[Loaded sun.util.calendar.ZoneInfoFile$ZoneOffsetTransitionRule 
from /home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]

...


I'm on linux, 64bit and using official EA build 121 of JDK 8...

But if I try with JDK 7u45, I don't see it.


So what changed between JDK 7 and JDK 8?

I suspect the following: 8007572: Replace existing jdk timezone data 
at java.home/lib/zi with JSR310's tzdb



Regards, Peter





Re: Analysis on JDK-8022321 java/lang/ref/OOMEInReferenceHandler.java fails intermittently

2014-01-23 Thread srikalyan

Hi Peter/David, we have 2000 runs without a single failure.

--
Thanks
kalyan
Ph: (408)-585-8040


On 1/23/14, 12:10 PM, srikalyan wrote:

Hi Peter, i have modified your code from
r = pending;
if (r != null) {
  ..


   TO


if (pending != null) {
  r = pending;
This is because the r is used later in the code and must not be 
assigned pending unless it is not null(this was as is earlier). The 
new webrev is posted here 
http://cr.openjdk.java.net/~srikchan/Regression/JDK-8022321_OOMEInReferenceHandler-webrev-V2/ 
. I ran a 1000 run and no failures so far, however i would like to run 
a couple more 1000 runs to assert the fix.


PS: The description section of JEP-122 
(http://openjdk.java.net/jeps/122) says meta-data would be in native 
memory(not heap).

--
Thanks
kalyan
Ph: (408)-585-8040

On 1/21/14, 2:31 PM, Peter Levart wrote:


On 01/21/2014 07:17 PM, srikalyan wrote:
Hi Peter/David, catching up after long weekend. Why would there be 
an OOME in object heap due to class loading in perm gen space ?


The perm gen is not a problem here (JDK 8 does not have it and we see 
OOME on JDK8 too). Each time a class is loaded, new java.lang.Class 
object is allocated on heap.


Regards, Peter

Please correct if i am missing something here. Meanwhile i will give 
the version of Reference Handler you both agreed on a try.

--
Thanks
kalyan
Ph: (408)-585-8040

On 1/21/14, 7:24 AM, Peter Levart wrote:

On 01/21/2014 07:54 AM, Peter Levart wrote:
*[Loaded sun.misc.Cleaner from 
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]*
[Loaded java.io.ByteArrayInputStream from 
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]
[Loaded sun.util.calendar.ZoneInfoFile$ZoneOffsetTransitionRule 
from /home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]

...


I'm on linux, 64bit and using official EA build 121 of JDK 8...

But if I try with JDK 7u45, I don't see it.


So what changed between JDK 7 and JDK 8?

I suspect the following: 8007572: Replace existing jdk timezone 
data at java.home/lib/zi with JSR310's tzdb



Regards, Peter





Re: Analysis on JDK-8022321 java/lang/ref/OOMEInReferenceHandler.java fails intermittently

2014-01-23 Thread David Holmes

On 24/01/2014 6:10 AM, srikalyan wrote:

Hi Peter, i have modified your code from

r = pending;
if (r != null) {
  ..
   TO
if (pending != null) {
  r = pending;

This is because the r is used later in the code and must not be assigned
pending unless it is not null(this was as is earlier).


If r is null, because pending is null then you perform the wait() and 
then continue - back to the top of the loop. There is no bug in Peter's 
code.


The new webrev is

posted here
http://cr.openjdk.java.net/~srikchan/Regression/JDK-8022321_OOMEInReferenceHandler-webrev-V2/
. I ran a 1000 run and no failures so far, however i would like to run a
couple more 1000 runs to assert the fix.

PS: The description section of JEP-122
(http://openjdk.java.net/jeps/122) says meta-data would be in native
memory(not heap).


The class_mirror is a Java object not meta-data.

David


--
Thanks
kalyan
Ph: (408)-585-8040


On 1/21/14, 2:31 PM, Peter Levart wrote:


On 01/21/2014 07:17 PM, srikalyan wrote:

Hi Peter/David, catching up after long weekend. Why would there be an
OOME in object heap due to class loading in perm gen space ?


The perm gen is not a problem here (JDK 8 does not have it and we see
OOME on JDK8 too). Each time a class is loaded, new java.lang.Class
object is allocated on heap.

Regards, Peter


Please correct if i am missing something here. Meanwhile i will give
the version of Reference Handler you both agreed on a try.
--
Thanks
kalyan
Ph: (408)-585-8040

On 1/21/14, 7:24 AM, Peter Levart wrote:

On 01/21/2014 07:54 AM, Peter Levart wrote:

*[Loaded sun.misc.Cleaner from
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]*
[Loaded java.io.ByteArrayInputStream from
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]
[Loaded sun.util.calendar.ZoneInfoFile$ZoneOffsetTransitionRule
from /home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]
...


I'm on linux, 64bit and using official EA build 121 of JDK 8...

But if I try with JDK 7u45, I don't see it.


So what changed between JDK 7 and JDK 8?

I suspect the following: 8007572: Replace existing jdk timezone data
at java.home/lib/zi with JSR310's tzdb


Regards, Peter





Re: Analysis on JDK-8022321 java/lang/ref/OOMEInReferenceHandler.java fails intermittently

2014-01-23 Thread srikalyan chandrashekar
Hi David, yes, that's right. The only benefit I see is that we can avoid
the assignment to 'r' if pending is null.


--
Thanks
kalyan

On 1/23/14 4:33 PM, David Holmes wrote:

On 24/01/2014 6:10 AM, srikalyan wrote:

Hi Peter, i have modified your code from

r = pending;
if (r != null) {
  ..
   TO
if (pending != null) {
  r = pending;

This is because r is used later in the code and must not be assigned
pending unless pending is not null (this is how it was earlier).


If r is null, because pending is null then you perform the wait() and 
then continue - back to the top of the loop. There is no bug in 
Peter's code.


The new webrev is posted here:
http://cr.openjdk.java.net/~srikchan/Regression/JDK-8022321_OOMEInReferenceHandler-webrev-V2/
I ran 1000 runs with no failures so far; however, I would like to run a
couple more batches of 1000 runs to confirm the fix.

PS: The description section of JEP-122
(http://openjdk.java.net/jeps/122) says meta-data would be in native
memory(not heap).


The class_mirror is a Java object not meta-data.

David


--
Thanks
kalyan
Ph: (408)-585-8040


On 1/21/14, 2:31 PM, Peter Levart wrote:


On 01/21/2014 07:17 PM, srikalyan wrote:

Hi Peter/David, catching up after long weekend. Why would there be an
OOME in object heap due to class loading in perm gen space ?


The perm gen is not a problem here (JDK 8 does not have it and we see
OOME on JDK8 too). Each time a class is loaded, new java.lang.Class
object is allocated on heap.

Regards, Peter


Please correct if i am missing something here. Meanwhile i will give
the version of Reference Handler you both agreed on a try.
--
Thanks
kalyan
Ph: (408)-585-8040

On 1/21/14, 7:24 AM, Peter Levart wrote:

On 01/21/2014 07:54 AM, Peter Levart wrote:

*[Loaded sun.misc.Cleaner from
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]*
[Loaded java.io.ByteArrayInputStream from
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]
[Loaded sun.util.calendar.ZoneInfoFile$ZoneOffsetTransitionRule
from /home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]
...


I'm on linux, 64bit and using official EA build 121 of JDK 8...

But if I try with JDK 7u45, I don't see it.


So what changed between JDK 7 and JDK 8?

I suspect the following: 8007572: Replace existing jdk timezone data
at java.home/lib/zi with JSR310's tzdb


Regards, Peter







Re: Analysis on JDK-8022321 java/lang/ref/OOMEInReferenceHandler.java fails intermittently

2014-01-23 Thread David Holmes

On 24/01/2014 11:53 AM, srikalyan chandrashekar wrote:

Hi David, yes, that's right. The only benefit I see is that we can avoid
the assignment to 'r' if pending is null.


I'm okay with either version.

David


--
Thanks
kalyan

On 1/23/14 4:33 PM, David Holmes wrote:

On 24/01/2014 6:10 AM, srikalyan wrote:

Hi Peter, i have modified your code from

r = pending;
if (r != null) {
  ..
   TO
if (pending != null) {
  r = pending;

This is because r is used later in the code and must not be assigned
pending unless pending is not null (this is how it was earlier).


If r is null, because pending is null then you perform the wait() and
then continue - back to the top of the loop. There is no bug in
Peter's code.

The new webrev is posted here:
http://cr.openjdk.java.net/~srikchan/Regression/JDK-8022321_OOMEInReferenceHandler-webrev-V2/
I ran 1000 runs with no failures so far; however, I would like to run a
couple more batches of 1000 runs to confirm the fix.

PS: The description section of JEP-122
(http://openjdk.java.net/jeps/122) says meta-data would be in native
memory(not heap).


The class_mirror is a Java object not meta-data.

David


--
Thanks
kalyan
Ph: (408)-585-8040


On 1/21/14, 2:31 PM, Peter Levart wrote:


On 01/21/2014 07:17 PM, srikalyan wrote:

Hi Peter/David, catching up after long weekend. Why would there be an
OOME in object heap due to class loading in perm gen space ?


The perm gen is not a problem here (JDK 8 does not have it and we see
OOME on JDK8 too). Each time a class is loaded, new java.lang.Class
object is allocated on heap.

Regards, Peter


Please correct if i am missing something here. Meanwhile i will give
the version of Reference Handler you both agreed on a try.
--
Thanks
kalyan
Ph: (408)-585-8040

On 1/21/14, 7:24 AM, Peter Levart wrote:

On 01/21/2014 07:54 AM, Peter Levart wrote:

*[Loaded sun.misc.Cleaner from
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]*
[Loaded java.io.ByteArrayInputStream from
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]
[Loaded sun.util.calendar.ZoneInfoFile$ZoneOffsetTransitionRule
from /home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]
...


I'm on linux, 64bit and using official EA build 121 of JDK 8...

But if I try with JDK 7u45, I don't see it.


So what changed between JDK 7 and JDK 8?

I suspect the following: 8007572: Replace existing jdk timezone data
at java.home/lib/zi with JSR310's tzdb


Regards, Peter







Re: Analysis on JDK-8022321 java/lang/ref/OOMEInReferenceHandler.java fails intermittently

2014-01-21 Thread David Holmes

On 21/01/2014 4:54 PM, Peter Levart wrote:


On 01/21/2014 03:22 AM, David Holmes wrote:

Hi Peter,

I do not see Cleaner being loaded prior to the main class on either
Windows or Linux. Which platform are you on? Did you see it loaded
before the main class or as part of executing it?


Before. The main class is empty:

public class Test { public static void main(String... a) {} }

Here's last few lines of -verbose:class:

[Loaded java.util.TimeZone from
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]
[Loaded sun.util.calendar.ZoneInfo from
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]
[Loaded sun.util.calendar.ZoneInfoFile from
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]
[Loaded sun.util.calendar.ZoneInfoFile$1 from
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]
[Loaded java.io.DataInput from
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]
[Loaded java.io.DataInputStream from
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]
*[Loaded sun.misc.Cleaner from
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]*


Curious. I wonder what the controlling factor is ??



I'm on linux, 64bit and using official EA build 121 of JDK 8...

But if I try with JDK 7u45, I don't see it. So perhaps it would be good
to trigger Cleaner loading and initialization as part of
ReferenceHandler initialization to play things safe.



If we do that for Cleaner we may as well do it for InterruptedException too.


Also, it is not that I think ReferenceHandler is responsible for
reporting OOME, but that it is responsible for reporting that it was
unable to perform a clean or enqueue because of OOME.


This would be necessary if we skipped a Reference because of OOME, but
if we just re-try until we eventually succeed, nothing is lost, nothing
to report (but a slow response)...


Agreed - just trying to clarify things.



Your suggested approach seems okay though I'm not sure why we
shouldn't help things along by calling System.gc() ourselves rather
than just yielding and hoping things will get cleaned up elsewhere.
But for the present purposes your approach will suffice I think.


Maybe my understanding is wrong, but isn't the fact that OOME is raised a
consequence of the VM having already attempted to clear things up
(executing a GC round synchronously) without managing to free enough
space to satisfy the allocation request? If this is only how some
collectors/allocators are implemented and not a general rule, then we
should put a System.gc() in place of Thread.yield(). Should we also
combine that with Thread.yield()? I'm concerned about the possibility that
we spin and consume too much CPU (the ReferenceHandler thread has MAX
priority), so that other threads don't get enough CPU time to proceed and
clean things up (we hope other threads will also get OOME and release
things as their stacks unwind...).


You are probably right about the System.gc() - OOME should be thrown 
after GC fails to create space, so it really needs some other thread to 
drop live references to allow further space to be reclaimed.


But note that Thread.yield() can behave badly on some linux systems too, 
so spinning is still a possibility - but either way this would only be 
really bad on a uniprocessor system where yield() is unlikely to 
misbehave.
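
As an illustration of the retry-and-yield idea discussed above, here is a sketch only: retryOnOome and the simulated failure are invented for this example, and the OOME is thrown by hand rather than by a full heap. The step is retried until it completes, and the thread yields between attempts so that other threads get a chance to drop references.

```java
import java.util.function.Supplier;

public class RetryOnOomeSketch {
    // Runs the step until it completes without OutOfMemoryError.
    // attemptsOut[0] is incremented once per attempt (for demonstration only).
    static <T> T retryOnOome(Supplier<T> step, int[] attemptsOut) {
        for (;;) {
            attemptsOut[0]++;
            try {
                return step.get();
            } catch (OutOfMemoryError oome) {
                // Nothing is skipped: the same step is simply tried again
                // after giving other threads a chance to run.
                Thread.yield();
            }
        }
    }

    public static void main(String[] args) {
        int[] failuresLeft = {3};   // simulate three failed attempts
        int[] attempts = {0};
        String result = retryOnOome(() -> {
            if (failuresLeft[0]-- > 0) {
                throw new OutOfMemoryError("simulated");
            }
            return "done";
        }, attempts);
        System.out.println(result + " after " + attempts[0] + " attempts");
    }
}
```

Whether Thread.yield() actually lets other threads make progress is scheduler dependent, which is the concern raised above.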


David
-



Regards, Peter



Thanks,
David

On 20/01/2014 6:42 PM, Peter Levart wrote:

On 01/20/2014 09:00 AM, Peter Levart wrote:

On 01/20/2014 02:51 AM, David Holmes wrote:

Hi Peter,

On 17/01/2014 11:24 PM, Peter Levart wrote:

On 01/17/2014 02:13 PM, Peter Levart wrote:

// Fast path for cleaners
boolean isCleaner = false;
try {
  isCleaner = r instanceof Cleaner;
} catch (OutOfMemoryError oome) {
  continue;
}

if (isCleaner) {
  ((Cleaner)r).clean();
  continue;
}



Hi David, Kalyan,

I've caught-up now. Just thinking: is instanceof Cleaner throwing
OOME as a result of loading the Cleaner class? Wouldn't the above
code then throw some error also in ((Cleaner)r) - the checkcast,
since Cleaner class would not be successfully initialized?


Well, no. The above code would just skip Cleaner processing in this
situation. And will never be doing it again after the heap is
freed...
So it might be good to load and initialize Cleaner class as part of
ReferenceHandler initialization to ensure correct operation...


Well, yes and no. Let me try once more:

Above code will skip Cleaner processing if the 1st time instanceof
Cleaner is executed, OOME is thrown as a consequence of full heap while
loading and initializing the Cleaner class.


Yes - I was assuming that this would not fail the very first time and
so the Cleaner class would already be loaded. Failing to be able to
load the Cleaner class was one of the potential issues flagged
earlier with this problem. I was actually assuming that Cleaner would
be loaded already due to some actual Cleaner subclasses being used,
but this does not happen as part of the default initialization. :(
The irony being that if the Cleaner class is not 

Re: Analysis on JDK-8022321 java/lang/ref/OOMEInReferenceHandler.java fails intermittently

2014-01-21 Thread Peter Levart

Hi, David, Kalyan,

Summing up the discussion, I propose the following patch for 
ReferenceHandler:


http://cr.openjdk.java.net/~plevart/jdk9-dev/OOMEInReferenceHandler/webrev.01/

all 10 java/lang/ref tests pass on my PC (including OOMEInReferenceHandler).

I kindly ask Kalyan to try to re-run the OOMEInReferenceHandler test 
with this code and report any failure.



Thanks, Peter

On 01/21/2014 08:57 AM, David Holmes wrote:

On 21/01/2014 4:54 PM, Peter Levart wrote:


On 01/21/2014 03:22 AM, David Holmes wrote:

Hi Peter,

I do not see Cleaner being loaded prior to the main class on either
Windows or Linux. Which platform are you on? Did you see it loaded
before the main class or as part of executing it?


Before. The main class is empty:

public class Test { public static void main(String... a) {} }

Here's last few lines of -verbose:class:

[Loaded java.util.TimeZone from
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]
[Loaded sun.util.calendar.ZoneInfo from
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]
[Loaded sun.util.calendar.ZoneInfoFile from
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]
[Loaded sun.util.calendar.ZoneInfoFile$1 from
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]
[Loaded java.io.DataInput from
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]
[Loaded java.io.DataInputStream from
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]
*[Loaded sun.misc.Cleaner from
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]*


Curious. I wonder what the controlling factor is ??



I'm on linux, 64bit and using official EA build 121 of JDK 8...

But if I try with JDK 7u45, I don't see it. So perhaps it would be good
to trigger Cleaner loading and initialization as part of
ReferenceHandler initialization to play things safe.



If we do that for Cleaner we may as well do it for 
InterruptedException too.



Also, it is not that I think ReferenceHandler is responsible for
reporting OOME, but that it is responsible for reporting that it was
unable to perform a clean or enqueue because of OOME.


This would be necessary if we skipped a Reference because of OOME, but
if we just re-try until we eventually succeed, nothing is lost, nothing
to report (but a slow response)...


Agreed - just trying to clarify things.



Your suggested approach seems okay though I'm not sure why we
shouldn't help things along by calling System.gc() ourselves rather
than just yielding and hoping things will get cleaned up elsewhere.
But for the present purposes your approach will suffice I think.


Maybe my understanding is wrong, but isn't the fact that OOME is raised a
consequence of the VM having already attempted to clear things up
(executing a GC round synchronously) without managing to free enough
space to satisfy the allocation request? If this is only how some
collectors/allocators are implemented and not a general rule, then we
should put a System.gc() in place of Thread.yield(). Should we also
combine that with Thread.yield()? I'm concerned about the possibility that
we spin and consume too much CPU (the ReferenceHandler thread has MAX
priority), so that other threads don't get enough CPU time to proceed and
clean things up (we hope other threads will also get OOME and release
things as their stacks unwind...).


You are probably right about the System.gc() - OOME should be thrown 
after GC fails to create space, so it really needs some other thread 
to drop live references to allow further space to be reclaimed.


But note that Thread.yield() can behave badly on some linux systems 
too, so spinning is still a possibility - but either way this would 
only be really bad on a uniprocessor system where yield() is 
unlikely to misbehave.


David
-



Regards, Peter



Thanks,
David

On 20/01/2014 6:42 PM, Peter Levart wrote:

On 01/20/2014 09:00 AM, Peter Levart wrote:

On 01/20/2014 02:51 AM, David Holmes wrote:

Hi Peter,

On 17/01/2014 11:24 PM, Peter Levart wrote:

On 01/17/2014 02:13 PM, Peter Levart wrote:

// Fast path for cleaners
boolean isCleaner = false;
try {
  isCleaner = r instanceof Cleaner;
} catch (OutOfMemoryError oome) {
  continue;
}

if (isCleaner) {
  ((Cleaner)r).clean();
  continue;
}



Hi David, Kalyan,

I've caught-up now. Just thinking: is instanceof Cleaner throwing
OOME as a result of loading the Cleaner class? Wouldn't the above
code then throw some error also in ((Cleaner)r) - the checkcast,
since Cleaner class would not be successfully initialized?


Well, no. The above code would just skip Cleaner processing in this
situation. And will never be doing it again after the heap is
freed...
So it might be good to load and initialize Cleaner class as part of
ReferenceHandler initialization to ensure correct operation...


Well, yes and no. Let me try once more:

Above code will skip Cleaner processing if the 1st time instanceof
Cleaner is executed, OOME is thrown as a consequence of full heap while
loading and initializing the Cleaner class.


Yes - I was assuming 

Re: Analysis on JDK-8022321 java/lang/ref/OOMEInReferenceHandler.java fails intermittently

2014-01-21 Thread Peter Levart

On 01/21/2014 08:57 AM, David Holmes wrote:

[Loaded java.util.TimeZone from
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]
[Loaded sun.util.calendar.ZoneInfo from
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]
[Loaded sun.util.calendar.ZoneInfoFile from
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]
[Loaded sun.util.calendar.ZoneInfoFile$1 from
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]
[Loaded java.io.DataInput from
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]
[Loaded java.io.DataInputStream from
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]
*[Loaded sun.misc.Cleaner from
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]*


Curious. I wonder what the controlling factor is ?? 


The Cleaner is usually loaded by ReferenceHandler in JDK8 in the 1st 
execution of its loop. It looks like JDK8 system initialization
produces at least one XXXReference that is cleared before main() method 
is entered (debugging, I found it's a Finalizer for a FileInputStream - 
perhaps of the stream that loads the TimeZone data), so ReferenceHandler 
thread is woken-up, executes the instanceof Cleaner check and this loads 
the class. I put the following printfs in an original ReferenceHandler:


System.out.println("Before using Cleaner...");
// Fast path for cleaners
if (r instanceof Cleaner) {
    ((Cleaner)r).clean();
    continue;
}
System.out.println("After using Cleaner...");


...and the empty main() test with -verbose:class prints:

...
[Loaded java.io.DataInput from 
/home/peter/work/hg/jdk8-tl/build/linux-x86_64-normal-server-release/images/j2sdk-image/jre/lib/rt.jar]
[Loaded java.io.DataInputStream from 
/home/peter/work/hg/jdk8-tl/build/linux-x86_64-normal-server-release/images/j2sdk-image/jre/lib/rt.jar]

*Before using Cleaner...*
*[Loaded sun.misc.Cleaner from out/production/jdk]*
*After using Cleaner...*
[Loaded java.io.ByteArrayInputStream from 
/home/peter/work/hg/jdk8-tl/build/linux-x86_64-normal-server-release/images/j2sdk-image/jre/lib/rt.jar]
[Loaded sun.util.calendar.ZoneInfoFile$ZoneOffsetTransitionRule from 
/home/peter/work/hg/jdk8-tl/build/linux-x86_64-normal-server-release/images/j2sdk-image/jre/lib/rt.jar]


...


But sometimes, it seems, the VM is not so quick in clearing the early
XXXReferences and/or the ReferenceHandler start-up is delayed and the 
1st iteration of the loop is executed after the OOMEInReferenceHandler 
test already fills the heap and consequently loading of Cleaner class 
throws OOME in instanceof check...


My proposed fix is very aggressive. It pre-loads classes, initializes 
them and watches for OOMEs thrown in all occasions. It might be that
pre-loading Cleaner class in ReferenceHandler initialization would be 
sufficient to fix this intermittent failure. Or do you think instanceof 
check could throw OOME for some other reason besides loading of the class?



Regards, Peter



Re: Analysis on JDK-8022321 java/lang/ref/OOMEInReferenceHandler.java fails intermittently

2014-01-21 Thread Peter Levart

On 01/21/2014 07:54 AM, Peter Levart wrote:
*[Loaded sun.misc.Cleaner from 
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]*
[Loaded java.io.ByteArrayInputStream from 
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]
[Loaded sun.util.calendar.ZoneInfoFile$ZoneOffsetTransitionRule from 
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]

...


I'm on linux, 64bit and using official EA build 121 of JDK 8...

But if I try with JDK 7u45, I don't see it.


So what changed between JDK 7 and JDK 8?

I suspect the following: 8007572: Replace existing jdk timezone data at 
java.home/lib/zi with JSR310's tzdb



Regards, Peter



Re: Analysis on JDK-8022321 java/lang/ref/OOMEInReferenceHandler.java fails intermittently

2014-01-21 Thread srikalyan
Hi Peter/David, catching up after long weekend. Why would there be an 
OOME in object heap due to class loading in perm gen space ? Please 
correct if i am missing something here. Meanwhile i will give the 
version of Reference Handler you both agreed on a try.


--
Thanks
kalyan
Ph: (408)-585-8040


On 1/21/14, 7:24 AM, Peter Levart wrote:

On 01/21/2014 07:54 AM, Peter Levart wrote:
*[Loaded sun.misc.Cleaner from 
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]*
[Loaded java.io.ByteArrayInputStream from 
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]
[Loaded sun.util.calendar.ZoneInfoFile$ZoneOffsetTransitionRule from 
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]

...


I'm on linux, 64bit and using official EA build 121 of JDK 8...

But if I try with JDK 7u45, I don't see it.


So what changed between JDK 7 and JDK 8?

I suspect the following: 8007572: Replace existing jdk timezone data 
at java.home/lib/zi with JSR310's tzdb



Regards, Peter



Re: Analysis on JDK-8022321 java/lang/ref/OOMEInReferenceHandler.java fails intermittently

2014-01-21 Thread Peter Levart


On 01/21/2014 07:17 PM, srikalyan wrote:
Hi Peter/David, catching up after long weekend. Why would there be an 
OOME in object heap due to class loading in perm gen space ?


The perm gen is not a problem here (JDK 8 does not have it and we see
OOME on JDK8 too). Each time a class is loaded, new java.lang.Class 
object is allocated on heap.


Regards, Peter

Please correct if i am missing something here. Meanwhile i will give 
the version of Reference Handler you both agreed on a try.

--
Thanks
kalyan
Ph: (408)-585-8040

On 1/21/14, 7:24 AM, Peter Levart wrote:

On 01/21/2014 07:54 AM, Peter Levart wrote:
*[Loaded sun.misc.Cleaner from 
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]*
[Loaded java.io.ByteArrayInputStream from 
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]
[Loaded sun.util.calendar.ZoneInfoFile$ZoneOffsetTransitionRule from 
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]

...


I'm on linux, 64bit and using official EA build 121 of JDK 8...

But if I try with JDK 7u45, I don't see it.


So what changed between JDK 7 and JDK 8?

I suspect the following: 8007572: Replace existing jdk timezone data 
at java.home/lib/zi with JSR310's tzdb



Regards, Peter





Re: Analysis on JDK-8022321 java/lang/ref/OOMEInReferenceHandler.java fails intermittently

2014-01-21 Thread David Holmes

On 22/01/2014 1:24 AM, Peter Levart wrote:

On 01/21/2014 07:54 AM, Peter Levart wrote:

*[Loaded sun.misc.Cleaner from
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]*
[Loaded java.io.ByteArrayInputStream from
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]
[Loaded sun.util.calendar.ZoneInfoFile$ZoneOffsetTransitionRule from
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]
...


I'm on linux, 64bit and using official EA build 121 of JDK 8...

But if I try with JDK 7u45, I don't see it.


So what changed between JDK 7 and JDK 8?

I suspect the following: 8007572: Replace existing jdk timezone data at
java.home/lib/zi with JSR310's tzdb


I suspect it also depends on your TZ environment too as I do not see 
this on my systems.


David




Regards, Peter



Re: Analysis on JDK-8022321 java/lang/ref/OOMEInReferenceHandler.java fails intermittently

2014-01-21 Thread David Holmes

On 22/01/2014 8:31 AM, Peter Levart wrote:


On 01/21/2014 07:17 PM, srikalyan wrote:

Hi Peter/David, catching up after long weekend. Why would there be an
OOME in object heap due to class loading in perm gen space ?


The perm gen is not a problem here (JDK 8 does not have it and we see
OOME on JDK8 too). Each time a class is loaded, new java.lang.Class
object is allocated on heap.


For the bootloader classes I thought, but could easily be wrong, that 
the Class mirror did indeed go into the PermGen. But still this is not 
relevant on JDK8 where there is no PermGen. It may be that this changed as
part of the early PermGen removal prep work that did go into 7u.


David


Regards, Peter


Please correct if i am missing something here. Meanwhile i will give
the version of Reference Handler you both agreed on a try.
--
Thanks
kalyan
Ph: (408)-585-8040

On 1/21/14, 7:24 AM, Peter Levart wrote:

On 01/21/2014 07:54 AM, Peter Levart wrote:

*[Loaded sun.misc.Cleaner from
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]*
[Loaded java.io.ByteArrayInputStream from
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]
[Loaded sun.util.calendar.ZoneInfoFile$ZoneOffsetTransitionRule from
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]
...


I'm on linux, 64bit and using official EA build 121 of JDK 8...

But if I try with JDK 7u45, I don't see it.


So what changed between JDK 7 and JDK 8?

I suspect the following: 8007572: Replace existing jdk timezone data
at java.home/lib/zi with JSR310's tzdb


Regards, Peter





Re: Analysis on JDK-8022321 java/lang/ref/OOMEInReferenceHandler.java fails intermittently

2014-01-21 Thread David Holmes

Hi Peter,

On 22/01/2014 12:00 AM, Peter Levart wrote:

Hi, David, Kalyan,

Summing up the discussion, I propose the following patch for
ReferenceHandler:

http://cr.openjdk.java.net/~plevart/jdk9-dev/OOMEInReferenceHandler/webrev.01/


I can live with it, though it may be that once Cleaner has been preloaded
instanceof can no longer throw OOME. Can't be 100% sure. And there's 
some duplication/verbosity in the commentary that could be trimmed down :)


Any specific reason to use Unsafe to do the preload rather than 
Class.forName ? Does this force Unsafe to be loaded earlier than it 
otherwise would?
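
For comparison, a preload based on Class.forName might look roughly like this. It is a sketch under assumptions: ensureClassInitialized and the Helper class are hypothetical names, and Helper stands in for sun.misc.Cleaner, which may not be accessible to ordinary code. It is not taken from the webrev.

```java
public class PreloadSketch {
    // Set to true by Helper's static initializer.
    static boolean helperInitialized;

    // Hypothetical stand-in for a class the handler thread will need later.
    static final class Helper {
        static { helperInitialized = true; }
    }

    // Forces loading and initialization of clazz up front, so that a later
    // use of the class does not need to allocate for loading or initializing.
    static void ensureClassInitialized(Class<?> clazz) {
        try {
            // initialize=true runs the static initializer if it has not run yet;
            // a null class loader means the bootstrap loader.
            Class.forName(clazz.getName(), true, clazz.getClassLoader());
        } catch (ClassNotFoundException e) {
            throw (Error) new NoClassDefFoundError(e.getMessage()).initCause(e);
        }
    }

    public static void main(String[] args) {
        // A class literal alone does not initialize the class.
        Class<?> c = Helper.class;
        System.out.println("before preload: " + helperInitialized);
        ensureClassInitialized(c);
        System.out.println("after preload:  " + helperInitialized);
        // The same call works for bootstrap classes such as InterruptedException.
        ensureClassInitialized(InterruptedException.class);
    }
}
```

The calls would go where the handler thread is set up, before any heap pressure can interfere.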


Thanks,
David



all 10 java/lang/ref tests pass on my PC (including
OOMEInReferenceHandler).

I kindly ask Kalyan to try to re-run the OOMEInReferenceHandler test
with this code and report any failure.


Thanks, Peter

On 01/21/2014 08:57 AM, David Holmes wrote:

On 21/01/2014 4:54 PM, Peter Levart wrote:


On 01/21/2014 03:22 AM, David Holmes wrote:

Hi Peter,

I do not see Cleaner being loaded prior to the main class on either
Windows or Linux. Which platform are you on? Did you see it loaded
before the main class or as part of executing it?


Before. The main class is empty:

public class Test { public static void main(String... a) {} }

Here's last few lines of -verbose:class:

[Loaded java.util.TimeZone from
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]
[Loaded sun.util.calendar.ZoneInfo from
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]
[Loaded sun.util.calendar.ZoneInfoFile from
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]
[Loaded sun.util.calendar.ZoneInfoFile$1 from
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]
[Loaded java.io.DataInput from
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]
[Loaded java.io.DataInputStream from
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]
*[Loaded sun.misc.Cleaner from
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]*


Curious. I wonder what the controlling factor is ??



I'm on linux, 64bit and using official EA build 121 of JDK 8...

But if I try with JDK 7u45, I don't see it. So perhaps it would be good
to trigger Cleaner loading and initialization as part of
ReferenceHandler initialization to play things safe.



If we do that for Cleaner we may as well do it for
InterruptedException too.


Also, it is not that I think ReferenceHandler is responsible for
reporting OOME, but that it is responsible for reporting that it was
unable to perform a clean or enqueue because of OOME.


This would be necessary if we skipped a Reference because of OOME, but
if we just re-try until we eventually succeed, nothing is lost, nothing
to report (but a slow response)...


Agreed - just trying to clarify things.



Your suggested approach seems okay though I'm not sure why we
shouldn't help things along by calling System.gc() ourselves rather
than just yielding and hoping things will get cleaned up elsewhere.
But for the present purposes your approach will suffice I think.


Maybe my understanding is wrong, but isn't the fact that OOME is raised a
consequence of the VM having already attempted to clear things up
(executing a GC round synchronously) without managing to free enough
space to satisfy the allocation request? If this is only how some
collectors/allocators are implemented and not a general rule, then we
should put a System.gc() in place of Thread.yield(). Should we also
combine that with Thread.yield()? I'm concerned about the possibility that
we spin and consume too much CPU (the ReferenceHandler thread has MAX
priority), so that other threads don't get enough CPU time to proceed and
clean things up (we hope other threads will also get OOME and release
things as their stacks unwind...).


You are probably right about the System.gc() - OOME should be thrown
after GC fails to create space, so it really needs some other thread
to drop live references to allow further space to be reclaimed.

But note that Thread.yield() can behave badly on some linux systems
too, so spinning is still a possibility - but either way this would
only be really bad on a uniprocessor system where yield() is
unlikely to misbehave.

David
-



Regards, Peter



Thanks,
David

On 20/01/2014 6:42 PM, Peter Levart wrote:

On 01/20/2014 09:00 AM, Peter Levart wrote:

On 01/20/2014 02:51 AM, David Holmes wrote:

Hi Peter,

On 17/01/2014 11:24 PM, Peter Levart wrote:

On 01/17/2014 02:13 PM, Peter Levart wrote:

// Fast path for cleaners
boolean isCleaner = false;
try {
  isCleaner = r instanceof Cleaner;
} catch (OutOfMemoryError oome) {
  continue;
}

if (isCleaner) {
  ((Cleaner)r).clean();
  continue;
}



Hi David, Kalyan,

I've caught-up now. Just thinking: is instanceof Cleaner throwing
OOME as a result of loading the Cleaner class? Wouldn't the above
code then throw some error also in ((Cleaner)r) - the checkcast,
since Cleaner class would not be successfully initialized?


Well, no. The above code would just skip Cleaner processing in
this
situation. And will 

Re: Analysis on JDK-8022321 java/lang/ref/OOMEInReferenceHandler.java fails intermittently

2014-01-20 Thread Peter Levart

On 01/20/2014 02:51 AM, David Holmes wrote:

Hi Peter,

On 17/01/2014 11:24 PM, Peter Levart wrote:

On 01/17/2014 02:13 PM, Peter Levart wrote:

// Fast path for cleaners
boolean isCleaner = false;
try {
  isCleaner = r instanceof Cleaner;
} catch (OutOfMemoryError oome) {
  continue;
}

if (isCleaner) {
  ((Cleaner)r).clean();
  continue;
}



Hi David, Kalyan,

I've caught-up now. Just thinking: is instanceof Cleaner throwing
OOME as a result of loading the Cleaner class? Wouldn't the above
code then throw some error also in ((Cleaner)r) - the checkcast,
since Cleaner class would not be successfully initialized?


Well, no. The above code would just skip Cleaner processing in this
situation. And will never be doing it again after the heap is freed...
So it might be good to load and initialize Cleaner class as part of
ReferenceHandler initialization to ensure correct operation...


Well, yes and no. Let me try once more:

Above code will skip Cleaner processing if the 1st time instanceof
Cleaner is executed, OOME is thrown as a consequence of full heap while
loading and initializing the Cleaner class.


Yes - I was assuming that this would not fail the very first time and 
so the Cleaner class would already be loaded. Failing to be able to 
load the Cleaner class was one of the potential issues flagged earlier 
with this problem. I was actually assuming that Cleaner would be 
loaded already due to some actual Cleaner subclasses being used, but 
this does not happen as part of the default initialization. :( The 
irony being that if the Cleaner class is not loaded then r can not be 
an instance of Cleaner and so we would fail to load the class in a 
case where we didn't need it anyway.


What I wanted to focus on here was an OOME from the instanceof itself, 
but as you say that might trigger classloading of Cleaner (which is 
not what I was interested in).



The 2nd time the instanceof
Cleaner is executed after such OOME, the same line would throw
NoClassDefFoundError as a consequence of referencing a class that failed
initialization. Am I right?


instanceof is not one of the class initialization triggers, so we 
should not see an OOME generated due to a class initialization 
exception and so the class will not be put into the Erroneous state 
and so subsequent attempts to use the class will not automatically 
trigger NoClassDefFoundError.


If OOME occurs during actual loading/linking of the class Cleaner it 
is unclear what would happen on subsequent attempts. OOME is not a 
LinkageError that must be rethrown on subsequent attempts, and it is 
potentially a transient condition, so I would expect a re-load attempt 
to be allowed. However we are now deep into the details of the VM and 
it may well depend on the exact place from which the OOME originates.


The bottom line with the current problem is that there are multiple 
non-obvious paths by which the ReferenceHandler can encounter an OOME. 
In such cases we do not want the ReferenceHandler to terminate - which 
implies catching the OOME and continuing. However we also do not want 
to silently skip Cleaner processing or reference queue processing - as 
that would lead to hard-to-diagnose bugs. But trying to report the 
problem may not be possible due to being out-of-memory. It may be that 
we need to break things up into multiple try/catch blocks, where each 
catch does a System.gc() and then reports that the OOME occurred. Of 
course the reporting must still be in a try/catch for the OOME. Though 
at some point letting the ReferenceHandler die may be the only way to 
report a major memory problem.


David


Hm... If I give -verbose:class option to run a simple test program:

public class Test { public static void main(String... a) {} }

I see Cleaner class being loaded before Test class. I don't see by which 
thread or if it might get loaded after main() starts, but I suspect that 
loading of Cleaner is not a problem here. Initialization of Cleaner 
class is not performed by ReferenceHandler thread as you pointed out. 
The instanceof does not trigger it and if it returns true then Cleaner 
has already been initialized. So there must be some other cause for 
instanceof throwing OOME...


What do you say about this variant of ReferenceHandler.run() method:

public void run() {
    for (;;) {
        Reference r;
        Cleaner c;
        synchronized (lock) {
            r = pending;
            if (r != null) {
                // instanceof operator might throw OOME sometimes. Just retry after
                // yielding - might have better luck next time...
                try {
                    c = r instanceof Cleaner ? (Cleaner) r : null;
                } catch (OutOfMemoryError x) {
                    Thread.yield();
                    continue;
                }
                pending = r.discovered;
 

Re: Analysis on JDK-8022321 java/lang/ref/OOMEInReferenceHandler.java fails intermittently

2014-01-20 Thread Peter Levart


On 01/21/2014 03:22 AM, David Holmes wrote:

Hi Peter,

I do not see Cleaner being loaded prior to the main class on either 
Windows or Linux. Which platform are you on? Did you see it loaded 
before the main class or as part of executing it?


Before. The main class is empty:

public class Test { public static void main(String... a) {} }

Here are the last few lines of the -verbose:class output:

[Loaded java.util.TimeZone from 
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]
[Loaded sun.util.calendar.ZoneInfo from 
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]
[Loaded sun.util.calendar.ZoneInfoFile from 
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]
[Loaded sun.util.calendar.ZoneInfoFile$1 from 
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]
[Loaded java.io.DataInput from 
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]
[Loaded java.io.DataInputStream from 
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]
*[Loaded sun.misc.Cleaner from 
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]*
[Loaded java.io.ByteArrayInputStream from 
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]
[Loaded sun.util.calendar.ZoneInfoFile$ZoneOffsetTransitionRule from 
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]
[Loaded java.util.zip.Checksum from 
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]
[Loaded java.util.zip.CRC32 from 
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]
[Loaded sun.util.calendar.ZoneInfoFile$Checksum from 
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]
[Loaded java.util.TimeZone$1 from 
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]
[Loaded sun.util.calendar.CalendarDate from 
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]
[Loaded sun.util.calendar.BaseCalendar$Date from 
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]
[Loaded sun.util.calendar.Gregorian$Date from 
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]
[Loaded sun.util.calendar.CalendarUtils from 
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]
[Loaded java.util.jar.JarEntry from 
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]
[Loaded java.util.jar.JarFile$JarFileEntry from 
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]
[Loaded java.util.zip.ZipFile$ZipFileInputStream from 
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]
[Loaded java.util.AbstractSequentialList from 
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]
[Loaded java.util.LinkedList from 
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]
[Loaded java.util.LinkedList$Node from 
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]
[Loaded java.security.PrivilegedActionException from 
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]
[Loaded sun.misc.URLClassPath$FileLoader from 
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]
[Loaded sun.misc.Resource from 
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]
[Loaded sun.misc.URLClassPath$FileLoader$1 from 
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]
[Loaded sun.nio.ByteBuffered from 
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]
[Loaded java.security.PermissionCollection from 
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]
[Loaded java.security.Permissions from 
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]
[Loaded java.net.URLConnection from 
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]
[Loaded sun.net.www.URLConnection from 
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]
[Loaded sun.net.www.protocol.file.FileURLConnection from 
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]
[Loaded sun.net.www.MessageHeader from 
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]
[Loaded java.io.FilePermission from 
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]
[Loaded java.io.FilePermission$1 from 
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]
[Loaded java.io.FilePermissionCollection from 
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]
[Loaded java.security.AllPermission from 
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]
[Loaded java.security.UnresolvedPermission from 
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]
[Loaded java.security.BasicPermissionCollection from 
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]

*[Loaded Test from file:/tmp/]*
[Loaded sun.launcher.LauncherHelper$FXHelper from 
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]
[Loaded java.lang.Shutdown from 
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]
[Loaded java.lang.Shutdown$Lock from 
/home/peter/Apps64/jdk1.8.0-ea-b121/jre/lib/rt.jar]



I'm on linux, 64bit and using official EA build 121 of JDK 8...

But if I try with JDK 7u45, I don't see it. So perhaps it would be good 
to trigger Cleaner loading and initialization as part of 
ReferenceHandler initialization to play things safe.




Also, it is not that I think ReferenceHandler is responsible for 
reporting OOME, but that it is responsible for reporting that it was 
unable to perform a clean or enqueue because of OOME.


This would be necessary if we 

Re: Analysis on JDK-8022321 java/lang/ref/OOMEInReferenceHandler.java fails intermittently

2014-01-19 Thread David Holmes

Hi Peter,

On 17/01/2014 11:24 PM, Peter Levart wrote:

On 01/17/2014 02:13 PM, Peter Levart wrote:

// Fast path for cleaners
boolean isCleaner = false;
try {
  isCleaner = r instanceof Cleaner;
} catch (OutOfMemoryError oome) {
  continue;
}

if (isCleaner) {
  ((Cleaner)r).clean();
  continue;
}



Hi David, Kalyan,

I've caught-up now. Just thinking: is instanceof Cleaner throwing
OOME as a result of loading the Cleaner class? Wouldn't the above
code then throw some error also in ((Cleaner)r) - the checkcast,
since Cleaner class would not be successfully initialized?


Well, no. The above code would just skip Cleaner processing in this
situation. And will never be doing it again after the heap is freed...
So it might be good to load and initialize Cleaner class as part of
ReferenceHandler initialization to ensure correct operation...


Well, yes and no. Let me try once more:

Above code will skip Cleaner processing if the 1st time instanceof
Cleaner is executed, OOME is thrown as a consequence of full heap while
loading and initializing the Cleaner class.


Yes - I was assuming that this would not fail the very first time and so 
the Cleaner class would already be loaded. Failing to be able to load 
the Cleaner class was one of the potential issues flagged earlier with 
this problem. I was actually assuming that Cleaner would be loaded 
already due to some actual Cleaner subclasses being used, but this does 
not happen as part of the default initialization. :( The irony being 
that if the Cleaner class is not loaded then r can not be an instance of 
Cleaner and so we would fail to load the class in a case where we didn't 
need it anyway.


What I wanted to focus on here was an OOME from the instanceof itself, 
but as you say that might trigger classloading of Cleaner (which is not 
what I was interested in).



The 2nd time the instanceof
Cleaner is executed after such OOME, the same line would throw
NoClassDefFoundError as a consequence of referencing a class that failed
initialization. Am I right?


instanceof is not one of the class initialization triggers, so we should 
not see an OOME generated due to a class initialization exception and so 
the class will not be put into the Erroneous state and so subsequent 
attempts to use the class will not automatically trigger 
NoClassDefFoundError.
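
As a small illustration of the point that instanceof is not one of the initialization triggers, here is a sketch using made-up classes (InstanceofInitDemo and Probe are hypothetical names, not JDK code). It relies only on the rules in JLS 12.4.1: evaluating instanceof may load and resolve the class but should leave its static initializer unrun, while creating an instance runs it.

```java
// Hypothetical demo; InstanceofInitDemo and Probe are illustrative names only.
class InstanceofInitDemo {
    // Flipped by Probe's static initializer so initialization can be observed.
    static boolean probeInitialized = false;

    static class Probe {
        static {
            probeInitialized = true;
        }
    }

    // Evaluates instanceof against Probe and reports whether Probe stayed uninitialized.
    static boolean instanceofLeavesProbeUninitialized() {
        Object r = new Object();
        boolean isProbe = r instanceof Probe; // resolves Probe, does not initialize it
        return !isProbe && !probeInitialized;
    }

    // Instance creation is an initialization trigger, so the flag flips here.
    static boolean newInitializesProbe() {
        new Probe();
        return probeInitialized;
    }

    public static void main(String[] args) {
        System.out.println("uninitialized after instanceof: " + instanceofLeavesProbeUninitialized());
        System.out.println("initialized after new: " + newInitializesProbe());
    }
}
```

The order of the two calls matters: once Probe has been initialized by the second method, the first check can no longer observe the uninitialized state.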


If OOME occurs during actual loading/linking of the class Cleaner it is 
unclear what would happen on subsequent attempts. OOME is not a 
LinkageError that must be rethrown on subsequent attempts, and it is 
potentially a transient condition, so I would expect a re-load attempt 
to be allowed. However we are now deep into the details of the VM and it 
may well depend on the exact place from which the OOME originates.


The bottom line with the current problem is that there are multiple 
non-obvious paths by which the ReferenceHandler can encounter an OOME. 
In such cases we do not want the ReferenceHandler to terminate - which 
implies catching the OOME and continuing. However we also do not want to 
silently skip Cleaner processing or reference queue processing - as that 
would lead to hard-to-diagnose bugs. But trying to report the problem 
may not be possible due to being out-of-memory. It may be that we need 
to break things up into multiple try/catch blocks, where each catch does 
a System.gc() and then reports that the OOME occurred. Of course the 
reporting must still be in a try/catch for the OOME. Though at some 
point letting the ReferenceHandler die may be the only way to report a 
major memory problem.
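
A rough sketch of the "multiple try/catch blocks" idea, only to make the shape concrete. Everything here is hypothetical (GuardedStepsDemo, processOne and report are made-up names, and the two steps stand in for Cleaner processing and enqueueing); the OOME is simulated by throwing it directly, and System.gc() is only a hint to the VM.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; this is not the JDK's ReferenceHandler.
class GuardedStepsDemo {
    // Collected reports; a real handler would need a channel that does not allocate.
    static final List<String> reports = new ArrayList<>();

    // Reporting is itself guarded, because it may hit a second OOME.
    static void report(String step) {
        try {
            reports.add("OOME during " + step);
        } catch (OutOfMemoryError again) {
            // Out of options: reporting needs memory too.
        }
    }

    // Each step gets its own try/catch so one failure does not skip the other step.
    static int processOne(Runnable clean, Runnable enqueue) {
        int failures = 0;
        try {
            clean.run();
        } catch (OutOfMemoryError oome) {
            System.gc(); // only a hint to the VM
            report("clean");
            failures++;
        }
        try {
            enqueue.run();
        } catch (OutOfMemoryError oome) {
            System.gc();
            report("enqueue");
            failures++;
        }
        return failures;
    }

    public static void main(String[] args) {
        int failures = processOne(
                () -> { throw new OutOfMemoryError("simulated"); },
                () -> { });
        System.out.println(failures + " failure(s): " + reports);
    }
}
```

The sketch does not settle the harder question raised above, namely what to do when the report itself cannot be produced.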


David

David


Regards, Peter



Re: Analysis on JDK-8022321 java/lang/ref/OOMEInReferenceHandler.java fails intermittently

2014-01-17 Thread Peter Levart

On 01/17/2014 05:38 AM, David Holmes wrote:

On 17/01/2014 1:31 PM, srikalyan chandrashekar wrote:

Hi David, the disassembled code is also attached to the bug. Per my


Sorry missed that.


analysis the exception was thrown when Reference Handler was on line 143
as put in the earlier email.


But if the numbers in the disassembly match the BCI then 65 shows:

  65: instanceof    #11    // class sun/misc/Cleaner

which makes more sense, the runtime instanceof check might encounter 
an OOME condition. I wish there was some easy way to trace into the 
full call chain as TraceExceptions doesn't show you any runtime frames :(


Still, it is easy enough to check:

// Fast path for cleaners
boolean isCleaner = false;
try {
  isCleaner = r instanceof Cleaner;
} catch (OutOfMemoryError oome) {
  continue;
}

if (isCleaner) {
  ((Cleaner)r).clean();
  continue;
}



Hi David, Kalyan,

I've caught-up now. Just thinking: is instanceof Cleaner throwing OOME 
as a result of loading the Cleaner class? Wouldn't the above code then 
throw some error also in ((Cleaner)r) - the checkcast, since Cleaner 
class would not be successfully initialized? Perhaps we should pre-load 
and initialize the Cleaner class as part of ReferenceHandler 
initialization...
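
One possible way to pre-load and initialize a class up front, as a sketch only. PreloadDemo, Helper and ensureClassInitialized are hypothetical names chosen for this example; the only real API used is Class.forName(String, boolean, ClassLoader), whose second argument requests initialization. Helper stands in for Cleaner.

```java
// Hypothetical sketch; Helper stands in for the class to be pre-initialized.
class PreloadDemo {
    // Flipped by Helper's static initializer so initialization can be observed.
    static boolean helperInitialized = false;

    static class Helper {
        static {
            helperInitialized = true;
        }
    }

    // Forces loading and initialization of the given class.
    static void ensureClassInitialized(Class<?> clazz) {
        try {
            Class.forName(clazz.getName(), true, clazz.getClassLoader());
        } catch (ClassNotFoundException e) {
            throw new NoClassDefFoundError(e.getMessage());
        }
    }

    // Returns true if Helper was uninitialized before the call and initialized after it.
    static boolean preloadInitializesHelper() {
        boolean before = helperInitialized;   // a class literal alone does not initialize
        ensureClassInitialized(Helper.class);
        return !before && helperInitialized;
    }

    public static void main(String[] args) {
        System.out.println("initialized by preload: " + preloadInitializesHelper());
    }
}
```

Doing this from a static initializer of the handler class would move any loading or initialization failure to startup, before the heap can be exhausted by application code.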


Regards, Peter


Thanks,
David


--
Thanks
kalyan

On 1/16/14 6:16 PM, David Holmes wrote:

On 17/01/2014 4:48 AM, srikalyan wrote:

Hi David

On 1/15/14, 9:04 PM, David Holmes wrote:

On 16/01/2014 10:19 AM, srikalyan chandrashekar wrote:

Hi Peter/David, we could finally get a trace of exception with
fastdebug
build and ReferenceHandler modified (with runImpl() added and called
from run()). The logs, disassembled code is available in JIRA
https://bugs.openjdk.java.net/browse/JDK-8022321 as attachments.


All I can see is the log for the OOMECatchingTest program not one for
the actual ReferenceHandler ??


Please search for ReferenceHandler in the log.

Observations from the log:

Root Cause:
1) UncaughtException is being dispatched from Reference.java:143
141   Reference<Object> r;
142   synchronized (lock) {
143       if (pending != null) {
144           r = pending;
145           pending = r.discovered;
146           r.discovered = null;

pending field in Reference is touched and updated by the collector, so
at line 143 when the execution context is in Reference handler there
might have been an Exception pending due to allocation done by collector
which causes ReferenceHandler thread to die.


Sorry but the GC does not trigger asynchronous exceptions so this
explanation does not make any sense to me. What part of the log led
you to this conclusion?

-- Log Excerpt begins --
Exception a 'java/lang/OutOfMemoryError' (0xff7808e8) thrown
[/home/srikalyc/work/ora2013/infracleanup/jdk8/hotspot/src/share/vm/gc_interface/collectedHeap.inline.hpp, line 168]
for thread 0x7feed80cf800
Exception a 'java/lang/OutOfMemoryError' (0xff7808e8)
  thrown in interpreter method {method} {0x7feeddd3c600} 'runImpl' '()V' in 'java/lang/ref/Reference$ReferenceHandler'
  at bci 65 for thread 0x7feed80cf800
Exception a 'java/lang/OutOfMemoryError' (0xff7808e8)
  thrown in interpreter method {method} {0x7feeddd3c478} 'run' '()V' in 'java/lang/ref/Reference$ReferenceHandler'
  at bci 1 for thread 0x7feed80cf800
Exception a 'java/lang/OutOfMemoryError' (0xff780868) thrown
[/home/srikalyc/work/ora2013/infracleanup/jdk8/hotspot/src/share/vm/gc_interface/collectedHeap.inline.hpp, line 157]
for thread 0x7feed80cf800
Exception a 'java/lang/OutOfMemoryError' (0xff780868)
  thrown in interpreter method {method} {0x7feeddcaaf90} 'uncaughtException' '(Ljava/lang/Thread;Ljava/lang/Throwable;)V' in '
  at bci 48 for thread 0x7feed80cf800
Exception a 'java/lang/OutOfMemoryError' (0xff780868)
  thrown in interpreter method {method} {0x7feeddca7298} 'dispatchUncaughtException' '(Ljava/lang/Throwable;)V' in 'java/lang/
  at bci 6 for thread 0x7feed80cf800
-- Log Excerpt ends --
Sorry if it is a wrong understanding.


What you are seeing there is an OOME escaping the run() method which
will cause the uncaughtExceptionHandler to be run which then triggers
a second OOME (likely as it tries to report information about the
first OOME). The first exception occurred in runImpl at BCI 65. Can
you disassemble (javap -c) the class you used so we can see what is at
BCI 65.

Thanks,
David




Suggested fix:
- As proposed earlier putting an outer guard(try-catch on OOME) in the
ReferenceHandler will fix the issue, if ReferenceHandler is considered
as part of the GC sub system then it should be alive even in the midst
of an OOME so i feel that the additional guard should be allowed,
however i might still be ignorant of vital implications.
- Apart 

Re: Analysis on JDK-8022321 java/lang/ref/OOMEInReferenceHandler.java fails intermittently

2014-01-17 Thread Peter Levart

On 01/17/2014 02:00 PM, Peter Levart wrote:

On 01/17/2014 05:38 AM, David Holmes wrote:

On 17/01/2014 1:31 PM, srikalyan chandrashekar wrote:

Hi David, the disassembled code is also attached to the bug. Per my


Sorry missed that.

analysis the exception was thrown when Reference Handler was on line 143
as put in the earlier email.


But if the numbers in the disassembly match the BCI then 65 shows:

  65: instanceof    #11    // class sun/misc/Cleaner

which makes more sense, the runtime instanceof check might encounter 
an OOME condition. I wish there was some easy way to trace into the 
full call chain as TraceExceptions doesn't show you any runtime frames :(


Still, it is easy enough to check:

// Fast path for cleaners
boolean isCleaner = false;
try {
  isCleaner = r instanceof Cleaner;
} catch (OutOfMemoryError oome) {
  continue;
}

if (isCleaner) {
  ((Cleaner)r).clean();
  continue;
}



Hi David, Kalyan,

I've caught-up now. Just thinking: is instanceof Cleaner throwing 
OOME as a result of loading the Cleaner class? Wouldn't the above code 
then throw some error also in ((Cleaner)r) - the checkcast, since 
Cleaner class would not be successfully initialized? 


Well, no. The above code would just skip Cleaner processing in this 
situation. And will never be doing it again after the heap is freed... 
So it might be good to load and initialize Cleaner class as part of 
ReferenceHandler initialization to ensure correct operation...


Peter

Perhaps we should pre-load and initialize the Cleaner class as part of 
ReferenceHandler initialization...


Regards, Peter


Thanks,
David


--
Thanks
kalyan

On 1/16/14 6:16 PM, David Holmes wrote:

On 17/01/2014 4:48 AM, srikalyan wrote:

Hi David

On 1/15/14, 9:04 PM, David Holmes wrote:

On 16/01/2014 10:19 AM, srikalyan chandrashekar wrote:

Hi Peter/David, we could finally get a trace of exception with fastdebug
build and ReferenceHandler modified (with runImpl() added and called
from run()). The logs, disassembled code is available in JIRA
https://bugs.openjdk.java.net/browse/JDK-8022321 as attachments.


All I can see is the log for the OOMECatchingTest program not one for
the actual ReferenceHandler ??


Please search for ReferenceHandler in the log.

Observations from the log:

Root Cause:
1) UncaughtException is being dispatched from Reference.java:143
141   Reference<Object> r;
142   synchronized (lock) {
143       if (pending != null) {
144           r = pending;
145           pending = r.discovered;
146           r.discovered = null;

pending field in Reference is touched and updated by the collector, so
at line 143 when the execution context is in Reference handler there
might have been an Exception pending due to allocation done by collector
which causes ReferenceHandler thread to die.


Sorry but the GC does not trigger asynchronous exceptions so this
explanation does not make any sense to me. What part of the log led
you to this conclusion?

-- Log Excerpt begins --
Exception a 'java/lang/OutOfMemoryError' (0xff7808e8) thrown
[/home/srikalyc/work/ora2013/infracleanup/jdk8/hotspot/src/share/vm/gc_interface/collectedHeap.inline.hpp, line 168]
for thread 0x7feed80cf800
Exception a 'java/lang/OutOfMemoryError' (0xff7808e8)
  thrown in interpreter method {method} {0x7feeddd3c600} 'runImpl' '()V' in 'java/lang/ref/Reference$ReferenceHandler'
  at bci 65 for thread 0x7feed80cf800
Exception a 'java/lang/OutOfMemoryError' (0xff7808e8)
  thrown in interpreter method {method} {0x7feeddd3c478} 'run' '()V' in 'java/lang/ref/Reference$ReferenceHandler'
  at bci 1 for thread 0x7feed80cf800
Exception a 'java/lang/OutOfMemoryError' (0xff780868) thrown
[/home/srikalyc/work/ora2013/infracleanup/jdk8/hotspot/src/share/vm/gc_interface/collectedHeap.inline.hpp, line 157]
for thread 0x7feed80cf800
Exception a 'java/lang/OutOfMemoryError' (0xff780868)
  thrown in interpreter method {method} {0x7feeddcaaf90} 'uncaughtException' '(Ljava/lang/Thread;Ljava/lang/Throwable;)V' in '
  at bci 48 for thread 0x7feed80cf800
Exception a 'java/lang/OutOfMemoryError' (0xff780868)
  thrown in interpreter method {method} {0x7feeddca7298} 'dispatchUncaughtException' '(Ljava/lang/Throwable;)V' in 'java/lang/
  at bci 6 for thread 0x7feed80cf800
-- Log Excerpt ends --
Sorry if it is a wrong understanding.


What you are seeing there is an OOME escaping the run() method which
will cause the uncaughtExceptionHandler to be run which then triggers
a second OOME (likely as it tries to report information about the
first OOME). The first exception occurred in runImpl at BCI 65. Can
you disassemble (javap -c) the class you used so we can see what is at
BCI 65.

Thanks,
David




Suggested fix:
- As 

Re: Analysis on JDK-8022321 java/lang/ref/OOMEInReferenceHandler.java fails intermittently

2014-01-17 Thread Peter Levart

On 01/17/2014 02:13 PM, Peter Levart wrote:

// Fast path for cleaners
boolean isCleaner = false;
try {
  isCleaner = r instanceof Cleaner;
} catch (OutOfMemoryError oome) {
  continue;
}

if (isCleaner) {
  ((Cleaner)r).clean();
  continue;
}



Hi David, Kalyan,

I've caught-up now. Just thinking: is instanceof Cleaner throwing 
OOME as a result of loading the Cleaner class? Wouldn't the above 
code then throw some error also in ((Cleaner)r) - the checkcast, 
since Cleaner class would not be successfully initialized? 


Well, no. The above code would just skip Cleaner processing in this 
situation. And will never be doing it again after the heap is freed... 
So it might be good to load and initialize Cleaner class as part of 
ReferenceHandler initialization to ensure correct operation... 


Well, yes and no. Let me try once more:

Above code will skip Cleaner processing if the 1st time instanceof 
Cleaner is executed, OOME is thrown as a consequence of full heap while 
loading and initializing the Cleaner class. The 2nd time the instanceof 
Cleaner is executed after such OOME, the same line would throw 
NoClassDefFoundError as a consequence of referencing a class that failed 
initialization. Am I right?
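
For reference, the "class that failed initialization" state being asked about can be reproduced with a made-up class, as in the sketch below. ErroneousStateDemo and Broken are hypothetical names, the OOME is simulated by throwing it from a static initializer, and the trigger used is a static method call (a known initialization trigger per JLS 12.4.1). Whether instanceof can put a class into this state is a separate question that this sketch does not answer.

```java
// Hypothetical demo of a class whose static initializer fails with an Error.
class ErroneousStateDemo {
    static class Broken {
        static {
            // "if (true)" is needed because an initializer must be able to complete normally.
            if (true) {
                throw new OutOfMemoryError("simulated");
            }
        }

        static void touch() { }
    }

    // Uses Broken twice and records the simple name of whatever each attempt throws.
    static String[] useTwice() {
        String[] seen = new String[2];
        for (int i = 0; i < 2; i++) {
            try {
                Broken.touch(); // invoking a static method triggers initialization
                seen[i] = "ok";
            } catch (Throwable t) {
                seen[i] = t.getClass().getSimpleName();
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        String[] seen = useTwice();
        System.out.println("1st use: " + seen[0] + ", 2nd use: " + seen[1]);
    }
}
```

The first use surfaces the original Error unwrapped (only non-Error exceptions get wrapped in ExceptionInInitializerError), and the second use fails with NoClassDefFoundError because the class is already marked erroneous.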


Regards, Peter



Re: Analysis on JDK-8022321 java/lang/ref/OOMEInReferenceHandler.java fails intermittently

2014-01-16 Thread srikalyan

Hi David

On 1/15/14, 9:04 PM, David Holmes wrote:

On 16/01/2014 10:19 AM, srikalyan chandrashekar wrote:

Hi Peter/David, we could finally get a trace of exception with fastdebug
build and ReferenceHandler modified (with runImpl() added and called
from run()). The logs, disassembled code is available in JIRA
https://bugs.openjdk.java.net/browse/JDK-8022321 as attachments.


All I can see is the log for the OOMECatchingTest program not one for 
the actual ReferenceHandler ??



Please search for ReferenceHandler in the log.

Observations from the log:

Root Cause:
1) UncaughtException is being dispatched from Reference.java:143
141   Reference<Object> r;
142   synchronized (lock) {
143       if (pending != null) {
144           r = pending;
145           pending = r.discovered;
146           r.discovered = null;

pending field in Reference is touched and updated by the collector, so
at line 143 when the execution context is in Reference handler there
might have been an Exception pending due to allocation done by collector
which causes ReferenceHandler thread to die.


Sorry but the GC does not trigger asynchronous exceptions so this 
explanation does not make any sense to me. What part of the log led 
you to this conclusion?

-- Log Excerpt begins --
Exception a 'java/lang/OutOfMemoryError' (0xff7808e8) thrown
[/home/srikalyc/work/ora2013/infracleanup/jdk8/hotspot/src/share/vm/gc_interface/collectedHeap.inline.hpp, line 168]
for thread 0x7feed80cf800
Exception a 'java/lang/OutOfMemoryError' (0xff7808e8)
  thrown in interpreter method {method} {0x7feeddd3c600} 'runImpl' '()V' in 'java/lang/ref/Reference$ReferenceHandler'
  at bci 65 for thread 0x7feed80cf800
Exception a 'java/lang/OutOfMemoryError' (0xff7808e8)
  thrown in interpreter method {method} {0x7feeddd3c478} 'run' '()V' in 'java/lang/ref/Reference$ReferenceHandler'
  at bci 1 for thread 0x7feed80cf800
Exception a 'java/lang/OutOfMemoryError' (0xff780868) thrown
[/home/srikalyc/work/ora2013/infracleanup/jdk8/hotspot/src/share/vm/gc_interface/collectedHeap.inline.hpp, line 157]
for thread 0x7feed80cf800
Exception a 'java/lang/OutOfMemoryError' (0xff780868)
  thrown in interpreter method {method} {0x7feeddcaaf90} 'uncaughtException' '(Ljava/lang/Thread;Ljava/lang/Throwable;)V' in '
  at bci 48 for thread 0x7feed80cf800
Exception a 'java/lang/OutOfMemoryError' (0xff780868)
  thrown in interpreter method {method} {0x7feeddca7298} 'dispatchUncaughtException' '(Ljava/lang/Throwable;)V' in 'java/lang/
  at bci 6 for thread 0x7feed80cf800
-- Log Excerpt ends --
Sorry if it is a wrong understanding.



Suggested fix:
- As proposed earlier putting an outer guard(try-catch on OOME) in the
ReferenceHandler will fix the issue, if ReferenceHandler is considered
as part of the GC sub system then it should be alive even in the midst
of an OOME so i feel that the additional guard should be allowed,
however i might still be ignorant of vital implications.
- Apart from the above changes, Peter's suggestion to create and call a
private runImpl() from run() in ReferenceHandler makes sense to me.
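
The two suggestions above (an outer try-catch on OOME, and a run() that delegates to runImpl()) could look roughly like the following. This is only a sketch with hypothetical names (OuterGuardDemo, Step); the loop is bounded so that it terminates, whereas the real handler loops forever, and the OOME is simulated by throwing it directly.

```java
// Hypothetical sketch of the "outer guard"; names are illustrative, not JDK code.
class OuterGuardDemo {
    // Stand-in for one pass of the handler's work (what the thread calls runImpl()).
    interface Step {
        void runImpl();
    }

    // Bounded version of the handler's for(;;) loop so the sketch terminates.
    // Returns how many iterations threw an OOME that the guard absorbed.
    static int run(Step step, int iterations) {
        int absorbed = 0;
        for (int i = 0; i < iterations; i++) {
            try {
                step.runImpl();
            } catch (OutOfMemoryError oome) {
                absorbed++; // keep the loop (and hence the thread) alive
            }
        }
        return absorbed;
    }

    public static void main(String[] args) {
        int[] calls = {0};
        int absorbed = run(() -> {
            if (calls[0]++ % 2 == 0) {
                throw new OutOfMemoryError("simulated");
            }
        }, 4);
        System.out.println("iterations run: " + calls[0] + ", OOMEs absorbed: " + absorbed);
    }
}
```

Note that the guard keeps the thread alive but says nothing about the work skipped in the failed iteration; that part would need separate handling.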


Why would we need this?

David
-



---
Thanks
kalyan

On 01/13/2014 03:57 PM, srikalyan wrote:


On 1/11/14, 6:15 AM, Peter Levart wrote:


On 01/10/2014 10:51 PM, srikalyan chandrashekar wrote:

Hi Peter the version you provided ran indefinitely(i put a 10 minute
timeout) and the program got interrupted(no error),


Did you run it with or without fastdebug and -XX:+TraceExceptions? If
with, it might be that fastdebug and/or -XX:+TraceExceptions changes
the execution a bit so that we can no longer reproduce the wrong
behaviour.

With fastdebug and -XX:+TraceExceptions. I will try combinations of
possible options (i.e. without -XX:+TraceExceptions on the debug build etc.)
soon.



even if there were to be an error you cannot print the string of
thread to console(these have been attempted earlier).


...it has been attempted to print toString in uncaught exception
handler. At that time, the heap is still full. I'm printing it after
the GC has cleared the heap. You can try that it works by commenting
out the try { and corresponding } catch (OOME x) {} exception
handler...

Since there is a GC call prior to printing string i will give that a
shot with non-debug build.



- The test's running in interpreter mode, what i am watching for is
one error with trace. Without fastdebug build and
-XX:+TraceExceptions i am able to reproduce at least 5
failures out of 1000 runs but with fastdebug+Trace no luck
yet(already past few 1000 runs).


It might be interesting to try with fastdebug build but without the
-XX:+TraceExceptions option to see what has an effect on it. It might
also be interesting to try the modified ReferenceHandler (the 

Re: Analysis on JDK-8022321 java/lang/ref/OOMEInReferenceHandler.java fails intermittently

2014-01-16 Thread srikalyan chandrashekar
Hi David, the disassembled code is also attached to the bug. Per my 
analysis the exception was thrown when Reference Handler was on line 143 
as put in the earlier email.


--
Thanks
kalyan

On 1/16/14 6:16 PM, David Holmes wrote:

On 17/01/2014 4:48 AM, srikalyan wrote:

Hi David

On 1/15/14, 9:04 PM, David Holmes wrote:

On 16/01/2014 10:19 AM, srikalyan chandrashekar wrote:
Hi Peter/David, we could finally get a trace of exception with fastdebug
build and ReferenceHandler modified (with runImpl() added and called
from run()). The logs, disassembled code is available in JIRA
https://bugs.openjdk.java.net/browse/JDK-8022321 as attachments.


All I can see is the log for the OOMECatchingTest program not one for
the actual ReferenceHandler ??


Please search for ReferenceHandler in the log.

Observations from the log:

Root Cause:
1) UncaughtException is being dispatched from Reference.java:143
141   Reference<Object> r;
142   synchronized (lock) {
143       if (pending != null) {
144           r = pending;
145           pending = r.discovered;
146           r.discovered = null;

pending field in Reference is touched and updated by the collector, so
at line 143 when the execution context is in Reference handler there
might have been an Exception pending due to allocation done by collector
which causes ReferenceHandler thread to die.


Sorry but the GC does not trigger asynchronous exceptions so this
explanation does not make any sense to me. What part of the log led
you to this conclusion?

-- Log Excerpt begins --
Exception a 'java/lang/OutOfMemoryError' (0xff7808e8) thrown
[/home/srikalyc/work/ora2013/infracleanup/jdk8/hotspot/src/share/vm/gc_interface/collectedHeap.inline.hpp, line 168]
for thread 0x7feed80cf800
Exception a 'java/lang/OutOfMemoryError' (0xff7808e8)
  thrown in interpreter method {method} {0x7feeddd3c600} 'runImpl' '()V' in 'java/lang/ref/Reference$ReferenceHandler'
  at bci 65 for thread 0x7feed80cf800
Exception a 'java/lang/OutOfMemoryError' (0xff7808e8)
  thrown in interpreter method {method} {0x7feeddd3c478} 'run' '()V' in 'java/lang/ref/Reference$ReferenceHandler'
  at bci 1 for thread 0x7feed80cf800
Exception a 'java/lang/OutOfMemoryError' (0xff780868) thrown
[/home/srikalyc/work/ora2013/infracleanup/jdk8/hotspot/src/share/vm/gc_interface/collectedHeap.inline.hpp, line 157]
for thread 0x7feed80cf800
Exception a 'java/lang/OutOfMemoryError' (0xff780868)
  thrown in interpreter method {method} {0x7feeddcaaf90} 'uncaughtException' '(Ljava/lang/Thread;Ljava/lang/Throwable;)V' in '
  at bci 48 for thread 0x7feed80cf800
Exception a 'java/lang/OutOfMemoryError' (0xff780868)
  thrown in interpreter method {method} {0x7feeddca7298} 'dispatchUncaughtException' '(Ljava/lang/Throwable;)V' in 'java/lang/
  at bci 6 for thread 0x7feed80cf800
-- Log Excerpt ends --
Sorry if it is a wrong understanding.


What you are seeing there is an OOME escaping the run() method which 
will cause the uncaughtExceptionHandler to be run which then triggers 
a second OOME (likely as it tries to report information about the 
first OOME). The first exception occurred in runImpl at BCI 65. Can 
you disassemble (javap -c) the class you used so we can see what is at 
BCI 65.
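
The mechanism described here, an Error escaping run() and being handed to the uncaught exception handler on the dying thread, can be seen in isolation with a sketch like the one below. UncaughtDemo and runAndCapture are hypothetical names; the OOME is simulated, and a per-thread handler is installed so that nothing depends on the default handler's printing.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical sketch; the thread here only stands in for the Reference Handler thread.
class UncaughtDemo {
    // Starts a thread whose run() lets an Error escape and returns what the handler saw.
    static Throwable runAndCapture() {
        AtomicReference<Throwable> seen = new AtomicReference<>();
        Thread t = new Thread(() -> {
            throw new OutOfMemoryError("simulated");
        }, "demo-handler");
        // Invoked on the dying thread itself, before it terminates.
        t.setUncaughtExceptionHandler((thread, error) -> seen.set(error));
        t.start();
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen.get();
    }

    public static void main(String[] args) {
        System.out.println("handler saw: " + runAndCapture());
    }
}
```

If the handler itself allocates while the heap is still full, it can fail with a second OOME, which would match the second exception in the log excerpt.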


Thanks,
David




Suggested fix:
- As proposed earlier putting an outer guard(try-catch on OOME) in the
ReferenceHandler will fix the issue, if ReferenceHandler is considered
as part of the GC sub system then it should be alive even in the midst
of an OOME so i feel that the additional guard should be allowed,
however i might still be ignorant of vital implications.
- Apart from the above changes, Peter's suggestion to create and call a
private runImpl() from run() in ReferenceHandler makes sense to me.


Why would we need this?

David
-



---
Thanks
kalyan

On 01/13/2014 03:57 PM, srikalyan wrote:


On 1/11/14, 6:15 AM, Peter Levart wrote:


On 01/10/2014 10:51 PM, srikalyan chandrashekar wrote:
Hi Peter the version you provided ran indefinitely(i put a 10 minute
timeout) and the program got interrupted(no error),


Did you run it with or without fastdebug and -XX:+TraceExceptions? If
with, it might be that fastdebug and/or -XX:+TraceExceptions changes
the execution a bit so that we can no longer reproduce the wrong
behaviour.

With fastdebug and -XX:+TraceExceptions. I will try combinations of
possible options (i.e. without -XX:+TraceExceptions on the debug build etc.)
soon.



even if there were to be an error you cannot print the string of
thread to console(these have been attempted earlier).


...it has been attempted to print toString in uncaught exception
handler. At that time, the heap is still full. I'm printing it after
the GC has cleared the heap. You can try that it works by 

Re: Analysis on JDK-8022321 java/lang/ref/OOMEInReferenceHandler.java fails intermittently

2014-01-16 Thread David Holmes

On 17/01/2014 1:31 PM, srikalyan chandrashekar wrote:

Hi David, the disassembled code is also attached to the bug. Per my


Sorry missed that.


analysis the exception was thrown when Reference Handler was on line 143
as put in the earlier email.


But if the numbers in the disassembly match the BCI then 65 shows:

  65: instanceof    #11 // class sun/misc/Cleaner

which makes more sense, the runtime instanceof check might encounter an 
OOME condition. I wish there was some easy way to trace into the full 
call chain as TraceExceptions doesn't show you any runtime frames :(


Still, it is easy enough to check:

// Fast path for cleaners
boolean isCleaner = false;
try {
  isCleaner = r instanceof Cleaner;
} catch (OutOfMemoryError oome) {
  continue;
}

if (isCleaner) {
  ((Cleaner)r).clean();
  continue;
}

Thanks,
David


--
Thanks
kalyan

On 1/16/14 6:16 PM, David Holmes wrote:

On 17/01/2014 4:48 AM, srikalyan wrote:

Hi David

On 1/15/14, 9:04 PM, David Holmes wrote:

On 16/01/2014 10:19 AM, srikalyan chandrashekar wrote:

Hi Peter/David, we could finally get a trace of exception with
fastdebug
build and ReferenceHandler modified (with runImpl() added and called
from run()). The logs, disassembled code is available in JIRA
https://bugs.openjdk.java.net/browse/JDK-8022321 as attachments.


All I can see is the log for the OOMECatchingTest program not one for
the actual ReferenceHandler ??


Please search for ReferenceHandler in the log.

Observations from the log:

Root Cause:
1) UncaughtException is being dispatched from Reference.java:143
141   Reference<Object> r;
142   synchronized (lock) {
143       if (pending != null) {
144           r = pending;
145           pending = r.discovered;
146           r.discovered = null;

pending field in Reference is touched and updated by the collector, so
at line 143 when the execution context is in Reference handler there
might have been an Exception pending due to allocation done by
collector
which causes ReferenceHandler thread to die.


Sorry but the GC does not trigger asynchronous exceptions so this
explanation does not make any sense to me. What part of the log led
you to this conclusion?

-- Log Excerpt begins --
Exception a 'java/lang/OutOfMemoryError' (0xff7808e8)
thrown
[/home/srikalyc/work/ora2013/infracleanup/jdk8/hotspot/src/share/vm/gc_interface/collectedHeap.inline.hpp,

line 168]
for thread 0x7feed80cf800
Exception a 'java/lang/OutOfMemoryError' (0xff7808e8)
  thrown in interpreter method {method} {0x7feeddd3c600} 'runImpl'
'()V' in 'java/lang/ref/Reference$ReferenceHandler'
  at bci 65 for thread 0x7feed80cf800
Exception a 'java/lang/OutOfMemoryError' (0xff7808e8)
  thrown in interpreter method {method} {0x7feeddd3c478} 'run'
'()V' in 'java/lang/ref/Reference$ReferenceHandler'
  at bci 1 for thread 0x7feed80cf800
Exception a 'java/lang/OutOfMemoryError' (0xff780868)
thrown
[/home/srikalyc/work/ora2013/infracleanup/jdk8/hotspot/src/share/vm/gc_interface/collectedHeap.inline.hpp,

line 157]
for thread 0x7feed80cf800
Exception a 'java/lang/OutOfMemoryError' (0xff780868)
  thrown in interpreter method {method} {0x7feeddcaaf90}
'uncaughtException' '(Ljava/lang/Thread;Ljava/lang/Throwable;)V' in '
  at bci 48 for thread 0x7feed80cf800
Exception a 'java/lang/OutOfMemoryError' (0xff780868)
  thrown in interpreter method {method} {0x7feeddca7298}
'dispatchUncaughtException' '(Ljava/lang/Throwable;)V' in 'java/lang/
  at bci 6 for thread 0x7feed80cf800
-- Log Excerpt ends --
Sorry if it is a wrong understanding.


What you are seeing there is an OOME escaping the run() method which
will cause the uncaughtExceptionHandler to be run which then triggers
a second OOME (likely as it tries to report information about the
first OOME). The first exception occurred in runImpl at BCI 65. Can
you disassemble (javap -c) the class you used so we can see what is at
BCI 65.

Thanks,
David




Suggested fix:
- As proposed earlier putting an outer guard(try-catch on OOME) in the
ReferenceHandler will fix the issue, if ReferenceHandler is considered
as part of the GC sub system then it should be alive even in the midst
of an OOME so i feel that the additional guard should be allowed,
however i might still be ignorant of vital implications.
- Apart from the above changes, Peter's suggestion to create and
call a
private runImpl() from run() in ReferenceHandler makes sense to me.


Why would we need this?

David
-



---
Thanks
kalyan

On 01/13/2014 03:57 PM, srikalyan wrote:


On 1/11/14, 6:15 AM, Peter Levart wrote:


On 01/10/2014 10:51 PM, srikalyan chandrashekar wrote:

Hi Peter the version you provided ran indefinitely(i put a 10
minute
timeout) and the program got interrupted(no error),


Did you 

Re: Analysis on JDK-8022321 java/lang/ref/OOMEInReferenceHandler.java fails intermittently

2014-01-16 Thread srikalyan chandrashekar

On 1/16/14 8:38 PM, David Holmes wrote:

On 17/01/2014 1:31 PM, srikalyan chandrashekar wrote:

Hi David, the disassembled code is also attached to the bug. Per my


Sorry missed that.


analysis the exception was thrown when Reference Handler was on line 143
as put in the earlier email.


But if the numbers in the disassembly match the BCI then 65 shows:

  65: instanceof    #11 // class sun/misc/Cleaner

which makes more sense, the runtime instanceof check might encounter 
an OOME condition. I wish there was some easy way to trace into the 
full call chain as TraceExceptions doesn't show you any runtime frames :(


Still, it is easy enough to check:

// Fast path for cleaners
boolean isCleaner = false;
try {
  isCleaner = r instanceof Cleaner;
} catch (OutOfMemoryError oome) {
  continue;
}
I will get this into a build and give it a shot soon. In the log, bci 6
and bci 48 are where dispatchUncaughtException and uncaughtException are
raised (please correct me if I am wrong); I assumed they come from the
ReferenceHandler thread because the log shows the same thread id,
0x7feed80cf800.


if (isCleaner) {
  ((Cleaner)r).clean();
  continue;
}

Thanks,
David


--
Thanks
kalyan

On 1/16/14 6:16 PM, David Holmes wrote:

On 17/01/2014 4:48 AM, srikalyan wrote:

Hi David

On 1/15/14, 9:04 PM, David Holmes wrote:

On 16/01/2014 10:19 AM, srikalyan chandrashekar wrote:

Hi Peter/David, we could finally get a trace of exception with
fastdebug
build and ReferenceHandler modified (with runImpl() added and called
from run()). The logs, disassembled code is available in JIRA
https://bugs.openjdk.java.net/browse/JDK-8022321 as attachments.


All I can see is the log for the OOMECatchingTest program not one for
the actual ReferenceHandler ??


Please search for ReferenceHandler in the log.

Observations from the log:

Root Cause:
1) UncaughtException is being dispatched from Reference.java:143
141   Reference<Object> r;
142   synchronized (lock) {
143       if (pending != null) {
144           r = pending;
145           pending = r.discovered;
146           r.discovered = null;

pending field in Reference is touched and updated by the collector, so
at line 143 when the execution context is in Reference handler there
might have been an Exception pending due to allocation done by
collector
which causes ReferenceHandler thread to die.


Sorry but the GC does not trigger asynchronous exceptions so this
explanation does not make any sense to me. What part of the log led
you to this conclusion?

-- Log Excerpt begins --
Exception a 'java/lang/OutOfMemoryError' (0xff7808e8)
thrown
[/home/srikalyc/work/ora2013/infracleanup/jdk8/hotspot/src/share/vm/gc_interface/collectedHeap.inline.hpp,
line 168]
for thread 0x7feed80cf800
Exception a 'java/lang/OutOfMemoryError' (0xff7808e8)
  thrown in interpreter method {method} {0x7feeddd3c600} 'runImpl'
'()V' in 'java/lang/ref/Reference$ReferenceHandler'
  at bci 65 for thread 0x7feed80cf800
Exception a 'java/lang/OutOfMemoryError' (0xff7808e8)
  thrown in interpreter method {method} {0x7feeddd3c478} 'run'
'()V' in 'java/lang/ref/Reference$ReferenceHandler'
  at bci 1 for thread 0x7feed80cf800
Exception a 'java/lang/OutOfMemoryError' (0xff780868)
thrown
[/home/srikalyc/work/ora2013/infracleanup/jdk8/hotspot/src/share/vm/gc_interface/collectedHeap.inline.hpp,
line 157]
for thread 0x7feed80cf800
Exception a 'java/lang/OutOfMemoryError' (0xff780868)
  thrown in interpreter method {method} {0x7feeddcaaf90}
'uncaughtException' '(Ljava/lang/Thread;Ljava/lang/Throwable;)V' in '
  at bci 48 for thread 0x7feed80cf800
Exception a 'java/lang/OutOfMemoryError' (0xff780868)
  thrown in interpreter method {method} {0x7feeddca7298}
'dispatchUncaughtException' '(Ljava/lang/Throwable;)V' in 'java/lang/
  at bci 6 for thread 0x7feed80cf800
-- Log Excerpt ends --
Sorry if it is a wrong understanding.


What you are seeing there is an OOME escaping the run() method which
will cause the uncaughtExceptionHandler to be run which then triggers
a second OOME (likely as it tries to report information about the
first OOME). The first exception occurred in runImpl at BCI 65. Can
you disassemble (javap -c) the class you used so we can see what is at
BCI 65.

Thanks,
David




Suggested fix:
- As proposed earlier putting an outer guard(try-catch on OOME) in the
ReferenceHandler will fix the issue, if ReferenceHandler is considered
as part of the GC sub system then it should be alive even in the midst
of an OOME so i feel that the additional guard should be allowed,
however i might still be ignorant of vital implications.
- Apart from the above changes, Peter's suggestion to create and
call a
private runImpl() from run() in ReferenceHandler makes sense to me.


Why 

Re: Analysis on JDK-8022321 java/lang/ref/OOMEInReferenceHandler.java fails intermittently

2014-01-15 Thread srikalyan chandrashekar
Hi Peter/David, we could finally get a trace of the exception with a
fastdebug build and a modified ReferenceHandler (with runImpl() added and
called from run()). The logs and disassembled code are available in JIRA
https://bugs.openjdk.java.net/browse/JDK-8022321 as attachments.


Observations from the log:

Root Cause:
1) UncaughtException is being dispatched from Reference.java:143
141   Reference<Object> r;
142   synchronized (lock) {
143       if (pending != null) {
144           r = pending;
145           pending = r.discovered;
146           r.discovered = null;

pending field in Reference is touched and updated by the collector, so 
at line 143 when the execution context is in Reference handler there 
might have been an Exception pending due to allocation done by collector 
which causes ReferenceHandler thread to die.


Suggested fix:
- As proposed earlier putting an outer guard(try-catch on OOME) in the 
ReferenceHandler will fix the issue, if ReferenceHandler is considered 
as part of the GC sub system then it should be alive even in the midst 
of an OOME so i feel that the additional guard should be allowed, 
however i might still be ignorant of vital implications.
- Apart from the above changes, Peter's suggestion to create and call a 
private runImpl() from run() in ReferenceHandler makes sense to me.
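
For illustration only, here is a minimal sketch of the "outer guard" idea described above. It is not the JDK's Reference$ReferenceHandler source: the class name GuardedHandlerSketch, the iteration bound, and the simulated OutOfMemoryError are all assumptions made so that the example terminates and can be run anywhere. It only shows that a try-catch on OOME around a private runImpl() keeps the thread alive across repeated errors.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a handler thread; NOT the JDK implementation.
final class GuardedHandlerSketch extends Thread {
    private final int iterations;               // bounded so the demo terminates
    final AtomicInteger completed = new AtomicInteger();
    final AtomicInteger survivedOomes = new AtomicInteger();

    GuardedHandlerSketch(int iterations) {
        super("guarded-handler-sketch");
        this.iterations = iterations;
    }

    @Override
    public void run() {
        for (int i = 0; i < iterations; i++) {
            try {
                runImpl(i);
                completed.incrementAndGet();
            } catch (OutOfMemoryError oome) {
                // Outer guard: swallow the error and carry on with the next
                // iteration instead of letting it escape run().
                survivedOomes.incrementAndGet();
            }
        }
    }

    // Simulates work that fails on every even iteration. A real handler
    // would process a pending reference here.
    private void runImpl(int i) {
        if (i % 2 == 0) {
            throw new OutOfMemoryError("simulated");
        }
    }

    // Runs the sketch to completion and returns {completed, survivedOomes}.
    static int[] demo(int iterations) {
        GuardedHandlerSketch t = new GuardedHandlerSketch(iterations);
        t.start();
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return new int[] { t.completed.get(), t.survivedOomes.get() };
    }

    public static void main(String[] args) {
        int[] r = demo(5);
        System.out.println("completed=" + r[0] + " survivedOomes=" + r[1]);
    }
}
```

Whether swallowing the error like this is acceptable in the real ReferenceHandler is exactly the open question in this thread; the sketch takes no position on it.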



---
Thanks
kalyan

On 01/13/2014 03:57 PM, srikalyan wrote:


On 1/11/14, 6:15 AM, Peter Levart wrote:


On 01/10/2014 10:51 PM, srikalyan chandrashekar wrote:
Hi Peter the version you provided ran indefinitely(i put a 10 minute 
timeout) and the program got interrupted(no error),


Did you run it with or without fastdebug and -XX:+TraceExceptions? If
with, it might be that fastdebug and/or -XX:+TraceExceptions changes 
the execution a bit so that we can no longer reproduce the wrong 
behaviour.
With fastdebug and -XX:+TraceExceptions. I will try combinations of
possible options (i.e. without -XX:+TraceExceptions on a debug build, etc.) soon.


even if there were to be an error you cannot print the string of 
thread to console(these have been attempted earlier).


...it has been attempted to print toString in uncaught exception 
handler. At that time, the heap is still full. I'm printing it after 
the GC has cleared the heap. You can try that it works by commenting 
out the try { and corresponding } catch (OOME x) {} exception 
handler...
Since there is a GC call prior to printing string i will give that a 
shot with non-debug build.


- The test is running in interpreter mode; what I am watching for is
one error with a trace. Without the fastdebug build and
-XX:+TraceExceptions I am able to reproduce at least 5
failures out of 1000 runs, but with fastdebug+Trace no luck
yet (already past a few thousand runs).


It might be interesting to try with a fastdebug build but without the
-XX:+TraceExceptions option to see what has an effect on it. It might 
also be interesting to try the modified ReferenceHandler (the one 
with private runImpl() method called from run()) and with normal 
non-fastdebug JDK. This info might be useful when one starts to 
inspect the exception handling code in interpreter...


Regards, Peter



--
Thanks
kalyan
Ph: (408)-585-8040



---
Thanks
kalyan

On 01/10/2014 02:57 AM, Peter Levart wrote:

On 01/10/2014 09:31 AM, Peter Levart wrote:
Since we suspect there's something wrong with exception handling 
in interpreter, I devised a hypothetical reproducer that tries to 
simulate ReferenceHandler in many aspects, but doesn't require to 
be a ReferenceHandler:


http://cr.openjdk.java.net/~plevart/misc/OOME/OOMECatchingTest.java

This is designed to run indefinitely and only terminate if/when 
thread dies. Could you run this program in the environment that 
causes the OOMEInReferenceHandler test to fail and see if it 
terminates?


I forgot to mention that in order for this long-running program to 
exhibit interpreter behaviour, it should be run with -Xint option. 
So I suggest:


-Xmx24M -XX:-UseTLAB -Xint

Regards, Peter









Re: Analysis on JDK-8022321 java/lang/ref/OOMEInReferenceHandler.java fails intermittently

2014-01-15 Thread David Holmes

On 16/01/2014 10:19 AM, srikalyan chandrashekar wrote:

Hi Peter/David, we could finally get a trace of exception with fastdebug
build and ReferenceHandler modified (with runImpl() added and called
from run()). The logs, disassembled code is available in JIRA
https://bugs.openjdk.java.net/browse/JDK-8022321 as attachments.


All I can see is the log for the OOMECatchingTest program not one for 
the actual ReferenceHandler ??



Observations from the log:

Root Cause:
1) UncaughtException is being dispatched from Reference.java:143
141   Reference<Object> r;
142   synchronized (lock) {
143       if (pending != null) {
144           r = pending;
145           pending = r.discovered;
146           r.discovered = null;

pending field in Reference is touched and updated by the collector, so
at line 143 when the execution context is in Reference handler there
might have been an Exception pending due to allocation done by collector
which causes ReferenceHandler thread to die.


Sorry but the GC does not trigger asynchronous exceptions so this 
explanation does not make any sense to me. What part of the log led you 
to this conclusion?



Suggested fix:
- As proposed earlier putting an outer guard(try-catch on OOME) in the
ReferenceHandler will fix the issue, if ReferenceHandler is considered
as part of the GC sub system then it should be alive even in the midst
of an OOME so i feel that the additional guard should be allowed,
however i might still be ignorant of vital implications.
- Apart from the above changes, Peter's suggestion to create and call a
private runImpl() from run() in ReferenceHandler makes sense to me.


Why would we need this?

David
-



---
Thanks
kalyan

On 01/13/2014 03:57 PM, srikalyan wrote:


On 1/11/14, 6:15 AM, Peter Levart wrote:


On 01/10/2014 10:51 PM, srikalyan chandrashekar wrote:

Hi Peter the version you provided ran indefinitely(i put a 10 minute
timeout) and the program got interrupted(no error),


Did you run it with or without fastdebug and -XX:+TraceExceptions? If
with, it might be that fastdebug and/or -XX:+TraceExceptions changes
the execution a bit so that we can no longer reproduce the wrong
behaviour.

With fastdebug and -XX:+TraceExceptions. I will try combinations of
possible options (i.e. without -XX:+TraceExceptions on a debug build, etc.) soon.



even if there were to be an error you cannot print the string of
thread to console(these have been attempted earlier).


...it has been attempted to print toString in uncaught exception
handler. At that time, the heap is still full. I'm printing it after
the GC has cleared the heap. You can try that it works by commenting
out the try { and corresponding } catch (OOME x) {} exception
handler...

Since there is a GC call prior to printing string i will give that a
shot with non-debug build.



- The test is running in interpreter mode; what I am watching for is
one error with a trace. Without the fastdebug build and
-XX:+TraceExceptions I am able to reproduce at least 5
failures out of 1000 runs, but with fastdebug+Trace no luck
yet (already past a few thousand runs).


It might be interesting to try with a fastdebug build but without the
-XX:+TraceExceptions option to see what has an effect on it. It might
also be interesting to try the modified ReferenceHandler (the one
with private runImpl() method called from run()) and with normal
non-fastdebug JDK. This info might be useful when one starts to
inspect the exception handling code in interpreter...

Regards, Peter



--
Thanks
kalyan
Ph: (408)-585-8040



---
Thanks
kalyan

On 01/10/2014 02:57 AM, Peter Levart wrote:

On 01/10/2014 09:31 AM, Peter Levart wrote:

Since we suspect there's something wrong with exception handling
in interpreter, I devised a hypothetical reproducer that tries to
simulate ReferenceHandler in many aspects, but doesn't require to
be a ReferenceHandler:

http://cr.openjdk.java.net/~plevart/misc/OOME/OOMECatchingTest.java

This is designed to run indefinitely and only terminate if/when
thread dies. Could you run this program in the environment that
causes the OOMEInReferenceHandler test to fail and see if it
terminates?


I forgot to mention that in order for this long-running program to
exhibit interpreter behaviour, it should be run with -Xint option.
So I suggest:

-Xmx24M -XX:-UseTLAB -Xint

Regards, Peter









Re: Analysis on JDK-8022321 java/lang/ref/OOMEInReferenceHandler.java fails intermittently

2014-01-13 Thread srikalyan


On 1/11/14, 6:15 AM, Peter Levart wrote:


On 01/10/2014 10:51 PM, srikalyan chandrashekar wrote:
Hi Peter the version you provided ran indefinitely(i put a 10 minute 
timeout) and the program got interrupted(no error),


Did you run it with or without fastdebug and -XX:+TraceExceptions? If
with, it might be that fastdebug and/or -XX:+TraceExceptions changes 
the execution a bit so that we can no longer reproduce the wrong 
behaviour.
With fastdebug and -XX:+TraceExceptions. I will try combinations of possible
options (i.e. without -XX:+TraceExceptions on a debug build, etc.) soon.


even if there were to be an error you cannot print the string of 
thread to console(these have been attempted earlier).


...it has been attempted to print toString in uncaught exception 
handler. At that time, the heap is still full. I'm printing it after 
the GC has cleared the heap. You can try that it works by commenting 
out the try { and corresponding } catch (OOME x) {} exception 
handler...
Since there is a GC call prior to printing string i will give that a 
shot with non-debug build.


- The test is running in interpreter mode; what I am watching for is
one error with a trace. Without the fastdebug build and
-XX:+TraceExceptions I am able to reproduce at least 5
failures out of 1000 runs, but with fastdebug+Trace no luck
yet (already past a few thousand runs).


It might be interesting to try with a fastdebug build but without the
-XX:+TraceExceptions option to see what has an effect on it. It might 
also be interesting to try the modified ReferenceHandler (the one with 
private runImpl() method called from run()) and with normal 
non-fastdebug JDK. This info might be useful when one starts to 
inspect the exception handling code in interpreter...


Regards, Peter



--
Thanks
kalyan
Ph: (408)-585-8040



---
Thanks
kalyan

On 01/10/2014 02:57 AM, Peter Levart wrote:

On 01/10/2014 09:31 AM, Peter Levart wrote:
Since we suspect there's something wrong with exception handling in 
interpreter, I devised a hypothetical reproducer that tries to 
simulate ReferenceHandler in many aspects, but doesn't require to 
be a ReferenceHandler:


http://cr.openjdk.java.net/~plevart/misc/OOME/OOMECatchingTest.java

This is designed to run indefinitely and only terminate if/when 
thread dies. Could you run this program in the environment that 
causes the OOMEInReferenceHandler test to fail and see if it 
terminates?


I forgot to mention that in order for this long-running program to 
exhibit interpreter behaviour, it should be run with -Xint option. 
So I suggest:


-Xmx24M -XX:-UseTLAB -Xint

Regards, Peter







Re: Analysis on JDK-8022321 java/lang/ref/OOMEInReferenceHandler.java fails intermittently

2014-01-11 Thread Peter Levart


On 01/10/2014 10:51 PM, srikalyan chandrashekar wrote:
Hi Peter the version you provided ran indefinitely(i put a 10 minute 
timeout) and the program got interrupted(no error),


Did you run it with or without fastdebug and -XX:+TraceExceptions? If
with, it might be that fastdebug and/or -XX:+TraceExceptions changes the 
execution a bit so that we can no longer reproduce the wrong behaviour.


even if there were to be an error you cannot print the string of 
thread to console(these have been attempted earlier).


...it has been attempted to print toString in uncaught exception 
handler. At that time, the heap is still full. I'm printing it after the 
GC has cleared the heap. You can try that it works by commenting out the 
try { and corresponding } catch (OOME x) {} exception handler...


- The test is running in interpreter mode; what I am watching for is
one error with a trace. Without the fastdebug build and -XX:+TraceExceptions
I am able to reproduce at least 5 failures out of 1000 runs, but
with fastdebug+Trace no luck yet (already past a few thousand runs).


It might be interesting to try with a fastdebug build but without the
-XX:+TraceExceptions option to see what has an effect on it. It might 
also be interesting to try the modified ReferenceHandler (the one with 
private runImpl() method called from run()) and with normal 
non-fastdebug JDK. This info might be useful when one starts to inspect 
the exception handling code in interpreter...


Regards, Peter



---
Thanks
kalyan

On 01/10/2014 02:57 AM, Peter Levart wrote:

On 01/10/2014 09:31 AM, Peter Levart wrote:
Since we suspect there's something wrong with exception handling in 
interpreter, I devised a hypothetical reproducer that tries to 
simulate ReferenceHandler in many aspects, but doesn't require to be 
a ReferenceHandler:


http://cr.openjdk.java.net/~plevart/misc/OOME/OOMECatchingTest.java

This is designed to run indefinitely and only terminate if/when 
thread dies. Could you run this program in the environment that 
causes the OOMEInReferenceHandler test to fail and see if it 
terminates?


I forgot to mention that in order for this long-running program to 
exhibit interpreter behaviour, it should be run with -Xint option. So 
I suggest:


-Xmx24M -XX:-UseTLAB -Xint

Regards, Peter







Re: Analysis on JDK-8022321 java/lang/ref/OOMEInReferenceHandler.java fails intermittently

2014-01-10 Thread Peter Levart


On 01/10/2014 12:59 AM, srikalyan chandrashekar wrote:
David/Peter, you are right: the log trace came from a passing run. I am
trying to simulate the failure and get the logs for failed runs (2000+
runs done and still no failure); I will get back to you once I have the
data from a failed run. Sorry for the confusion.


I doubt the logs will be any different. A simple test that throws an 
exception inside Thread.run() without catching it shows that 
TraceExceptions doesn't report the fact that Thread.run() terminates 
abruptly (as David pointed out, pending exception is reported after 
every bytecode executed and there's no bytecode that invoked Thread.run()).
While you're testing, could you also test the modified
ReferenceHandler (the one that calls private runImpl() from its run()
method) so that we get a proof of incorrect behaviour.
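
A minimal sketch of the kind of "simple test" mentioned above, in which an exception escapes run() uncaught. The class name UncaughtInRunSketch and the simulated OutOfMemoryError are assumptions for illustration, and this is not the OOMECatchingTest program. It shows only the Java-level behaviour: the thread terminates and the throwable is handed to the thread's uncaught exception handler. It says nothing about what -XX:+TraceExceptions prints.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical demo class; not part of the JDK or of the test under discussion.
final class UncaughtInRunSketch {

    // Starts a thread whose run() throws without catching, waits for it to
    // die, and returns whatever the uncaught exception handler received.
    static Throwable runAndCapture() {
        AtomicReference<Throwable> seen = new AtomicReference<>();
        Thread t = new Thread(() -> {
            // Simulated error: no heap is actually exhausted here.
            throw new OutOfMemoryError("simulated");
        }, "uncaught-sketch");
        // Record the throwable instead of printing it, so the handler does
        // no string formatting while it runs.
        t.setUncaughtExceptionHandler((thread, ex) -> seen.set(ex));
        t.start();
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen.get();
    }

    public static void main(String[] args) {
        Throwable ex = runAndCapture();
        System.out.println("handler received: " + ex);
    }
}
```

Recording the throwable and inspecting it after join() mirrors the approach described in this thread of printing only after the thread has died, rather than formatting a message inside the handler while the heap may still be full.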


Since we suspect there's something wrong with exception handling in 
interpreter, I devised a hypothetical reproducer that tries to simulate 
ReferenceHandler in many aspects, but doesn't require to be a 
ReferenceHandler:


http://cr.openjdk.java.net/~plevart/misc/OOME/OOMECatchingTest.java

This is designed to run indefinitely and only terminate if/when thread 
dies. Could you run this program in the environment that causes the 
OOMEInReferenceHandler test to fail and see if it terminates?



Regards, Peter



---
Thanks
kalyan

On 01/08/2014 11:22 PM, David Holmes wrote:

Thanks Peter.

Kalyan: Can you confirm, as Peter asked, that the TraceExceptions 
output came from a failed run?


AFAICS the Trace info is printed after each bytecode where there is a 
pending exception - though I'm not 100% sure on the printing within 
the VM runtime. Based on that I think we see the Trace output in 
run() at the point where wait() returns, so it may well be caught 
after that - in which case this was not a failing run.


I also can't reproduce the problem :(

David

On 8/01/2014 10:34 PM, Peter Levart wrote:

On 01/08/2014 07:30 AM, David Holmes wrote:

On 8/01/2014 4:19 PM, David Holmes wrote:

On 8/01/2014 7:33 AM, srikalyan chandrashekar wrote:
Hi David, TraceExceptions with fastdebug build produced some nice trace
http://cr.openjdk.java.net/%7Esrikchan/OOME_exception_trace.log . The
native method wait(long) is where the OOME is being thrown, the deepest
call is in

src/share/vm/gc_interface/collectedHeap.inline.hpp, line 157


Yes but it is the caller that is of interest:

Exception a 'java/lang/OutOfMemoryError' (0xd6a01840)
thrown
[/HUDSON/workspace/8-2-build-linux-amd64/jdk8/1317/hotspot/src/share/vm/runtime/objectMonitor.cpp,
line 1649]
for thread 0x7f78c40d2800
Exception a 'java/lang/OutOfMemoryError' (0xd6a01840)
  thrown in interpreter method {method} {0x7f78b4800ae0} 'wait'
'(J)V' in 'java/lang/Object'
  at bci 0 for thread 0x7f78c40d2800

The ReferenceHandler thread gets the OOME trying to allocate the
InterruptedException.


However we already have a catch block around the wait() so how is this
OOME getting through? A bug in exception handling in the interpreter ??




Might be. And it may have something to do with the fact that the
Thread.run() method is the 1st call frame on the thread's stack (seems
like corner case). The last few meaningful TraceExceptions records are:


Exception a 'java/lang/OutOfMemoryError' (0xd6a01840)
thrown
[/HUDSON/workspace/8-2-build-linux-amd64/jdk8/1317/hotspot/src/share/vm/gc_interface/collectedHeap.inline.hpp,
line 157]
for thread 0x7f78c40d2800
Exception a 'java/lang/OutOfMemoryError' (0xd6a01840)
thrown
[/HUDSON/workspace/8-2-build-linux-amd64/jdk8/1317/hotspot/src/share/vm/runtime/objectMonitor.cpp,
line 1649]
for thread 0x7f78c40d2800
Exception a 'java/lang/OutOfMemoryError' (0xd6a01840)
  thrown in interpreter method {method} {0x7f78b4800ae0} 'wait'
'(J)V' in 'java/lang/Object'
  at bci 0 for thread 0x7f78c40d2800
Exception a 'java/lang/OutOfMemoryError' (0xd6a01840)
  thrown in interpreter method {method} {0x7f78b4800ca8} 'wait'
'()V' in 'java/lang/Object'
  at *bci 2* for thread 0x7f78c40d2800
Exception a 'java/lang/OutOfMemoryError' (0xd6a01840)
  thrown in interpreter method {method} {0x7f78b48d2250} 'run'
'()V' in 'java/lang/ref/Reference$ReferenceHandler'
  at *bci 36* for thread 0x7f78c40d2800


Here's the relevant bytecodes:


public class java.lang.Object

   public final void wait() throws java.lang.InterruptedException;
 descriptor: ()V
 flags: ACC_PUBLIC, ACC_FINAL
 Code:
   stack=3, locals=1, args_size=1
  0: aload_0
  1: lconst_0
* 2: invokevirtual #73 // Method wait:(J)V*
  5: return
   LineNumberTable:
 line 502: 0
 line 503: 5
 Exceptions:
   throws java.lang.InterruptedException


class java.lang.ref.Reference$ReferenceHandler extends java.lang.Thread

   public void run();
 descriptor: ()V
 

Re: Analysis on JDK-8022321 java/lang/ref/OOMEInReferenceHandler.java fails intermittently

2014-01-10 Thread Peter Levart

On 01/10/2014 09:31 AM, Peter Levart wrote:
Since we suspect there's something wrong with exception handling in 
interpreter, I devised a hypothetical reproducer that tries to 
simulate ReferenceHandler in many aspects, but doesn't require to be a 
ReferenceHandler:


http://cr.openjdk.java.net/~plevart/misc/OOME/OOMECatchingTest.java

This is designed to run indefinitely and only terminate if/when thread 
dies. Could you run this program in the environment that causes the 
OOMEInReferenceHandler test to fail and see if it terminates?


I forgot to mention that in order for this long-running program to 
exhibit interpreter behaviour, it should be run with -Xint option. So I 
suggest:


-Xmx24M -XX:-UseTLAB -Xint

Regards, Peter



Re: Analysis on JDK-8022321 java/lang/ref/OOMEInReferenceHandler.java fails intermittently

2014-01-10 Thread srikalyan chandrashekar
Hi Peter, the version you provided ran indefinitely (I put a 10 minute
timeout) and the program got interrupted (no error). Even if there were
to be an error, you cannot print the string of the thread to the console
(this has been attempted earlier).
- The test is running in interpreter mode; what I am watching for is one
error with a trace. Without the fastdebug build and -XX:+TraceExceptions I am
able to reproduce at least 5 failures out of 1000 runs, but with
fastdebug+Trace no luck yet (already past a few thousand runs).


---
Thanks
kalyan

On 01/10/2014 02:57 AM, Peter Levart wrote:

On 01/10/2014 09:31 AM, Peter Levart wrote:
Since we suspect there's something wrong with exception handling in 
interpreter, I devised a hypothetical reproducer that tries to 
simulate ReferenceHandler in many aspects, but doesn't require to be 
a ReferenceHandler:


http://cr.openjdk.java.net/~plevart/misc/OOME/OOMECatchingTest.java

This is designed to run indefinitely and only terminate if/when 
thread dies. Could you run this program in the environment that 
causes the OOMEInReferenceHandler test to fail and see if it terminates?


I forgot to mention that in order for this long-running program to 
exhibit interpreter behaviour, it should be run with -Xint option. So 
I suggest:


-Xmx24M -XX:-UseTLAB -Xint

Regards, Peter





Re: Analysis on JDK-8022321 java/lang/ref/OOMEInReferenceHandler.java fails intermittently

2014-01-09 Thread srikalyan chandrashekar
David/Peter, you are right: the log trace came from a passing run. I am
trying to simulate the failure and get the logs for failed runs (2000+
runs done and still no failure); I will get back to you once I have the
data from a failed run. Sorry for the confusion.


---
Thanks
kalyan

On 01/08/2014 11:22 PM, David Holmes wrote:

Thanks Peter.

Kalyan: Can you confirm, as Peter asked, that the TraceExceptions 
output came from a failed run?


AFAICS the Trace info is printed after each bytecode where there is a 
pending exception - though I'm not 100% sure on the printing within 
the VM runtime. Based on that I think we see the Trace output in run() 
at the point where wait() returns, so it may well be caught after that 
- in which case this was not a failing run.


I also can't reproduce the problem :(

David

On 8/01/2014 10:34 PM, Peter Levart wrote:

On 01/08/2014 07:30 AM, David Holmes wrote:

On 8/01/2014 4:19 PM, David Holmes wrote:

On 8/01/2014 7:33 AM, srikalyan chandrashekar wrote:
Hi David, TraceExceptions with fastdebug build produced some nice trace
http://cr.openjdk.java.net/%7Esrikchan/OOME_exception_trace.log . The
native method wait(long) is where the OOME is being thrown, the deepest
call is in

src/share/vm/gc_interface/collectedHeap.inline.hpp, line 157


Yes but it is the caller that is of interest:

Exception a 'java/lang/OutOfMemoryError' (0xd6a01840)
thrown
[/HUDSON/workspace/8-2-build-linux-amd64/jdk8/1317/hotspot/src/share/vm/runtime/objectMonitor.cpp,
line 1649]
for thread 0x7f78c40d2800
Exception a 'java/lang/OutOfMemoryError' (0xd6a01840)
  thrown in interpreter method {method} {0x7f78b4800ae0} 'wait'
'(J)V' in 'java/lang/Object'
  at bci 0 for thread 0x7f78c40d2800

The ReferenceHandler thread gets the OOME trying to allocate the
InterruptedException.


However we already have a catch block around the wait() so how is this
OOME getting through? A bug in exception handling in the interpreter ??



Might be. And it may have something to do with the fact that the
Thread.run() method is the 1st call frame on the thread's stack (seems
like corner case). The last few meaningful TraceExceptions records are:


Exception a 'java/lang/OutOfMemoryError' (0xd6a01840)
thrown
[/HUDSON/workspace/8-2-build-linux-amd64/jdk8/1317/hotspot/src/share/vm/gc_interface/collectedHeap.inline.hpp,
line 157]
for thread 0x7f78c40d2800
Exception a 'java/lang/OutOfMemoryError' (0xd6a01840)
thrown
[/HUDSON/workspace/8-2-build-linux-amd64/jdk8/1317/hotspot/src/share/vm/runtime/objectMonitor.cpp,
line 1649]
for thread 0x7f78c40d2800
Exception a 'java/lang/OutOfMemoryError' (0xd6a01840)
  thrown in interpreter method {method} {0x7f78b4800ae0} 'wait'
'(J)V' in 'java/lang/Object'
  at bci 0 for thread 0x7f78c40d2800
Exception a 'java/lang/OutOfMemoryError' (0xd6a01840)
  thrown in interpreter method {method} {0x7f78b4800ca8} 'wait'
'()V' in 'java/lang/Object'
  at *bci 2* for thread 0x7f78c40d2800
Exception a 'java/lang/OutOfMemoryError' (0xd6a01840)
  thrown in interpreter method {method} {0x7f78b48d2250} 'run'
'()V' in 'java/lang/ref/Reference$ReferenceHandler'
  at *bci 36* for thread 0x7f78c40d2800


Here's the relevant bytecodes:


public class java.lang.Object

   public final void wait() throws java.lang.InterruptedException;
 descriptor: ()V
 flags: ACC_PUBLIC, ACC_FINAL
 Code:
   stack=3, locals=1, args_size=1
  0: aload_0
  1: lconst_0
* 2: invokevirtual #73 // Method wait:(J)V*
  5: return
   LineNumberTable:
 line 502: 0
 line 503: 5
 Exceptions:
   throws java.lang.InterruptedException


class java.lang.ref.Reference$ReferenceHandler extends java.lang.Thread

   public void run();
 descriptor: ()V
 flags: ACC_PUBLIC
 Code:
   stack=2, locals=5, args_size=1
  0: invokestatic  #62 // Method
java/lang/ref/Reference.access$100:()Ljava/lang/ref/Reference$Lock;
  3: dup
  4: astore_2
  5: monitorenter
  6: invokestatic  #61 // Method
java/lang/ref/Reference.access$200:()Ljava/lang/ref/Reference;
  9: ifnull  33
 12: invokestatic  #61 // Method
java/lang/ref/Reference.access$200:()Ljava/lang/ref/Reference;
 15: astore_1
 16: aload_1
 17: invokestatic  #64 // Method
java/lang/ref/Reference.access$300:(Ljava/lang/ref/Reference;)Ljava/lang/ref/Reference;
 20: invokestatic  #63 // Method
java/lang/ref/Reference.access$202:(Ljava/lang/ref/Reference;)Ljava/lang/ref/Reference;
 23: pop
 24: aload_1
 25: aconst_null
 26: invokestatic  #65 // Method
java/lang/ref/Reference.access$302:(Ljava/lang/ref/Reference;Ljava/lang/ref/Reference;)Ljava/lang/ref/Reference;
 29: pop

Re: Analysis on JDK-8022321 java/lang/ref/OOMEInReferenceHandler.java fails intermittently

2014-01-08 Thread Peter Levart

On 01/08/2014 07:30 AM, David Holmes wrote:

On 8/01/2014 4:19 PM, David Holmes wrote:

On 8/01/2014 7:33 AM, srikalyan chandrashekar wrote:

Hi David, TraceExceptions with fastdebug build produced some nice trace
http://cr.openjdk.java.net/%7Esrikchan/OOME_exception_trace.log . The
native method wait(long) is where the OOME is being thrown, the deepest
call is in

src/share/vm/gc_interface/collectedHeap.inline.hpp, line 157


Yes but it is the caller that is of interest:

Exception a 'java/lang/OutOfMemoryError' (0xd6a01840)
thrown
[/HUDSON/workspace/8-2-build-linux-amd64/jdk8/1317/hotspot/src/share/vm/runtime/objectMonitor.cpp, 


line 1649]
for thread 0x7f78c40d2800
Exception a 'java/lang/OutOfMemoryError' (0xd6a01840)
  thrown in interpreter method {method} {0x7f78b4800ae0} 'wait'
'(J)V' in 'java/lang/Object'
  at bci 0 for thread 0x7f78c40d2800

The ReferenceHandler thread gets the OOME trying to allocate the
InterruptedException.


However we already have a catch block around the wait() so how is this 
OOME getting through? A bug in exception handling in the interpreter ??




Might be. And it may have something to do with the fact that the 
Thread.run() method is the 1st call frame on the thread's stack (seems 
like corner case). The last few meaningful TraceExceptions records are:



Exception a 'java/lang/OutOfMemoryError' (0xd6a01840)
thrown 
[/HUDSON/workspace/8-2-build-linux-amd64/jdk8/1317/hotspot/src/share/vm/gc_interface/collectedHeap.inline.hpp, 
line 157]

for thread 0x7f78c40d2800
Exception a 'java/lang/OutOfMemoryError' (0xd6a01840)
thrown 
[/HUDSON/workspace/8-2-build-linux-amd64/jdk8/1317/hotspot/src/share/vm/runtime/objectMonitor.cpp, 
line 1649]

for thread 0x7f78c40d2800
Exception a 'java/lang/OutOfMemoryError' (0xd6a01840)
 thrown in interpreter method {method} {0x7f78b4800ae0} 'wait' 
'(J)V' in 'java/lang/Object'

 at bci 0 for thread 0x7f78c40d2800
Exception a 'java/lang/OutOfMemoryError' (0xd6a01840)
 thrown in interpreter method {method} {0x7f78b4800ca8} 'wait' 
'()V' in 'java/lang/Object'

 at *bci 2* for thread 0x7f78c40d2800
Exception a 'java/lang/OutOfMemoryError' (0xd6a01840)
 thrown in interpreter method {method} {0x7f78b48d2250} 'run' 
'()V' in 'java/lang/ref/Reference$ReferenceHandler'

 at *bci 36* for thread 0x7f78c40d2800


Here's the relevant bytecodes:


public class java.lang.Object

  public final void wait() throws java.lang.InterruptedException;
descriptor: ()V
flags: ACC_PUBLIC, ACC_FINAL
Code:
  stack=3, locals=1, args_size=1
 0: aload_0
 1: lconst_0
* 2: invokevirtual #73 // Method wait:(J)V*
 5: return
  LineNumberTable:
line 502: 0
line 503: 5
Exceptions:
  throws java.lang.InterruptedException


class java.lang.ref.Reference$ReferenceHandler extends java.lang.Thread

  public void run();
descriptor: ()V
flags: ACC_PUBLIC
Code:
  stack=2, locals=5, args_size=1
 0: invokestatic  #62 // Method 
java/lang/ref/Reference.access$100:()Ljava/lang/ref/Reference$Lock;

 3: dup
 4: astore_2
 5: monitorenter
 6: invokestatic  #61 // Method 
java/lang/ref/Reference.access$200:()Ljava/lang/ref/Reference;

 9: ifnull  33
12: invokestatic  #61 // Method 
java/lang/ref/Reference.access$200:()Ljava/lang/ref/Reference;

15: astore_1
16: aload_1
17: invokestatic  #64 // Method 
java/lang/ref/Reference.access$300:(Ljava/lang/ref/Reference;)Ljava/lang/ref/Reference;
20: invokestatic  #63 // Method 
java/lang/ref/Reference.access$202:(Ljava/lang/ref/Reference;)Ljava/lang/ref/Reference;

23: pop
24: aload_1
25: aconst_null
26: invokestatic  #65 // Method 
java/lang/ref/Reference.access$302:(Ljava/lang/ref/Reference;Ljava/lang/ref/Reference;)Ljava/lang/ref/Reference;

29: pop
30: goto  52
33: invokestatic  #62 // Method 
java/lang/ref/Reference.access$100:()Ljava/lang/ref/Reference$Lock;
*36: invokevirtual #59 // Method 
java/lang/Object.wait:()V*

39: goto  43
42: astore_3
43: goto  47
46: astore_3
47: aload_2
48: monitorexit
49: goto  0
52: aload_2
53: monitorexit
54: goto  64
57: astore 4
59: aload_2
60: monitorexit
61: aload 4
63: athrow
64: aload_1
65: instanceof  #38 // class sun/misc/Cleaner
68: ifeq  81
71: aload_1
72: checkcast #38 // class sun/misc/Cleaner
75: invokevirtual #67 // Method 

Re: Analysis on JDK-8022321 java/lang/ref/OOMEInReferenceHandler.java fails intermittently

2014-01-08 Thread Sandeep Konchady
Kal,

Can you give access to Peter to the machine where you ran this test. Please 
send the details to him privately.

Thanks,
Sandeep

On Jan 8, 2014, at 12:08 PM, srikalyan chandrashekar 
srikalyan.chandrashe...@oracle.com wrote:

 Hi Peter, the jtreg test configuration is @run main/othervm -Xmx24M 
 -XX:-UseTLAB OOMEInReferenceHandler. Even with these options you still have 
 to run the test many times (around 1000 runs) to capture one or more failures.  
 The platform may not have an effect; however, I used a 64-bit Ubuntu 12.04 LTS, 
 8GB, 2-core workstation with either JDK (7 or 8).
 
 ---
 Thanks
 kalyan
 
 On 01/08/2014 05:53 AM, Peter Levart wrote:
 Hi Kalyan,
 
 What hardware/OS/JVM and what JVM options are you using to reproduce this 
 failure. I would really like to reproduce this myself, but all attempts on 
 my PC have so far been unsuccessful. I might be able to get access to a 
 machine that is similar to yours...
 
 Regards, Peter
 
 On 01/07/2014 09:55 PM, srikalyan chandrashekar wrote:
 Peter, attempts to get state info out (to the console or otherwise) from 
 within the Reference Handler's exception handlers have been unsuccessful.  
 However, David's suggestion produced a useful trace with a fastdebug build; 
 see the log here 
 http://cr.openjdk.java.net/%7Esrikchan/OOME_exception_trace.log .
 ---
 Thanks
 kalyan
 On 01/07/2014 12:42 AM, Peter Levart wrote:
 On 01/07/2014 03:15 AM, srikalyan chandrashekar wrote:
 Sure David will give that a try, we have so far attempted to
 1. Print state data(as per the test creator peter.levart's inputs),
 
 Hi Kalyan,
 
 Have you been able to reproduce the OOME in that set-up? What was the 
 result?
 
 Regards, Peter
 
 2. Use UEH(uncaught exception handler per Mandy's inputs)
 
 -- 
 Thanks
 kalyan
 
 On 1/6/14 4:40 PM, David Holmes wrote:
 Back from vacation ...
 
 On 20/12/2013 4:49 PM, David Holmes wrote:
 On 20/12/2013 12:57 PM, srikalyan chandrashekar wrote:
 Hi David Thanks for your comments, the unguarded part(clean and 
 enqueue)
 in the Reference Handler thread does not seem to create any new 
 objects,
 so it is the application(the test in this case) which is adding objects
 to heap and causing the Reference Handler to die with OOME.
 
 The ReferenceHandler thread can only get OOME if it allocates (directly
 or indirectly) - so there has to be something in the unguarded part that
 causes this. Again it may be an implicit action in the VM - similar to
 the class load issue for InterruptedException.
 
 Run a debug VM with -XX:+TraceExceptions to see where the OOME is 
 triggered.
 
 David
 -
 
 David
 
 I am still
 unsure about the side effects of the code change and agree with your
 thoughts(on memory exhaustion test's reliability).
 
 PS: hotspot dev alias removed from CC.
 
 -- 
 Thanks
 kalyan
 
 On 12/19/13 5:08 PM, David Holmes wrote:
 Hi Kalyan,
 
 This is not a hotspot issue so I'm moving this to core-libs, please
 drop hotspot from any replies.
 
 On 20/12/2013 6:26 AM, srikalyan wrote:
 Hi all,  I have been working on the bug JDK-8022321
 https://bugs.openjdk.java.net/browse/JDK-8022321 , this is a 
 sporadic
 failure and the webrev is available here
 http://cr.openjdk.java.net/~srikchan/Regression/JDK-8022321_OOMEInReferenceHandler-webrev/
  
 
 
 
 I'm really not sure what to make of this. We have a test that triggers
 an out-of-memory condition but the OOME can actually turn up in the
 ReferenceHandler thread causing it to terminate and the test to fail.
 We previously accounted for the non-obvious occurrences of OOME due to
 the Object.wait and the possible need to load the InterruptedException
 class - but still the OOME can appear where we don't want it. So
 finally you have just placed the whole for(;;) loop in a
 try/catch(OOME) that ignores the OOME. I'm certain that makes the test
 happy, but I'm not sure it is really what we want for the
 ReferenceHandler thread. If the OOME occurs while cleaning, or
 enqueuing then we will fail to clean and/or enqueue but there would be
 no indication that has occurred and I think that is a bigger problem
 than this test failing.
 
 There may be no way to make this test 100% reliable. In fact I'd
 suggest that no memory exhaustion test can be 100% reliable.
 
 David
 
 *Root Cause: Still not known*
 2 places where there is a possibility for OOME
 1) Cleaner.clean()
 2) ReferenceQueue.enqueue()
 
 1)  The cleanup code in turn has 2 places where there is potential 
 for
 throwing OOME,
 a) thunk Thread which is run from clean() method. This Runnable 
 is
 passed to Cleaner and appears in the following classes
 java/nio/DirectByteBuffer.java
 sun/misc/Perf.java
 sun/nio/fs/NativeBuffer.java
 sun/nio/ch/IOVecWrapper.java
 sun/misc/Cleaner/ExitOnThrow.java
 However none of the above overridden implementations ever create an
 object in the clean() code.
 b) new PrivilegedAction created in try catch Exception block of
 clean() method but 

Re: Analysis on JDK-8022321 java/lang/ref/OOMEInReferenceHandler.java fails intermittently

2014-01-08 Thread David Holmes

Thanks Peter.

Kalyan: Can you confirm, as Peter asked, that the TraceExceptions output 
came from a failed run?


AFAICS the Trace info is printed after each bytecode where there is a 
pending exception - though I'm not 100% sure on the printing within the 
VM runtime. Based on that I think we see the Trace output in run() at 
the point where wait() returns, so it may well be caught after that - in 
which case this was not a failing run.


I also can't reproduce the problem :(

David
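
The reasoning above, that an exception reported as pending in run() may still be caught there, can be illustrated with a small sketch. The names PendingThenCaughtSketch, innerWait and outerWait are hypothetical, and the OutOfMemoryError is simulated rather than produced by real heap exhaustion. The error passes through two callee frames, reaches run() at the call site, and is then handled by the catch block covering that call site. It does not reproduce any TraceExceptions output.

```java
// Hypothetical illustration; the error is simulated, not a real heap exhaustion.
final class PendingThenCaughtSketch {
    // Stands in for wait(long), where the error originates.
    static void innerWait() {
        throw new OutOfMemoryError("simulated");
    }

    // Stands in for wait(): no handler here, so the error passes through this frame.
    static void outerWait() {
        innerWait();
    }

    // Stands in for run(): the error arrives at the call site below and is
    // caught by the handler that covers it.
    static String run() {
        try {
            outerWait();
            return "no exception";
        } catch (OutOfMemoryError e) {
            return "caught in run()";
        }
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

Under this reading, a trace record for run() at the invoking bci shows only that the error reached that frame, not that it escaped it.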

On 8/01/2014 10:34 PM, Peter Levart wrote:

On 01/08/2014 07:30 AM, David Holmes wrote:

On 8/01/2014 4:19 PM, David Holmes wrote:

On 8/01/2014 7:33 AM, srikalyan chandrashekar wrote:

Hi David, TraceExceptions with fastdebug build produced some nice trace
http://cr.openjdk.java.net/%7Esrikchan/OOME_exception_trace.log . The
native method wait(long) is where the OOME is being thrown, the deepest
call is in

src/share/vm/gc_interface/collectedHeap.inline.hpp, line 157


Yes but it is the caller that is of interest:

Exception a 'java/lang/OutOfMemoryError' (0xd6a01840)
thrown
[/HUDSON/workspace/8-2-build-linux-amd64/jdk8/1317/hotspot/src/share/vm/runtime/objectMonitor.cpp,

line 1649]
for thread 0x7f78c40d2800
Exception a 'java/lang/OutOfMemoryError' (0xd6a01840)
  thrown in interpreter method {method} {0x7f78b4800ae0} 'wait'
'(J)V' in 'java/lang/Object'
  at bci 0 for thread 0x7f78c40d2800

The ReferenceHandler thread gets the OOME trying to allocate the
InterruptedException.


However we already have a catch block around the wait() so how is this
OOME getting through? A bug in exception handling in the interpreter ??



Might be. And it may have something to do with the fact that the
Thread.run() method is the 1st call frame on the thread's stack (seems
like corner case). The last few meaningful TraceExceptions records are:


Exception a 'java/lang/OutOfMemoryError' (0xd6a01840)
thrown
[/HUDSON/workspace/8-2-build-linux-amd64/jdk8/1317/hotspot/src/share/vm/gc_interface/collectedHeap.inline.hpp,
line 157]
for thread 0x7f78c40d2800
Exception a 'java/lang/OutOfMemoryError' (0xd6a01840)
thrown
[/HUDSON/workspace/8-2-build-linux-amd64/jdk8/1317/hotspot/src/share/vm/runtime/objectMonitor.cpp,
line 1649]
for thread 0x7f78c40d2800
Exception a 'java/lang/OutOfMemoryError' (0xd6a01840)
  thrown in interpreter method {method} {0x7f78b4800ae0} 'wait'
'(J)V' in 'java/lang/Object'
  at bci 0 for thread 0x7f78c40d2800
Exception a 'java/lang/OutOfMemoryError' (0xd6a01840)
  thrown in interpreter method {method} {0x7f78b4800ca8} 'wait'
'()V' in 'java/lang/Object'
  at *bci 2* for thread 0x7f78c40d2800
Exception a 'java/lang/OutOfMemoryError' (0xd6a01840)
  thrown in interpreter method {method} {0x7f78b48d2250} 'run'
'()V' in 'java/lang/ref/Reference$ReferenceHandler'
  at *bci 36* for thread 0x7f78c40d2800


Here's the relevant bytecodes:


public class java.lang.Object

   public final void wait() throws java.lang.InterruptedException;
 descriptor: ()V
 flags: ACC_PUBLIC, ACC_FINAL
 Code:
   stack=3, locals=1, args_size=1
  0: aload_0
  1: lconst_0
* 2: invokevirtual #73 // Method wait:(J)V*
  5: return
   LineNumberTable:
 line 502: 0
 line 503: 5
 Exceptions:
   throws java.lang.InterruptedException


class java.lang.ref.Reference$ReferenceHandler extends java.lang.Thread

   public void run();
 descriptor: ()V
 flags: ACC_PUBLIC
 Code:
   stack=2, locals=5, args_size=1
  0: invokestatic  #62 // Method
java/lang/ref/Reference.access$100:()Ljava/lang/ref/Reference$Lock;
  3: dup
  4: astore_2
  5: monitorenter
  6: invokestatic  #61 // Method
java/lang/ref/Reference.access$200:()Ljava/lang/ref/Reference;
  9: ifnull  33
 12: invokestatic  #61 // Method
java/lang/ref/Reference.access$200:()Ljava/lang/ref/Reference;
 15: astore_1
 16: aload_1
 17: invokestatic  #64 // Method
java/lang/ref/Reference.access$300:(Ljava/lang/ref/Reference;)Ljava/lang/ref/Reference;
 20: invokestatic  #63 // Method
java/lang/ref/Reference.access$202:(Ljava/lang/ref/Reference;)Ljava/lang/ref/Reference;
 23: pop
 24: aload_1
 25: aconst_null
 26: invokestatic  #65 // Method
java/lang/ref/Reference.access$302:(Ljava/lang/ref/Reference;Ljava/lang/ref/Reference;)Ljava/lang/ref/Reference;
 29: pop
 30: goto  52
 33: invokestatic  #62 // Method
java/lang/ref/Reference.access$100:()Ljava/lang/ref/Reference$Lock;
*36: invokevirtual #59 // Method
java/lang/Object.wait:()V*
 39: goto  43
 42: astore_3
 43: goto  47
 46: astore_3
  

Re: Analysis on JDK-8022321 java/lang/ref/OOMEInReferenceHandler.java fails intermittently

2014-01-07 Thread Peter Levart

On 01/07/2014 03:15 AM, srikalyan chandrashekar wrote:

Sure David will give that a try, we have so far attempted to
1. Print state data(as per the test creator peter.levart's inputs),


Hi Kalyan,

Have you been able to reproduce the OOME in that set-up? What was the 
result?


Regards, Peter


2. Use UEH(uncaught exception handler per Mandy's inputs)

--
Thanks
kalyan

On 1/6/14 4:40 PM, David Holmes wrote:

Back from vacation ...

On 20/12/2013 4:49 PM, David Holmes wrote:

On 20/12/2013 12:57 PM, srikalyan chandrashekar wrote:
Hi David Thanks for your comments, the unguarded part(clean and 
enqueue)
in the Reference Handler thread does not seem to create any new 
objects,
so it is the application(the test in this case) which is adding 
objects

to heap and causing the Reference Handler to die with OOME.


The ReferenceHandler thread can only get OOME if it allocates (directly
or indirectly) - so there has to be something in the unguarded part 
that

causes this. Again it may be an implicit action in the VM - similar to
the class load issue for InterruptedException.


Run a debug VM with -XX:+TraceExceptions to see where the OOME is 
triggered.


David
-


David

I am still

unsure about the side effects of the code change and agree with your
thoughts(on memory exhaustion test's reliability).

PS: hotspot dev alias removed from CC.

--
Thanks
kalyan

On 12/19/13 5:08 PM, David Holmes wrote:

Hi Kalyan,

This is not a hotspot issue so I'm moving this to core-libs, please
drop hotspot from any replies.

On 20/12/2013 6:26 AM, srikalyan wrote:

Hi all,  I have been working on the bug JDK-8022321
https://bugs.openjdk.java.net/browse/JDK-8022321 , this is a 
sporadic

failure and the webrev is available here
http://cr.openjdk.java.net/~srikchan/Regression/JDK-8022321_OOMEInReferenceHandler-webrev/ 






I'm really not sure what to make of this. We have a test that 
triggers

an out-of-memory condition but the OOME can actually turn up in the
ReferenceHandler thread causing it to terminate and the test to fail.
We previously accounted for the non-obvious occurrences of OOME 
due to
the Object.wait and the possible need to load the 
InterruptedException

class - but still the OOME can appear where we don't want it. So
finally you have just placed the whole for(;;) loop in a
try/catch(OOME) that ignores the OOME. I'm certain that makes the 
test

happy, but I'm not sure it is really what we want for the
ReferenceHandler thread. If the OOME occurs while cleaning, or
enqueuing then we will fail to clean and/or enqueue but there 
would be

no indication that has occurred and I think that is a bigger problem
than this test failing.

There may be no way to make this test 100% reliable. In fact I'd
suggest that no memory exhaustion test can be 100% reliable.

David


*Root Cause: Still not known*
2 places where there is a possibility for OOME
1) Cleaner.clean()
2) ReferenceQueue.enqueue()

1)  The cleanup code in turn has 2 places where there is 
potential for

throwing OOME,
 a) thunk Thread which is run from clean() method. This 
Runnable is

passed to Cleaner and appears in the following classes
 java/nio/DirectByteBuffer.java
 sun/misc/Perf.java
 sun/nio/fs/NativeBuffer.java
 sun/nio/ch/IOVecWrapper.java
 sun/misc/Cleaner/ExitOnThrow.java
However none of the above overridden implementations ever create an
object in the clean() code.
 b) new PrivilegedAction created in try catch Exception block of
clean() method but for this object to be created and to be held
responsible for OOME an Exception(other than OOME) has to be thrown.

2) No new heap objects are created in the enqueue method nor
anywhere in
the deep call stack (VM.addFinalRefCount() etc) so this cannot be a
potential cause.

*Experimental change to java.lang.Reference.java* :
- Put one more guard (try catch with OOME block) in the Reference
Handler Thread which may give the Reference Handler a chance to
cleanup.
This is fixing the test failure (several 1000 runs with 0 failures)
- Without the above change the test fails at least 3-5 times in every
1000 runs.

*PS*: The code change is to a very critical part of the JDK and I am not
fully aware of the consequences of the change, hence I am seeking expert
help here. I appreciate your time and input on this.









Re: Analysis on JDK-8022321 java/lang/ref/OOMEInReferenceHandler.java fails intermittently

2014-01-07 Thread srikalyan chandrashekar
Peter, attempts to get state info out (to the console or otherwise) from 
within the Reference Handler's exception handlers have been unsuccessful.  
However, David's suggestion produced a useful trace with a fastdebug build; 
see the log here 
http://cr.openjdk.java.net/%7Esrikchan/OOME_exception_trace.log .


---
Thanks
kalyan
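
As a generic illustration of the uncaught-exception-handler (UEH) approach mentioned in this thread, the sketch below records what killed a thread. UehSketch and causeOfDeath are hypothetical names, and the OutOfMemoryError is simulated. The handler only stores the Throwable and does no printing or formatting; under real memory exhaustion even a small handler could fail to run, which may be one reason such attempts came back empty.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical sketch of observing thread death via an uncaught exception handler.
final class UehSketch {
    /** Runs body on a fresh thread and returns the Throwable that killed it, or null. */
    static Throwable causeOfDeath(Runnable body) {
        AtomicReference<Throwable> cause = new AtomicReference<>();
        Thread t = new Thread(body, "handler-like");
        // Keep the handler minimal: store the reference, allocate nothing else.
        t.setUncaughtExceptionHandler((thread, ex) -> cause.set(ex));
        t.start();
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return cause.get();
    }

    public static void main(String[] args) {
        Throwable c = causeOfDeath(() -> {
            throw new OutOfMemoryError("simulated");
        });
        System.out.println("thread died with: " + c);
    }
}
```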

On 01/07/2014 12:42 AM, Peter Levart wrote:

On 01/07/2014 03:15 AM, srikalyan chandrashekar wrote:

Sure David will give that a try, we have so far attempted to
1. Print state data(as per the test creator peter.levart's inputs),


Hi Kalyan,

Have you been able to reproduce the OOME in that set-up? What was the 
result?


Regards, Peter


2. Use UEH(uncaught exception handler per Mandy's inputs)

--
Thanks
kalyan

On 1/6/14 4:40 PM, David Holmes wrote:

Back from vacation ...

On 20/12/2013 4:49 PM, David Holmes wrote:

On 20/12/2013 12:57 PM, srikalyan chandrashekar wrote:
Hi David Thanks for your comments, the unguarded part(clean and 
enqueue)
in the Reference Handler thread does not seem to create any new 
objects,
so it is the application(the test in this case) which is adding 
objects

to heap and causing the Reference Handler to die with OOME.


The ReferenceHandler thread can only get OOME if it allocates 
(directly
or indirectly) - so there has to be something in the unguarded part 
that

causes this. Again it may be an implicit action in the VM - similar to
the class load issue for InterruptedException.


Run a debug VM with -XX:+TraceExceptions to see where the OOME is 
triggered.


David
-


David

I am still

unsure about the side effects of the code change and agree with your
thoughts(on memory exhaustion test's reliability).

PS: hotspot dev alias removed from CC.

--
Thanks
kalyan

On 12/19/13 5:08 PM, David Holmes wrote:

Hi Kalyan,

This is not a hotspot issue so I'm moving this to core-libs, please
drop hotspot from any replies.

On 20/12/2013 6:26 AM, srikalyan wrote:

Hi all,  I have been working on the bug JDK-8022321
https://bugs.openjdk.java.net/browse/JDK-8022321 , this is a 
sporadic

failure and the webrev is available here
http://cr.openjdk.java.net/~srikchan/Regression/JDK-8022321_OOMEInReferenceHandler-webrev/ 






I'm really not sure what to make of this. We have a test that 
triggers

an out-of-memory condition but the OOME can actually turn up in the
ReferenceHandler thread causing it to terminate and the test to 
fail.
We previously accounted for the non-obvious occurrences of OOME 
due to
the Object.wait and the possible need to load the 
InterruptedException

class - but still the OOME can appear where we don't want it. So
finally you have just placed the whole for(;;) loop in a
try/catch(OOME) that ignores the OOME. I'm certain that makes the 
test

happy, but I'm not sure it is really what we want for the
ReferenceHandler thread. If the OOME occurs while cleaning, or
enqueuing then we will fail to clean and/or enqueue but there 
would be

no indication that has occurred and I think that is a bigger problem
than this test failing.

There may be no way to make this test 100% reliable. In fact I'd
suggest that no memory exhaustion test can be 100% reliable.

David


*Root Cause: Still not known*
2 places where there is a possibility for OOME
1) Cleaner.clean()
2) ReferenceQueue.enqueue()

1)  The cleanup code in turn has 2 places where there is 
potential for

throwing OOME,
 a) thunk Thread which is run from clean() method. This 
Runnable is

passed to Cleaner and appears in the following classes
 java/nio/DirectByteBuffer.java
 sun/misc/Perf.java
 sun/nio/fs/NativeBuffer.java
 sun/nio/ch/IOVecWrapper.java
 sun/misc/Cleaner/ExitOnThrow.java
However none of the above overridden implementations ever create an
object in the clean() code.
 b) new PrivilegedAction created in try catch Exception 
block of

clean() method but for this object to be created and to be held
responsible for OOME an Exception(other than OOME) has to be 
thrown.


2) No new heap objects are created in the enqueue method nor
anywhere in
the deep call stack (VM.addFinalRefCount() etc) so this cannot be a
potential cause.

*Experimental change to java.lang.Reference.java* :
- Put one more guard (try catch with OOME block) in the Reference
Handler Thread which may give the Reference Handler a chance to
cleanup.
This is fixing the test failure (several 1000 runs with 0 failures)
- Without the above change the test fails at least 3-5 times in every
1000 runs.

*PS*: The code change is to a very critical part of the JDK and I am not
fully aware of the consequences of the change, hence I am seeking expert
help here. I appreciate your time and input on this.











Re: Analysis on JDK-8022321 java/lang/ref/OOMEInReferenceHandler.java fails intermittently

2014-01-07 Thread srikalyan chandrashekar
Hi David, TraceExceptions with fastdebug build produced some nice trace 
http://cr.openjdk.java.net/%7Esrikchan/OOME_exception_trace.log . The 
native method wait(long) is where the OOME is being thrown, the deepest 
call is in


src/share/vm/gc_interface/collectedHeap.inline.hpp, line 157

--- Excerpt Begins -

147   if (!gc_overhead_limit_was_exceeded) {
148     // -XX:+HeapDumpOnOutOfMemoryError and -XX:OnOutOfMemoryError support
149     report_java_out_of_memory("Java heap space");
150
151     if (JvmtiExport::should_post_resource_exhausted()) {
152       JvmtiExport::post_resource_exhausted(
153         JVMTI_RESOURCE_EXHAUSTED_OOM_ERROR | JVMTI_RESOURCE_EXHAUSTED_JAVA_HEAP,
154         "Java heap space");
155     }
156
157     THROW_OOP_0(Universe::out_of_memory_error_java_heap());
158   } else {

--- Excerpt Ends -


Would be helpful if David/some one else in the team could explain the 
latent aspects/probable cause.


---
Thanks
kalyan

On 01/06/2014 04:40 PM, David Holmes wrote:

Back from vacation ...

On 20/12/2013 4:49 PM, David Holmes wrote:

On 20/12/2013 12:57 PM, srikalyan chandrashekar wrote:
Hi David Thanks for your comments, the unguarded part(clean and 
enqueue)
in the Reference Handler thread does not seem to create any new 
objects,

so it is the application(the test in this case) which is adding objects
to heap and causing the Reference Handler to die with OOME.


The ReferenceHandler thread can only get OOME if it allocates (directly
or indirectly) - so there has to be something in the unguarded part that
causes this. Again it may be an implicit action in the VM - similar to
the class load issue for InterruptedException.


Run a debug VM with -XX:+TraceExceptions to see where the OOME is 
triggered.


David
-


David

I am still

unsure about the side effects of the code change and agree with your
thoughts(on memory exhaustion test's reliability).

PS: hotspot dev alias removed from CC.

--
Thanks
kalyan

On 12/19/13 5:08 PM, David Holmes wrote:

Hi Kalyan,

This is not a hotspot issue so I'm moving this to core-libs, please
drop hotspot from any replies.

On 20/12/2013 6:26 AM, srikalyan wrote:

Hi all,  I have been working on the bug JDK-8022321
https://bugs.openjdk.java.net/browse/JDK-8022321 , this is a 
sporadic

failure and the webrev is available here
http://cr.openjdk.java.net/~srikchan/Regression/JDK-8022321_OOMEInReferenceHandler-webrev/ 






I'm really not sure what to make of this. We have a test that triggers
an out-of-memory condition but the OOME can actually turn up in the
ReferenceHandler thread causing it to terminate and the test to fail.
We previously accounted for the non-obvious occurrences of OOME due to
the Object.wait and the possible need to load the InterruptedException
class - but still the OOME can appear where we don't want it. So
finally you have just placed the whole for(;;) loop in a
try/catch(OOME) that ignores the OOME. I'm certain that makes the test
happy, but I'm not sure it is really what we want for the
ReferenceHandler thread. If the OOME occurs while cleaning, or
enqueuing then we will fail to clean and/or enqueue but there would be
no indication that has occurred and I think that is a bigger problem
than this test failing.

There may be no way to make this test 100% reliable. In fact I'd
suggest that no memory exhaustion test can be 100% reliable.

David
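
The concern about a catch that ignores the OOME can be made concrete with a generic sketch. This is not the webrev's change and not the ReferenceHandler code; SwallowedOomeSketch and runAll are hypothetical names, and the error is simulated. The loop survives a failing task, and the counter is the only evidence that work was abandoned. With an empty catch block even that evidence would be gone.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration of work lost behind a catch(OutOfMemoryError).
final class SwallowedOomeSketch {
    /** Runs each task; returns how many were abandoned because of an OutOfMemoryError. */
    static int runAll(List<Runnable> tasks) {
        int abandoned = 0;
        for (Runnable task : tasks) {
            try {
                task.run();
            } catch (OutOfMemoryError e) {
                // An empty catch here would keep the loop alive but hide the lost
                // work; counting is the minimal indication that a task was skipped.
                abandoned++;
            }
        }
        return abandoned;
    }

    public static void main(String[] args) {
        List<Runnable> tasks = Arrays.<Runnable>asList(
            () -> {},
            () -> { throw new OutOfMemoryError("simulated"); },
            () -> {});
        System.out.println("abandoned tasks: " + runAll(tasks));
    }
}
```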


*Root Cause: Still not known*
2 places where there is a possibility for OOME
1) Cleaner.clean()
2) ReferenceQueue.enqueue()

1)  The cleanup code in turn has 2 places where there is potential 
for

throwing OOME,
 a) thunk Thread which is run from clean() method. This 
Runnable is

passed to Cleaner and appears in the following classes
 java/nio/DirectByteBuffer.java
 sun/misc/Perf.java
 sun/nio/fs/NativeBuffer.java
 sun/nio/ch/IOVecWrapper.java
 sun/misc/Cleaner/ExitOnThrow.java
However none of the above overridden implementations ever create an
object in the clean() code.
 b) new PrivilegedAction created in try catch Exception block of
clean() method but for this object to be created and to be held
responsible for OOME an Exception(other than OOME) has to be thrown.

2) No new heap objects are created in the enqueue method nor
anywhere in
the deep call stack (VM.addFinalRefCount() etc) so this cannot be a
potential cause.

*Experimental change to java.lang.Reference.java* :
- Put one more guard (try catch with OOME block) in the Reference
Handler Thread which may give the Reference Handler a chance to
cleanup.
This is fixing the test failure (several 1000 runs with 0 failures)
- Without the above change the test fails at least 3-5 times in every
1000 runs.

*PS*: The code change is to a very critical part of the JDK and I am not
fully aware of the consequences of the change, hence I am seeking expert
help here. I appreciate your time and input on this.







Re: Analysis on JDK-8022321 java/lang/ref/OOMEInReferenceHandler.java fails intermittently

2014-01-07 Thread David Holmes

On 8/01/2014 7:33 AM, srikalyan chandrashekar wrote:

Hi David, TraceExceptions with fastdebug build produced some nice trace
http://cr.openjdk.java.net/%7Esrikchan/OOME_exception_trace.log . The
native method wait(long) is where the OOME is being thrown, the deepest
call is in

src/share/vm/gc_interface/collectedHeap.inline.hpp, line 157


Yes but it is the caller that is of interest:

Exception <a 'java/lang/OutOfMemoryError'> (0xd6a01840)
thrown [/HUDSON/workspace/8-2-build-linux-amd64/jdk8/1317/hotspot/src/share/vm/runtime/objectMonitor.cpp, line 1649]
for thread 0x7f78c40d2800
Exception <a 'java/lang/OutOfMemoryError'> (0xd6a01840)
 thrown in interpreter method <{method} {0x7f78b4800ae0} 'wait' '(J)V' in 'java/lang/Object'>
 at bci 0 for thread 0x7f78c40d2800

The ReferenceHandler thread gets the OOME trying to allocate the 
InterruptedException.


David
-


--- Excerpt Begins -

147  if (!gc_overhead_limit_was_exceeded) {
148    // -XX:+HeapDumpOnOutOfMemoryError and -XX:OnOutOfMemoryError support
149    report_java_out_of_memory("Java heap space");
150
151    if (JvmtiExport::should_post_resource_exhausted()) {
152      JvmtiExport::post_resource_exhausted(
153        JVMTI_RESOURCE_EXHAUSTED_OOM_ERROR | JVMTI_RESOURCE_EXHAUSTED_JAVA_HEAP,
154        "Java heap space");
155    }
156
157    THROW_OOP_0(Universe::out_of_memory_error_java_heap());
158  } else {

--- Excerpt Ends -


Would be helpful if David/some one else in the team could explain the
latent aspects/probable cause.

---
Thanks
kalyan

On 01/06/2014 04:40 PM, David Holmes wrote:

Back from vacation ...

On 20/12/2013 4:49 PM, David Holmes wrote:

On 20/12/2013 12:57 PM, srikalyan chandrashekar wrote:

Hi David, thanks for your comments. The unguarded part (clean and
enqueue) in the Reference Handler thread does not seem to create any new
objects, so it is the application (the test in this case) which is adding
objects to the heap and causing the Reference Handler to die with OOME.


The ReferenceHandler thread can only get OOME if it allocates (directly
or indirectly) - so there has to be something in the unguarded part that
causes this. Again it may be an implicit action in the VM - similar to
the class load issue for InterruptedException.


Run a debug VM with -XX:+TraceExceptions to see where the OOME is
triggered.

David
-


David

I am still

unsure about the side effects of the code change and agree with your
thoughts(on memory exhaustion test's reliability).

PS: hotspot dev alias removed from CC.

--
Thanks
kalyan

On 12/19/13 5:08 PM, David Holmes wrote:

Hi Kalyan,

This is not a hotspot issue so I'm moving this to core-libs, please
drop hotspot from any replies.

On 20/12/2013 6:26 AM, srikalyan wrote:

Hi all,  I have been working on the bug JDK-8022321
https://bugs.openjdk.java.net/browse/JDK-8022321 , this is a
sporadic
failure and the webrev is available here
http://cr.openjdk.java.net/~srikchan/Regression/JDK-8022321_OOMEInReferenceHandler-webrev/





I'm really not sure what to make of this. We have a test that triggers
an out-of-memory condition but the OOME can actually turn up in the
ReferenceHandler thread causing it to terminate and the test to fail.
We previously accounted for the non-obvious occurrences of OOME due to
the Object.wait and the possible need to load the InterruptedException
class - but still the OOME can appear where we don't want it. So
finally you have just placed the whole for(;;) loop in a
try/catch(OOME) that ignores the OOME. I'm certain that makes the test
happy, but I'm not sure it is really what we want for the
ReferenceHandler thread. If the OOME occurs while cleaning, or
enqueuing then we will fail to clean and/or enqueue but there would be
no indication that has occurred and I think that is a bigger problem
than this test failing.

There may be no way to make this test 100% reliable. In fact I'd
suggest that no memory exhaustion test can be 100% reliable.

David


*Root Cause: Still not known*
2 places where there is a possibility for OOME
1) Cleaner.clean()
2) ReferenceQueue.enqueue()

1) The cleanup code in turn has 2 places where there is potential for
throwing OOME,
 a) thunk Thread which is run from clean() method. This Runnable is
passed to Cleaner and appears in the following classes
 java/nio/DirectByteBuffer.java
 sun/misc/Perf.java
 sun/nio/fs/NativeBuffer.java
 sun/nio/ch/IOVecWrapper.java
 sun/misc/Cleaner/ExitOnThrow.java
However none of the above overridden implementations ever create an
object in the clean() code.
 b) new PrivilegedAction created in try catch Exception block of
clean() method but for this object to be created and to be held
responsible for OOME an Exception(other than OOME) has to be thrown.

2) No new heap objects are created in the enqueue method nor
anywhere in
the deep call stack (VM.addFinalRefCount() 

Re: Analysis on JDK-8022321 java/lang/ref/OOMEInReferenceHandler.java fails intermittently

2014-01-07 Thread David Holmes

On 8/01/2014 4:19 PM, David Holmes wrote:

On 8/01/2014 7:33 AM, srikalyan chandrashekar wrote:

Hi David, TraceExceptions with fastdebug build produced some nice trace
http://cr.openjdk.java.net/%7Esrikchan/OOME_exception_trace.log . The
native method wait(long) is where the OOME is being thrown, the deepest
call is in

src/share/vm/gc_interface/collectedHeap.inline.hpp, line 157


Yes but it is the caller that is of interest:

Exception <a 'java/lang/OutOfMemoryError'> (0xd6a01840)
thrown [/HUDSON/workspace/8-2-build-linux-amd64/jdk8/1317/hotspot/src/share/vm/runtime/objectMonitor.cpp, line 1649]
for thread 0x7f78c40d2800
Exception <a 'java/lang/OutOfMemoryError'> (0xd6a01840)
 thrown in interpreter method <{method} {0x7f78b4800ae0} 'wait' '(J)V' in 'java/lang/Object'>
 at bci 0 for thread 0x7f78c40d2800

The ReferenceHandler thread gets the OOME trying to allocate the
InterruptedException.


However we already have a catch block around the wait() so how is this 
OOME getting through? A bug in exception handling in the interpreter ??
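
For reference, the guard being discussed can be modelled outside the JDK. The following is only a sketch: WaitGuardSketch, Waiter and guardedWait are hypothetical names, and a lambda stands in for lock.wait(). It shows that, at the Java language level, an OutOfMemoryError raised inside the inner try is expected to land in the outer catch, which is why an OOME escaping from wait() is surprising.

```java
public class WaitGuardSketch {
    /** Hypothetical stand-in for lock.wait(): any action that may be interrupted. */
    interface Waiter {
        void await() throws InterruptedException;
    }

    /** Mirrors the nested guard around the wait and reports which path was taken. */
    static String guardedWait(Waiter waiter) {
        try {
            try {
                waiter.await();
                return "returned";
            } catch (InterruptedException x) {
                return "interrupted";
            }
        } catch (OutOfMemoryError x) {
            // Reached for any OOME raised while waiting, including one raised
            // while the InterruptedException itself is being created.
            return "oome";
        }
    }

    public static void main(String[] args) {
        System.out.println(guardedWait(() -> { }));
        System.out.println(guardedWait(() -> { throw new InterruptedException(); }));
        System.out.println(guardedWait(() -> { throw new OutOfMemoryError("simulated"); }));
    }
}
```

The sketch says nothing about what the interpreter actually does in the failing runs; it only pins down the behaviour the catch blocks are meant to provide.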


David


David
-


--- Excerpt Begins -

147  if (!gc_overhead_limit_was_exceeded) {
148    // -XX:+HeapDumpOnOutOfMemoryError and -XX:OnOutOfMemoryError support
149    report_java_out_of_memory("Java heap space");
150
151    if (JvmtiExport::should_post_resource_exhausted()) {
152      JvmtiExport::post_resource_exhausted(
153        JVMTI_RESOURCE_EXHAUSTED_OOM_ERROR | JVMTI_RESOURCE_EXHAUSTED_JAVA_HEAP,
154        "Java heap space");
155    }
156
157    THROW_OOP_0(Universe::out_of_memory_error_java_heap());
158  } else {

--- Excerpt Ends -


Would be helpful if David/some one else in the team could explain the
latent aspects/probable cause.

---
Thanks
kalyan

On 01/06/2014 04:40 PM, David Holmes wrote:

Back from vacation ...

On 20/12/2013 4:49 PM, David Holmes wrote:

On 20/12/2013 12:57 PM, srikalyan chandrashekar wrote:

Hi David, thanks for your comments. The unguarded part (clean and
enqueue) in the Reference Handler thread does not seem to create any new
objects, so it is the application (the test in this case) which is adding
objects to the heap and causing the Reference Handler to die with OOME.


The ReferenceHandler thread can only get OOME if it allocates (directly
or indirectly) - so there has to be something in the unguarded part that
causes this. Again it may be an implicit action in the VM - similar to
the class load issue for InterruptedException.


Run a debug VM with -XX:+TraceExceptions to see where the OOME is
triggered.

David
-


David

I am still

unsure about the side effects of the code change and agree with your
thoughts(on memory exhaustion test's reliability).

PS: hotspot dev alias removed from CC.

--
Thanks
kalyan

On 12/19/13 5:08 PM, David Holmes wrote:

Hi Kalyan,

This is not a hotspot issue so I'm moving this to core-libs, please
drop hotspot from any replies.

On 20/12/2013 6:26 AM, srikalyan wrote:

Hi all,  I have been working on the bug JDK-8022321
https://bugs.openjdk.java.net/browse/JDK-8022321 , this is a
sporadic
failure and the webrev is available here
http://cr.openjdk.java.net/~srikchan/Regression/JDK-8022321_OOMEInReferenceHandler-webrev/






I'm really not sure what to make of this. We have a test that triggers
an out-of-memory condition but the OOME can actually turn up in the
ReferenceHandler thread causing it to terminate and the test to fail.
We previously accounted for the non-obvious occurrences of OOME due to
the Object.wait and the possible need to load the InterruptedException
class - but still the OOME can appear where we don't want it. So
finally you have just placed the whole for(;;) loop in a
try/catch(OOME) that ignores the OOME. I'm certain that makes the test
happy, but I'm not sure it is really what we want for the
ReferenceHandler thread. If the OOME occurs while cleaning, or
enqueuing then we will fail to clean and/or enqueue but there would be
no indication that has occurred and I think that is a bigger problem
than this test failing.

There may be no way to make this test 100% reliable. In fact I'd
suggest that no memory exhaustion test can be 100% reliable.

David


*Root Cause: Still not known*
2 places where there is a possibility for OOME
1) Cleaner.clean()
2) ReferenceQueue.enqueue()

1) The cleanup code in turn has 2 places where there is potential for
throwing OOME,
 a) thunk Thread which is run from clean() method. This Runnable is
passed to Cleaner and appears in the following classes
 java/nio/DirectByteBuffer.java
 sun/misc/Perf.java
 sun/nio/fs/NativeBuffer.java
 sun/nio/ch/IOVecWrapper.java
 sun/misc/Cleaner/ExitOnThrow.java
However none of the above overridden implementations ever create an
object in the clean() code.
 b) new PrivilegedAction created in try catch Exception block of
clean() method but for this object to be created and to be 

Re: Analysis on JDK-8022321 java/lang/ref/OOMEInReferenceHandler.java fails intermittently

2014-01-06 Thread David Holmes

Back from vacation ...

On 20/12/2013 4:49 PM, David Holmes wrote:

On 20/12/2013 12:57 PM, srikalyan chandrashekar wrote:

Hi David Thanks for your comments, the unguarded part(clean and enqueue)
in the Reference Handler thread does not seem to create any new objects,
so it is the application(the test in this case) which is adding objects
to heap and causing the Reference Handler to die with OOME.


The ReferenceHandler thread can only get OOME if it allocates (directly
or indirectly) - so there has to be something in the unguarded part that
causes this. Again it may be an implicit action in the VM - similar to
the class load issue for InterruptedException.


Run a debug VM with -XX:+TraceExceptions to see where the OOME is triggered.

David
-


David

I am still

unsure about the side effects of the code change and agree with your
thoughts(on memory exhaustion test's reliability).

PS: hotspot dev alias removed from CC.

--
Thanks
kalyan

On 12/19/13 5:08 PM, David Holmes wrote:

Hi Kalyan,

This is not a hotspot issue so I'm moving this to core-libs, please
drop hotspot from any replies.

On 20/12/2013 6:26 AM, srikalyan wrote:

Hi all,  I have been working on the bug JDK-8022321
https://bugs.openjdk.java.net/browse/JDK-8022321 , this is a sporadic
failure and the webrev is available here
http://cr.openjdk.java.net/~srikchan/Regression/JDK-8022321_OOMEInReferenceHandler-webrev/




I'm really not sure what to make of this. We have a test that triggers
an out-of-memory condition but the OOME can actually turn up in the
ReferenceHandler thread causing it to terminate and the test to fail.
We previously accounted for the non-obvious occurrences of OOME due to
the Object.wait and the possible need to load the InterruptedException
class - but still the OOME can appear where we don't want it. So
finally you have just placed the whole for(;;) loop in a
try/catch(OOME) that ignores the OOME. I'm certain that makes the test
happy, but I'm not sure it is really what we want for the
ReferenceHandler thread. If the OOME occurs while cleaning, or
enqueuing then we will fail to clean and/or enqueue but there would be
no indication that has occurred and I think that is a bigger problem
than this test failing.
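
The concern about a silent catch can be illustrated with a small sketch. It is not the webrev and not a proposed fix: CountingGuardSketch, runSteps and swallowedOomes are hypothetical names, and an array of Runnables stands in for the clean/enqueue work. The only point is that a catch block can leave a trace without allocating, by bumping a primitive counter.

```java
public class CountingGuardSketch {
    // Hypothetical counter. It is a primitive field written by a single thread,
    // so updating it needs no allocation even when the heap is exhausted.
    static volatile int swallowedOomes;

    /** Runs each step; an OOME in one step is counted and the loop carries on. */
    static void runSteps(Runnable[] steps) {
        for (Runnable step : steps) {
            try {
                step.run();
            } catch (OutOfMemoryError e) {
                swallowedOomes = swallowedOomes + 1; // leave a trace instead of ignoring it
            }
        }
    }

    public static void main(String[] args) {
        int[] done = new int[1];
        runSteps(new Runnable[] {
            () -> { throw new OutOfMemoryError("simulated"); },
            () -> done[0]++
        });
        System.out.println("swallowed=" + swallowedOomes + ", later steps run=" + done[0]);
    }
}
```

A counter like this still does not recover the clean or enqueue that was skipped; it only makes the loss observable to whoever inspects it later.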

There may be no way to make this test 100% reliable. In fact I'd
suggest that no memory exhaustion test can be 100% reliable.

David


*Root Cause: Still not known*
2 places where there is a possibility for OOME
1) Cleaner.clean()
2) ReferenceQueue.enqueue()

1)  The cleanup code in turn has 2 places where there is potential for
throwing OOME,
 a) thunk Thread which is run from clean() method. This Runnable is
passed to Cleaner and appears in the following classes
 java/nio/DirectByteBuffer.java
 sun/misc/Perf.java
 sun/nio/fs/NativeBuffer.java
 sun/nio/ch/IOVecWrapper.java
 sun/misc/Cleaner/ExitOnThrow.java
However none of the above overridden implementations ever create an
object in the clean() code.
 b) new PrivilegedAction created in try catch Exception block of
clean() method but for this object to be created and to be held
responsible for OOME an Exception(other than OOME) has to be thrown.

2) No new heap objects are created in the enqueue method nor anywhere in
the deep call stack (VM.addFinalRefCount() etc) so this cannot be a
potential cause.

*Experimental change to java.lang.Reference.java* :
- Put one more guard (try catch with OOME block) in the Reference
Handler Thread which may give the Reference Handler a chance to
cleanup.
This is fixing the test failure (several 1000 runs with 0 failures)
- Without the above change the test fails at least 3-5 times for every
1000 runs.

*PS*: The code change is to a very critical part of the JDK and I am not
fully aware of the consequences of the change, hence seeking expert help
here. Appreciate your time and inputs towards this.





Re: Analysis on JDK-8022321 java/lang/ref/OOMEInReferenceHandler.java fails intermittently

2014-01-06 Thread srikalyan chandrashekar

Sure David, I will give that a try. We have so far attempted to
1. Print state data (as per the inputs of the test's creator, Peter Levart),
2. Use a UEH (uncaught exception handler, per Mandy's inputs)

--
Thanks
kalyan

On 1/6/14 4:40 PM, David Holmes wrote:

Back from vacation ...

On 20/12/2013 4:49 PM, David Holmes wrote:

On 20/12/2013 12:57 PM, srikalyan chandrashekar wrote:
Hi David, thanks for your comments. The unguarded part (clean and
enqueue) in the Reference Handler thread does not seem to create any new
objects, so it is the application (the test in this case) which is adding
objects to the heap and causing the Reference Handler to die with OOME.


The ReferenceHandler thread can only get OOME if it allocates (directly
or indirectly) - so there has to be something in the unguarded part that
causes this. Again it may be an implicit action in the VM - similar to
the class load issue for InterruptedException.


Run a debug VM with -XX:+TraceExceptions to see where the OOME is 
triggered.


David
-


David

I am still

unsure about the side effects of the code change and agree with your
thoughts(on memory exhaustion test's reliability).

PS: hotspot dev alias removed from CC.

--
Thanks
kalyan

On 12/19/13 5:08 PM, David Holmes wrote:

Hi Kalyan,

This is not a hotspot issue so I'm moving this to core-libs, please
drop hotspot from any replies.

On 20/12/2013 6:26 AM, srikalyan wrote:

Hi all,  I have been working on the bug JDK-8022321
https://bugs.openjdk.java.net/browse/JDK-8022321 , this is a sporadic
failure and the webrev is available here
http://cr.openjdk.java.net/~srikchan/Regression/JDK-8022321_OOMEInReferenceHandler-webrev/ 






I'm really not sure what to make of this. We have a test that triggers
an out-of-memory condition but the OOME can actually turn up in the
ReferenceHandler thread causing it to terminate and the test to fail.
We previously accounted for the non-obvious occurrences of OOME due to
the Object.wait and the possible need to load the InterruptedException
class - but still the OOME can appear where we don't want it. So
finally you have just placed the whole for(;;) loop in a
try/catch(OOME) that ignores the OOME. I'm certain that makes the test
happy, but I'm not sure it is really what we want for the
ReferenceHandler thread. If the OOME occurs while cleaning, or
enqueuing then we will fail to clean and/or enqueue but there would be
no indication that has occurred and I think that is a bigger problem
than this test failing.

There may be no way to make this test 100% reliable. In fact I'd
suggest that no memory exhaustion test can be 100% reliable.

David


*Root Cause: Still not known*
2 places where there is a possibility for OOME
1) Cleaner.clean()
2) ReferenceQueue.enqueue()

1) The cleanup code in turn has 2 places where there is potential for
throwing OOME,
 a) thunk Thread which is run from clean() method. This Runnable is
passed to Cleaner and appears in the following classes
 java/nio/DirectByteBuffer.java
 sun/misc/Perf.java
 sun/nio/fs/NativeBuffer.java
 sun/nio/ch/IOVecWrapper.java
 sun/misc/Cleaner/ExitOnThrow.java
However none of the above overridden implementations ever create an
object in the clean() code.
 b) new PrivilegedAction created in try catch Exception block of
clean() method but for this object to be created and to be held
responsible for OOME an Exception(other than OOME) has to be thrown.

2) No new heap objects are created in the enqueue method nor anywhere in
the deep call stack (VM.addFinalRefCount() etc) so this cannot be a
potential cause.

*Experimental change to java.lang.Reference.java* :
- Put one more guard (try catch with OOME block) in the Reference
Handler Thread which may give the Reference Handler a chance to
cleanup.
This is fixing the test failure (several 1000 runs with 0 failures)
- Without the above change the test fails at least 3-5 times for every
1000 runs.

*PS*: The code change is to a very critical part of the JDK and I am not
fully aware of the consequences of the change, hence seeking expert help
here. Appreciate your time and inputs towards this.







Re: Analysis on JDK-8022321 java/lang/ref/OOMEInReferenceHandler.java fails intermittently

2013-12-23 Thread srikalyan chandrashekar
Hi Mandy, after some trials I could simulate the failure again (now with
the UEH in place). However, the UEH cannot print enough details because
it also tries to allocate memory: Thread.getName() internally creates a
String object, and printStackTrace() creates a new WrappedPrintStream
object. See the following trace:


Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "Reference Handler"

ERROR: java.lang.Exception: Reference Handler thread died.
    at OOMEInReferenceHandler.main(OOMEInReferenceHandler.java:105)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:483)
    at com.sun.javatest.regtest.MainWrapper$MainThread.run(MainWrapper.java:94)
    at java.lang.Thread.run(Thread.java:744)


Meanwhile I am looking for a way to print something useful without
allocating any new memory.
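
One possible direction, sketched here under assumptions rather than taken from the test: a handler that only stores the thread and the throwable in fields and leaves all formatting to another thread, after memory has been released. RecordingHandlerSketch and RecordingUEH are hypothetical names, and the OOME is simulated by a worker thread instead of the real Reference Handler.

```java
public class RecordingHandlerSketch {
    /** Hypothetical handler: records what it sees and prints nothing itself. */
    static final class RecordingUEH implements Thread.UncaughtExceptionHandler {
        volatile Thread failedThread;
        volatile Throwable failure;

        @Override
        public void uncaughtException(Thread t, Throwable e) {
            // Plain reference stores: no String concatenation and no stream
            // wrappers, so nothing is allocated on this path.
            failedThread = t;
            failure = e;
        }
    }

    public static void main(String[] args) throws Exception {
        RecordingUEH ueh = new RecordingUEH();
        Thread worker = new Thread(() -> { throw new OutOfMemoryError("simulated"); }, "worker");
        worker.setUncaughtExceptionHandler(ueh);
        worker.start();
        worker.join();
        // Formatting happens here, on the observing thread, once the worker is gone.
        System.out.println(ueh.failedThread.getName() + " died with " + ueh.failure);
    }
}
```

In the real test the main thread would have to drop its own heap-filling references before reading the fields, otherwise the reporting code could hit the same OOME.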


---
Thanks
kalyan

On 12/20/2013 01:00 PM, srikalyan wrote:
Hi Mandy, yes I ran with JTreg to simulate the failure, i will try the 
UEH patch to see if it sheds some light and get back to you. Thanks 
for the direction :)


--
Thanks
kalyan
Ph: (408)-585-8040


On 12/19/13, 8:33 PM, Mandy Chung wrote:

Hi Srikalyan,

Maybe you can add an uncaught exception handler to see if you can get
any information.  I ran it 1000 times but was not able to duplicate
the failure.  Did you run it with jtreg (I didn't)?

Below is the patch to install a thread's uncaught handler that
you can take and try.

diff --git a/test/java/lang/ref/OOMEInReferenceHandler.java b/test/java/lang/ref/OOMEInReferenceHandler.java
--- a/test/java/lang/ref/OOMEInReferenceHandler.java
+++ b/test/java/lang/ref/OOMEInReferenceHandler.java
@@ -51,6 +51,14 @@
          return first;
      }
 
+    static class UEH implements Thread.UncaughtExceptionHandler {
+        public void uncaughtException(Thread t, Throwable e) {
+            System.err.println("ERROR: " + t.getName() + " exception " +
+                e.getMessage());
+            e.printStackTrace();
+        }
+    }
+
      public static void main(String[] args) throws Exception {
          // preinitialize the InterruptedException class so that the reference handler
          // does not die due to OOME when loading the class if it is the first use
@@ -77,6 +85,8 @@
              throw new IllegalStateException("Couldn't find Reference Handler thread.");
          }
 
+        referenceHandlerThread.setUncaughtExceptionHandler(new UEH());
+
          ReferenceQueue<Object> refQueue = new ReferenceQueue<>();
          Object referent = new Object();
          WeakReference<Object> weakRef = new WeakReference<>(referent, refQueue);


On 12/19/2013 6:57 PM, srikalyan chandrashekar wrote:
Hi David Thanks for your comments, the unguarded part(clean and 
enqueue) in the Reference Handler thread does not seem to create any 
new objects, so it is the application(the test in this case) which 
is adding objects to heap and causing the Reference Handler to die 
with OOME. I am still unsure about the side effects of the code 
change and agree with your thoughts(on memory exhaustion test's 
reliability).


PS: hotspot dev alias removed from CC.

--
Thanks
kalyan

On 12/19/13 5:08 PM, David Holmes wrote:

Hi Kalyan,

This is not a hotspot issue so I'm moving this to core-libs, please 
drop hotspot from any replies.


On 20/12/2013 6:26 AM, srikalyan wrote:

Hi all,  I have been working on the bug JDK-8022321
https://bugs.openjdk.java.net/browse/JDK-8022321 , this is a sporadic
failure and the webrev is available here
http://cr.openjdk.java.net/~srikchan/Regression/JDK-8022321_OOMEInReferenceHandler-webrev/ 



I'm really not sure what to make of this. We have a test that 
triggers an out-of-memory condition but the OOME can actually turn 
up in the ReferenceHandler thread causing it to terminate and the 
test to fail. We previously accounted for the non-obvious 
occurrences of OOME due to the Object.wait and the possible need to 
load the InterruptedException class - but still the OOME can appear 
where we don't want it. So finally you have just placed the whole 
for(;;) loop in a try/catch(OOME) that ignores the OOME. I'm 
certain that makes the test happy, but I'm not sure it is really 
what we want for the ReferenceHandler thread. If the OOME occurs 
while cleaning, or enqueuing then we will fail to clean and/or 
enqueue but there would be no indication that has occurred and I 
think that is a bigger problem than this test failing.


There may be no way to make this test 100% reliable. In fact I'd 
suggest that no memory exhaustion test can be 100% reliable.


David


*Root Cause: Still not known*
2 places where there is a possibility for OOME
1) Cleaner.clean()
2) 

Re: Analysis on JDK-8022321 java/lang/ref/OOMEInReferenceHandler.java fails intermittently

2013-12-23 Thread Mandy Chung

On 12/23/2013 2:02 PM, srikalyan chandrashekar wrote:
Hi Mandy, after some trials i could simulate the failure again (now 
with UEH in place), however the UEH now cannot print enough details as 
it also tries to allocate memory, when it does Thread.getName()(it 
internally creates a String object), printStackTrace() also creates 
new WrappedPrintStream object. See the following trace




That's what I later realized we might run into after suggesting the UEH:
no object can be allocated at this point.


It is worth trying Peter's suggestion to override the Reference class
with a modified, instrumented version and see what you get.


Mandy

Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "Reference Handler"

ERROR: java.lang.Exception: Reference Handler thread died.
    at OOMEInReferenceHandler.main(OOMEInReferenceHandler.java:105)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:483)
    at com.sun.javatest.regtest.MainWrapper$MainThread.run(MainWrapper.java:94)
    at java.lang.Thread.run(Thread.java:744)


Meanwhile I am looking for a way to print something useful without
allocating any new memory.


---
Thanks
kalyan

On 12/20/2013 01:00 PM, srikalyan wrote:
Hi Mandy, yes I ran with JTreg to simulate the failure, i will try 
the UEH patch to see if it sheds some light and get back to you. 
Thanks for the direction :)


--
Thanks
kalyan
Ph: (408)-585-8040


On 12/19/13, 8:33 PM, Mandy Chung wrote:

Hi Srikalyan,

Maybe you can add an uncaught exception handler to see if you can get
any information.  I ran it 1000 times but was not able to duplicate
the failure.  Did you run it with jtreg (I didn't)?

Below is the patch to install a thread's uncaught handler that
you can take and try.

diff --git a/test/java/lang/ref/OOMEInReferenceHandler.java b/test/java/lang/ref/OOMEInReferenceHandler.java
--- a/test/java/lang/ref/OOMEInReferenceHandler.java
+++ b/test/java/lang/ref/OOMEInReferenceHandler.java
@@ -51,6 +51,14 @@
          return first;
      }
 
+    static class UEH implements Thread.UncaughtExceptionHandler {
+        public void uncaughtException(Thread t, Throwable e) {
+            System.err.println("ERROR: " + t.getName() + " exception " +
+                e.getMessage());
+            e.printStackTrace();
+        }
+    }
+
      public static void main(String[] args) throws Exception {
          // preinitialize the InterruptedException class so that the reference handler
          // does not die due to OOME when loading the class if it is the first use
@@ -77,6 +85,8 @@
              throw new IllegalStateException("Couldn't find Reference Handler thread.");
          }
 
+        referenceHandlerThread.setUncaughtExceptionHandler(new UEH());
+
          ReferenceQueue<Object> refQueue = new ReferenceQueue<>();
          Object referent = new Object();
          WeakReference<Object> weakRef = new WeakReference<>(referent, refQueue);


On 12/19/2013 6:57 PM, srikalyan chandrashekar wrote:
Hi David Thanks for your comments, the unguarded part(clean and 
enqueue) in the Reference Handler thread does not seem to create 
any new objects, so it is the application(the test in this case) 
which is adding objects to heap and causing the Reference Handler 
to die with OOME. I am still unsure about the side effects of the 
code change and agree with your thoughts(on memory exhaustion 
test's reliability).


PS: hotspot dev alias removed from CC.

--
Thanks
kalyan

On 12/19/13 5:08 PM, David Holmes wrote:

Hi Kalyan,

This is not a hotspot issue so I'm moving this to core-libs, 
please drop hotspot from any replies.


On 20/12/2013 6:26 AM, srikalyan wrote:

Hi all,  I have been working on the bug JDK-8022321
https://bugs.openjdk.java.net/browse/JDK-8022321 , this is a sporadic
failure and the webrev is available here
http://cr.openjdk.java.net/~srikchan/Regression/JDK-8022321_OOMEInReferenceHandler-webrev/ 



I'm really not sure what to make of this. We have a test that 
triggers an out-of-memory condition but the OOME can actually turn 
up in the ReferenceHandler thread causing it to terminate and the 
test to fail. We previously accounted for the non-obvious 
occurrences of OOME due to the Object.wait and the possible need 
to load the InterruptedException class - but still the OOME can 
appear where we don't want it. So finally you have just placed the 
whole for(;;) loop in a try/catch(OOME) that ignores the OOME. I'm 
certain that makes the test happy, but I'm not sure it is really 
what we want for the ReferenceHandler thread. If the OOME occurs 
while cleaning, or enqueuing then we will fail to clean and/or 
enqueue but there would be no indication that has 

Re: Analysis on JDK-8022321 java/lang/ref/OOMEInReferenceHandler.java fails intermittently

2013-12-23 Thread Mandy Chung


On 12/21/2013 8:50 AM, Peter Levart wrote:
Is it possible to get the test output when it fails? It can fail in 
two different ways. I can't look at the bug (not authorized)...


You should be able to look at it now.  There isn't any other information 
besides OOME error.


Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "Reference Handler"
java.lang.Exception: Reference Handler thread died.
    at OOMEInReferenceHandler.main(OOMEInReferenceHandler.java:105)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:491)
    at com.sun.javatest.regtest.MainWrapper$MainThread.run(MainWrapper.java:94)
    at java.lang.Thread.run(Thread.java:724)

Mandy


Re: Analysis on JDK-8022321 java/lang/ref/OOMEInReferenceHandler.java fails intermittently

2013-12-21 Thread Peter Levart

Hi David,

Is it possible to get the test output when it fails? It can fail in two 
different ways. I can't look at the bug (not authorized)...



On 12/20/2013 10:54 AM, Chris Hegarty wrote:

On 20 Dec 2013, at 04:33, Mandy Chung mandy.ch...@oracle.com wrote:


Hi Srikalyan,

Maybe you can add an uncaught exception handler to see if you can get
any information.

+1. With this, at least the next time we see this failure we should have a 
better idea where the OOM is coming from.

-Chris.


We can try, but I think the VM already prints the stack-trace of the 
exception by default and as far as I remember, OOME thrown by VM is 
preallocated and does not contain a stack trace. So I suspect we'll see 
nothing more with the suggested UEH.


Is it possible to include in the test a modified version of the Reference
class that would be prepended to the boot classpath? For example, one
containing the following ReferenceHandler:



    private static class ReferenceHandler extends Thread {

        ReferenceHandler(ThreadGroup g, String name) {
            super(g, name);
        }

        private volatile int state;

        @Override
        public String toString() {
            return super.toString() + "[state=" + state + "]";
        }

        public void run() {
            for (;;) {
                state = 1;
                Reference<Object> r;
                state = 2;
                synchronized (lock) {
                    state = 3;
                    if (pending != null) {
                        state = 4;
                        r = pending;
                        state = 5;
                        pending = r.discovered;
                        state = 6;
                        r.discovered = null;
                        state = 7;
                    } else {
                        state = 8;
                        // The waiting on the lock may cause an OOME because it may try to allocate
                        // exception objects, so also catch OOME here to avoid silent exit of the
                        // reference handler thread.
                        //
                        // Explicitly define the order of the two exceptions we catch here
                        // when waiting for the lock.
                        //
                        // We do not want to try to potentially load the InterruptedException class
                        // (which would be done if this was its first use, and InterruptedException
                        // were checked first) in this situation.
                        //
                        // This may lead to the VM not ever trying to load the InterruptedException
                        // class again.
                        try {
                            state = 9;
                            try {
                                state = 10;
                                lock.wait();
                                state = 11;
                            } catch (InterruptedException x) { state = 12; }
                            state = 13;
                        } catch (OutOfMemoryError x) { state = 14; }
                        state = 15;
                        continue;
                    }
                    state = 16;
                }
                state = 17;

                // Fast path for cleaners
                if (r instanceof Cleaner) {
                    state = 18;
                    ((Cleaner)r).clean();
                    state = 19;
                    continue;
                }
                state = 20;

                ReferenceQueue<Object> q = (ReferenceQueue) r.queue;
                state = 21;
                if (q != ReferenceQueue.NULL) q.enqueue(r);
                state = 22;
            }
        }
    }




...then just include the toString of referenceHandlerThread instance as 
part of the exception message at the end of the test:


...
...
        // wait at most 10 seconds for success or failure
        for (int i = 0; i < 20; i++) {
            if (refQueue.poll() != null) {
                // Reference Handler thread still working - success
                return;
            }
            System.gc();
            Thread.sleep(500L); // wait a little to allow GC to do its work before allocating objects
            if (!referenceHandlerThread.isAlive()) {
                // Reference Handler thread died - failure
                throw new Exception("Reference Handler thread died. referenceHandlerThread: " + referenceHandlerThread);
            }
        }

        // no sure answer after 10 seconds
        throw new IllegalStateException("Reference Handler thread stuck. weakRef.get(): " + weakRef.get() +
                                        ", referenceHandlerThread: " + referenceHandlerThread);
    }


This might be safer than using UEH since at the time the 
UEH.uncaughtException() is called, the heap might still be full which 
would prevent printing the message. 
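
For anyone who wants to try the polling idea outside the test, here is a minimal, self-contained sketch. It is not the test itself: the class name RefHandlerProbe and the helpers findThread and awaitEnqueue are made up for illustration, and looking the thread up by the name "Reference Handler" through Thread.getAllStackTraces() is an assumption about how the VM exposes it.

```java
import java.lang.ref.ReferenceQueue;
import java.lang.ref.WeakReference;

class RefHandlerProbe {

    // Hypothetical helper: find a live thread by name, or return null.
    static Thread findThread(String name) {
        for (Thread t : Thread.getAllStackTraces().keySet()) {
            if (name.equals(t.getName())) {
                return t;
            }
        }
        return null;
    }

    // Returns true if a weak reference to a dropped referent shows up on its
    // queue within the given number of attempts.
    static boolean awaitEnqueue(int attempts, long sleepMillis) {
        ReferenceQueue<Object> refQueue = new ReferenceQueue<>();
        Object referent = new Object();
        WeakReference<Object> weakRef = new WeakReference<>(referent, refQueue);
        referent = null; // the referent is now only weakly reachable

        try {
            for (int i = 0; i < attempts; i++) {
                if (refQueue.poll() != null) {
                    return true;
                }
                System.gc(); // only a hint; collection is not guaranteed
                Thread.sleep(sleepMillis);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        // Last check; using weakRef here also keeps the reference object in use
        // for the whole loop.
        return refQueue.poll() == weakRef;
    }

    public static void main(String[] args) {
        Thread handler = findThread("Reference Handler");
        boolean enqueued = awaitEnqueue(20, 500L);
        // The thread's toString() goes into the message, as suggested above.
        System.out.println("enqueued=" + enqueued
                + ", referenceHandlerThread: " + handler);
    }
}
```

Since System.gc() is only a request, a false result from awaitEnqueue is not proof that the Reference Handler thread is dead, which is why the test checks isAlive() separately.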

Re: Analysis on JDK-8022321 java/lang/ref/OOMEInReferenceHandler.java fails intermittently

2013-12-20 Thread Chris Hegarty

On 20 Dec 2013, at 04:33, Mandy Chung mandy.ch...@oracle.com wrote:

 Hi Srikalyan,
 
 Maybe you can add an uncaught handler to see if you can get
 any information.  

+1. With this, at least the next time we see this failure we should have a 
better idea where the OOM is coming from.

-Chris.

 I ran it 1000 times but was not able to duplicate
 the failure.  Did you run it with jtreg (I didn't)?
 
 Below is the patch to install a thread's uncaught handler that
 you can take and try.
 
 diff --git a/test/java/lang/ref/OOMEInReferenceHandler.java b/test/java/lang/ref/OOMEInReferenceHandler.java
 --- a/test/java/lang/ref/OOMEInReferenceHandler.java
 +++ b/test/java/lang/ref/OOMEInReferenceHandler.java
 @@ -51,6 +51,14 @@
          return first;
      }
 
 +    static class UEH implements Thread.UncaughtExceptionHandler {
 +        public void uncaughtException(Thread t, Throwable e) {
 +            System.err.println("ERROR: " + t.getName() + " exception " +
 +                e.getMessage());
 +            e.printStackTrace();
 +        }
 +    }
 +
      public static void main(String[] args) throws Exception {
          // preinitialize the InterruptedException class so that the reference handler
          // does not die due to OOME when loading the class if it is the first use
 @@ -77,6 +85,8 @@
              throw new IllegalStateException("Couldn't find Reference Handler thread.");
          }
 
 +        referenceHandlerThread.setUncaughtExceptionHandler(new UEH());
 +
          ReferenceQueue<Object> refQueue = new ReferenceQueue<>();
          Object referent = new Object();
          WeakReference<Object> weakRef = new WeakReference<>(referent, refQueue);
 
 On 12/19/2013 6:57 PM, srikalyan chandrashekar wrote:
 Hi David Thanks for your comments, the unguarded part(clean and enqueue) in 
 the Reference Handler thread does not seem to create any new objects, so it 
 is the application(the test in this case) which is adding objects to heap 
 and causing the Reference Handler to die with OOME. I am still unsure about 
 the side effects of the code change and agree with your thoughts(on memory 
 exhaustion test's reliability).
 
 PS: hotspot dev alias removed from CC.
 
 -- 
 Thanks
 kalyan
 
 On 12/19/13 5:08 PM, David Holmes wrote:
 Hi Kalyan,
 
 This is not a hotspot issue so I'm moving this to core-libs, please drop 
 hotspot from any replies.
 
 On 20/12/2013 6:26 AM, srikalyan wrote:
 Hi all,  I have been working on the bug JDK-8022321
 https://bugs.openjdk.java.net/browse/JDK-8022321 , this is a sporadic
 failure and the webrev is available here
 http://cr.openjdk.java.net/~srikchan/Regression/JDK-8022321_OOMEInReferenceHandler-webrev/
  
 
 I'm really not sure what to make of this. We have a test that triggers an 
 out-of-memory condition but the OOME can actually turn up in the 
 ReferenceHandler thread causing it to terminate and the test to fail. We 
 previously accounted for the non-obvious occurrences of OOME due to the 
 Object.wait and the possible need to load the InterruptedException class - 
 but still the OOME can appear where we don't want it. So finally you have 
 just placed the whole for(;;) loop in a try/catch(OOME) that ignores the 
 OOME. I'm certain that makes the test happy, but I'm not sure it is really 
 what we want for the ReferenceHandler thread. If the OOME occurs while 
 cleaning, or enqueuing then we will fail to clean and/or enqueue but there 
 would be no indication that has occurred and I think that is a bigger 
 problem than this test failing.
 
 There may be no way to make this test 100% reliable. In fact I'd suggest 
 that no memory exhaustion test can be 100% reliable.
 
 David
 
 *
 **Root Cause:Still not known*
 2 places where there is a possibility for OOME
 1) Cleaner.clean()
 2) ReferenceQueue.enqueue()
 
 1)  The cleanup code in turn has 2 places where there is potential for
 throwing OOME,
 a) thunk Thread which is run from clean() method. This Runnable is
 passed to Cleaner and appears in the following classes
 java/nio/DirectByteBuffer.java
 sun/misc/Perf.java
 sun/nio/fs/NativeBuffer.java
 sun/nio/ch/IOVecWrapper.java
 sun/misc/Cleaner/ExitOnThrow.java
 However none of the above overridden implementations ever create an
 object in the clean() code.
 b) new PrivilegedAction created in try catch Exception block of
 clean() method but for this object to be created and to be held
 responsible for OOME an Exception(other than OOME) has to be thrown.
 
 2) No new heap objects are created in the enqueue method nor anywhere in
 the deep call stack (VM.addFinalRefCount() etc) so this cannot be a
 potential cause.
 
 *Experimental change to java.lang.Reference.java* :
 - Put one more guard (try catch with OOME block) in the Reference
 Handler Thread which may give the Reference Handler a chance to cleanup.
 This is fixing the test failure (several 1000 runs with 0 failures)
 - Without the above change 

Re: Analysis on JDK-8022321 java/lang/ref/OOMEInReferenceHandler.java fails intermittently

2013-12-20 Thread srikalyan
Hi Mandy, yes, I ran with jtreg to reproduce the failure. I will try the 
UEH patch to see if it sheds some light and get back to you. Thanks for 
the direction :)


--
Thanks
kalyan
Ph: (408)-585-8040


On 12/19/13, 8:33 PM, Mandy Chung wrote:

Hi Srikalyan,

Maybe you can add an uncaught handler to see if you can get
any information.  I ran it 1000 times but was not able to duplicate
the failure.  Did you run it with jtreg (I didn't)?

Below is the patch to install a thread's uncaught handler that
you can take and try.

diff --git a/test/java/lang/ref/OOMEInReferenceHandler.java b/test/java/lang/ref/OOMEInReferenceHandler.java
--- a/test/java/lang/ref/OOMEInReferenceHandler.java
+++ b/test/java/lang/ref/OOMEInReferenceHandler.java
@@ -51,6 +51,14 @@
         return first;
     }

+    static class UEH implements Thread.UncaughtExceptionHandler {
+        public void uncaughtException(Thread t, Throwable e) {
+            System.err.println("ERROR: " + t.getName() + " exception " +
+                e.getMessage());
+            e.printStackTrace();
+        }
+    }
+
     public static void main(String[] args) throws Exception {
         // preinitialize the InterruptedException class so that the reference handler
         // does not die due to OOME when loading the class if it is the first use
@@ -77,6 +85,8 @@
             throw new IllegalStateException("Couldn't find Reference Handler thread.");
         }

+        referenceHandlerThread.setUncaughtExceptionHandler(new UEH());
+
         ReferenceQueue<Object> refQueue = new ReferenceQueue<>();
         Object referent = new Object();
         WeakReference<Object> weakRef = new WeakReference<>(referent, refQueue);


On 12/19/2013 6:57 PM, srikalyan chandrashekar wrote:
Hi David Thanks for your comments, the unguarded part(clean and 
enqueue) in the Reference Handler thread does not seem to create any 
new objects, so it is the application(the test in this case) which is 
adding objects to heap and causing the Reference Handler to die with 
OOME. I am still unsure about the side effects of the code change and 
agree with your thoughts(on memory exhaustion test's reliability).


PS: hotspot dev alias removed from CC.

--
Thanks
kalyan

On 12/19/13 5:08 PM, David Holmes wrote:

Hi Kalyan,

This is not a hotspot issue so I'm moving this to core-libs, please 
drop hotspot from any replies.


On 20/12/2013 6:26 AM, srikalyan wrote:

Hi all,  I have been working on the bug JDK-8022321
https://bugs.openjdk.java.net/browse/JDK-8022321 , this is a sporadic
failure and the webrev is available here
http://cr.openjdk.java.net/~srikchan/Regression/JDK-8022321_OOMEInReferenceHandler-webrev/ 



I'm really not sure what to make of this. We have a test that 
triggers an out-of-memory condition but the OOME can actually turn 
up in the ReferenceHandler thread causing it to terminate and the 
test to fail. We previously accounted for the non-obvious 
occurrences of OOME due to the Object.wait and the possible need to 
load the InterruptedException class - but still the OOME can appear 
where we don't want it. So finally you have just placed the whole 
for(;;) loop in a try/catch(OOME) that ignores the OOME. I'm certain 
that makes the test happy, but I'm not sure it is really what we 
want for the ReferenceHandler thread. If the OOME occurs while 
cleaning, or enqueuing then we will fail to clean and/or enqueue but 
there would be no indication that has occurred and I think that is a 
bigger problem than this test failing.


There may be no way to make this test 100% reliable. In fact I'd 
suggest that no memory exhaustion test can be 100% reliable.


David


*
**Root Cause:Still not known*
2 places where there is a possibility for OOME
1) Cleaner.clean()
2) ReferenceQueue.enqueue()

1)  The cleanup code in turn has 2 places where there is potential for
throwing OOME,
 a) thunk Thread which is run from clean() method. This Runnable is
passed to Cleaner and appears in the following classes
 java/nio/DirectByteBuffer.java
 sun/misc/Perf.java
 sun/nio/fs/NativeBuffer.java
 sun/nio/ch/IOVecWrapper.java
 sun/misc/Cleaner/ExitOnThrow.java
However none of the above overridden implementations ever create an
object in the clean() code.
 b) new PrivilegedAction created in try catch Exception block of
clean() method but for this object to be created and to be held
responsible for OOME an Exception(other than OOME) has to be thrown.

2) No new heap objects are created in the enqueue method nor anywhere in
the deep call stack (VM.addFinalRefCount() etc) so this cannot be a
potential cause.

*Experimental change to java.lang.Reference.java* :
- Put one more guard (try catch with OOME block) in the Reference
Handler Thread which may give the Reference Handler a chance to cleanup.
This is fixing the test failure (several 1000 runs with 0 failures)
- Without the above change the test fails atleast 3-5 times 

Re: Analysis on JDK-8022321 java/lang/ref/OOMEInReferenceHandler.java fails intermittently

2013-12-19 Thread David Holmes

Hi Kalyan,

This is not a hotspot issue so I'm moving this to core-libs, please drop 
hotspot from any replies.


On 20/12/2013 6:26 AM, srikalyan wrote:

Hi all,  I have been working on the bug JDK-8022321
https://bugs.openjdk.java.net/browse/JDK-8022321 , this is a sporadic
failure and the webrev is available here
http://cr.openjdk.java.net/~srikchan/Regression/JDK-8022321_OOMEInReferenceHandler-webrev/


I'm really not sure what to make of this. We have a test that triggers 
an out-of-memory condition but the OOME can actually turn up in the 
ReferenceHandler thread causing it to terminate and the test to fail. We 
previously accounted for the non-obvious occurrences of OOME due to the 
Object.wait and the possible need to load the InterruptedException class 
- but still the OOME can appear where we don't want it. So finally you 
have just placed the whole for(;;) loop in a try/catch(OOME) that 
ignores the OOME. I'm certain that makes the test happy, but I'm not 
sure it is really what we want for the ReferenceHandler thread. If the 
OOME occurs while cleaning, or enqueuing then we will fail to clean 
and/or enqueue but there would be no indication that has occurred and I 
think that is a bigger problem than this test failing.


There may be no way to make this test 100% reliable. In fact I'd suggest 
that no memory exhaustion test can be 100% reliable.


David


*Root Cause: Still not known*
2 places where there is a possibility for OOME
1) Cleaner.clean()
2) ReferenceQueue.enqueue()

1)  The cleanup code in turn has 2 places where there is potential for
throwing OOME,
 a) thunk Thread which is run from clean() method. This Runnable is
passed to Cleaner and appears in the following classes
 java/nio/DirectByteBuffer.java
 sun/misc/Perf.java
 sun/nio/fs/NativeBuffer.java
 sun/nio/ch/IOVecWrapper.java
 sun/misc/Cleaner/ExitOnThrow.java
However none of the above overridden implementations ever create an
object in the clean() code.
 b) new PrivilegedAction created in try catch Exception block of
clean() method but for this object to be created and to be held
responsible for OOME an Exception(other than OOME) has to be thrown.

2) No new heap objects are created in the enqueue method nor anywhere in
the deep call stack (VM.addFinalRefCount() etc) so this cannot be a
potential cause.

*Experimental change to java.lang.Reference.java* :
- Put one more guard (try catch with OOME block) in the Reference
Handler Thread which may give the Reference Handler a chance to cleanup.
This is fixing the test failure (several 1000 runs with 0 failures)
- Without the above change the test fails at least 3-5 times for every
1000 runs.

*PS*: The code change is to a very critical part of the JDK and I am not
fully aware of the consequences of the change, hence seeking expert help
here. I appreciate your time and input on this.
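
To make the trade-off concrete, here is a small sketch of a processing loop guarded against OOME that also records each skipped item, so a failure to clean or enqueue leaves some trace. This is only an illustration, not the change in the webrev: the class GuardedLoop, the method processAll and the skipped counter are hypothetical names, and the Runnable tasks stand in for the clean and enqueue steps.

```java
import java.util.concurrent.atomic.AtomicLong;

class GuardedLoop {

    // Allocated up front so that recording a failure does not itself allocate.
    static final AtomicLong skipped = new AtomicLong();

    // Runs every task. An OutOfMemoryError in one task is counted and the
    // loop moves on, instead of letting the calling thread die.
    static int processAll(Runnable[] tasks) {
        int completed = 0;
        for (Runnable task : tasks) {
            try {
                task.run();
                completed++;
            } catch (OutOfMemoryError oome) {
                // The work for this task is lost; the counter is the only
                // indication that this happened.
                skipped.incrementAndGet();
            }
        }
        return completed;
    }

    public static void main(String[] args) {
        Runnable ok = () -> { };
        Runnable failing = () -> { throw new OutOfMemoryError("simulated"); };
        int done = processAll(new Runnable[] { ok, failing, ok });
        System.out.println("completed=" + done + ", skipped=" + skipped.get());
    }
}
```

A counter does not recover the lost work; it only addresses the concern that a swallowed OOME would otherwise go unnoticed.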



Re: Analysis on JDK-8022321 java/lang/ref/OOMEInReferenceHandler.java fails intermittently

2013-12-19 Thread srikalyan chandrashekar
Hi David, thanks for your comments. The unguarded part (clean and enqueue) 
in the Reference Handler thread does not seem to create any new objects, 
so it is the application (the test in this case) that is adding objects 
to the heap and causing the Reference Handler to die with OOME. I am still 
unsure about the side effects of the code change and agree with your 
thoughts on the reliability of memory exhaustion tests.


PS: hotspot dev alias removed from CC.

--
Thanks
kalyan

On 12/19/13 5:08 PM, David Holmes wrote:

Hi Kalyan,

This is not a hotspot issue so I'm moving this to core-libs, please 
drop hotspot from any replies.


On 20/12/2013 6:26 AM, srikalyan wrote:

Hi all,  I have been working on the bug JDK-8022321
https://bugs.openjdk.java.net/browse/JDK-8022321 , this is a sporadic
failure and the webrev is available here
http://cr.openjdk.java.net/~srikchan/Regression/JDK-8022321_OOMEInReferenceHandler-webrev/ 



I'm really not sure what to make of this. We have a test that triggers 
an out-of-memory condition but the OOME can actually turn up in the 
ReferenceHandler thread causing it to terminate and the test to fail. 
We previously accounted for the non-obvious occurrences of OOME due to 
the Object.wait and the possible need to load the InterruptedException 
class - but still the OOME can appear where we don't want it. So 
finally you have just placed the whole for(;;) loop in a 
try/catch(OOME) that ignores the OOME. I'm certain that makes the test 
happy, but I'm not sure it is really what we want for the 
ReferenceHandler thread. If the OOME occurs while cleaning, or 
enqueuing then we will fail to clean and/or enqueue but there would be 
no indication that has occurred and I think that is a bigger problem 
than this test failing.


There may be no way to make this test 100% reliable. In fact I'd 
suggest that no memory exhaustion test can be 100% reliable.


David


*
**Root Cause:Still not known*
2 places where there is a possibility for OOME
1) Cleaner.clean()
2) ReferenceQueue.enqueue()

1)  The cleanup code in turn has 2 places where there is potential for
throwing OOME,
 a) thunk Thread which is run from clean() method. This Runnable is
passed to Cleaner and appears in the following classes
 java/nio/DirectByteBuffer.java
 sun/misc/Perf.java
 sun/nio/fs/NativeBuffer.java
 sun/nio/ch/IOVecWrapper.java
 sun/misc/Cleaner/ExitOnThrow.java
However none of the above overridden implementations ever create an
object in the clean() code.
 b) new PrivilegedAction created in try catch Exception block of
clean() method but for this object to be created and to be held
responsible for OOME an Exception(other than OOME) has to be thrown.

2) No new heap objects are created in the enqueue method nor anywhere in
the deep call stack (VM.addFinalRefCount() etc) so this cannot be a
potential cause.

*Experimental change to java.lang.Reference.java* :
- Put one more guard (try catch with OOME block) in the Reference
Handler Thread which may give the Reference Handler a chance to cleanup.
This is fixing the test failure (several 1000 runs with 0 failures)
- Without the above change the test fails atleast 3-5 times for every
1000 run.

*PS*: The code change is to a very critical part of JDK and i am fully
not aware of the consequences of the change, hence seeking expert help
here. Appreciate your time and inputs towards this.





Re: Analysis on JDK-8022321 java/lang/ref/OOMEInReferenceHandler.java fails intermittently

2013-12-19 Thread Mandy Chung

Hi Srikalyan,

Maybe you can add an uncaught handler to see if you can get
any information.  I ran it 1000 times but was not able to duplicate
the failure.  Did you run it with jtreg (I didn't)?

Below is the patch to install a thread's uncaught handler that
you can take and try.

diff --git a/test/java/lang/ref/OOMEInReferenceHandler.java b/test/java/lang/ref/OOMEInReferenceHandler.java
--- a/test/java/lang/ref/OOMEInReferenceHandler.java
+++ b/test/java/lang/ref/OOMEInReferenceHandler.java
@@ -51,6 +51,14 @@
         return first;
     }

+    static class UEH implements Thread.UncaughtExceptionHandler {
+        public void uncaughtException(Thread t, Throwable e) {
+            System.err.println("ERROR: " + t.getName() + " exception " +
+                e.getMessage());
+            e.printStackTrace();
+        }
+    }
+
     public static void main(String[] args) throws Exception {
         // preinitialize the InterruptedException class so that the reference handler
         // does not die due to OOME when loading the class if it is the first use
@@ -77,6 +85,8 @@
             throw new IllegalStateException("Couldn't find Reference Handler thread.");
         }

+        referenceHandlerThread.setUncaughtExceptionHandler(new UEH());
+
         ReferenceQueue<Object> refQueue = new ReferenceQueue<>();
         Object referent = new Object();
         WeakReference<Object> weakRef = new WeakReference<>(referent, refQueue);
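
The same handler idea can be tried on an ordinary thread first. The sketch below is a standalone illustration rather than part of the patch: the class UehDemo, the RecordingUEH handler and the runAndCapture helper are hypothetical names. The handler stores the throwable in a field instead of building a message string, on the assumption that string concatenation could itself fail when the heap is full.

```java
class UehDemo {

    // Written by the dying thread, read after join().
    static volatile Throwable lastUncaught;

    static class RecordingUEH implements Thread.UncaughtExceptionHandler {
        @Override
        public void uncaughtException(Thread t, Throwable e) {
            // A plain field store; no allocation is needed here.
            lastUncaught = e;
        }
    }

    // Runs the body on a new thread and returns whatever escaped from it,
    // or null if the body finished normally.
    static Throwable runAndCapture(Runnable body) {
        lastUncaught = null;
        Thread worker = new Thread(body, "worker");
        worker.setUncaughtExceptionHandler(new RecordingUEH());
        worker.start();
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return lastUncaught;
    }

    public static void main(String[] args) {
        Throwable t = runAndCapture(() -> {
            throw new IllegalStateException("simulated failure");
        });
        System.err.println("uncaught: " + t);
    }
}
```

The handler runs on the failing thread before it terminates, so the captured throwable is visible once join() returns.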

On 12/19/2013 6:57 PM, srikalyan chandrashekar wrote:
Hi David Thanks for your comments, the unguarded part(clean and 
enqueue) in the Reference Handler thread does not seem to create any 
new objects, so it is the application(the test in this case) which is 
adding objects to heap and causing the Reference Handler to die with 
OOME. I am still unsure about the side effects of the code change and 
agree with your thoughts(on memory exhaustion test's reliability).


PS: hotspot dev alias removed from CC.

--
Thanks
kalyan

On 12/19/13 5:08 PM, David Holmes wrote:

Hi Kalyan,

This is not a hotspot issue so I'm moving this to core-libs, please 
drop hotspot from any replies.


On 20/12/2013 6:26 AM, srikalyan wrote:

Hi all,  I have been working on the bug JDK-8022321
https://bugs.openjdk.java.net/browse/JDK-8022321 , this is a sporadic
failure and the webrev is available here
http://cr.openjdk.java.net/~srikchan/Regression/JDK-8022321_OOMEInReferenceHandler-webrev/ 



I'm really not sure what to make of this. We have a test that 
triggers an out-of-memory condition but the OOME can actually turn up 
in the ReferenceHandler thread causing it to terminate and the test 
to fail. We previously accounted for the non-obvious occurrences of 
OOME due to the Object.wait and the possible need to load the 
InterruptedException class - but still the OOME can appear where we 
don't want it. So finally you have just placed the whole for(;;) loop 
in a try/catch(OOME) that ignores the OOME. I'm certain that makes 
the test happy, but I'm not sure it is really what we want for the 
ReferenceHandler thread. If the OOME occurs while cleaning, or 
enqueuing then we will fail to clean and/or enqueue but there would 
be no indication that has occurred and I think that is a bigger 
problem than this test failing.


There may be no way to make this test 100% reliable. In fact I'd 
suggest that no memory exhaustion test can be 100% reliable.


David


*
**Root Cause:Still not known*
2 places where there is a possibility for OOME
1) Cleaner.clean()
2) ReferenceQueue.enqueue()

1)  The cleanup code in turn has 2 places where there is potential for
throwing OOME,
 a) thunk Thread which is run from clean() method. This Runnable is
passed to Cleaner and appears in the following classes
 java/nio/DirectByteBuffer.java
 sun/misc/Perf.java
 sun/nio/fs/NativeBuffer.java
 sun/nio/ch/IOVecWrapper.java
 sun/misc/Cleaner/ExitOnThrow.java
However none of the above overridden implementations ever create an
object in the clean() code.
 b) new PrivilegedAction created in try catch Exception block of
clean() method but for this object to be created and to be held
responsible for OOME an Exception(other than OOME) has to be thrown.

2) No new heap objects are created in the enqueue method nor anywhere in
the deep call stack (VM.addFinalRefCount() etc) so this cannot be a
potential cause.

*Experimental change to java.lang.Reference.java* :
- Put one more guard (try catch with OOME block) in the Reference
Handler Thread which may give the Reference Handler a chance to cleanup.
This is fixing the test failure (several 1000 runs with 0 failures)
- Without the above change the test fails atleast 3-5 times for every
1000 run.

*PS*: The code change is to a very critical part of JDK and i am fully
not aware of the consequences of the change, hence seeking expert help
here. Appreciate your time and inputs towards this.







Re: Analysis on JDK-8022321 java/lang/ref/OOMEInReferenceHandler.java fails intermittently

2013-12-19 Thread David Holmes

On 20/12/2013 12:57 PM, srikalyan chandrashekar wrote:

Hi David Thanks for your comments, the unguarded part(clean and enqueue)
in the Reference Handler thread does not seem to create any new objects,
so it is the application(the test in this case) which is adding objects
to heap and causing the Reference Handler to die with OOME.


The ReferenceHandler thread can only get OOME if it allocates (directly 
or indirectly) - so there has to be something in the unguarded part that 
causes this. Again it may be an implicit action in the VM - similar to 
the class load issue for InterruptedException.
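
One way to take implicit class loading out of the picture is to load and initialize the classes a critical thread may need before memory gets tight. The following is a sketch under that assumption only: the class Preload and its preload helper are hypothetical, and this is not necessarily how the test or the JDK does it.

```java
class Preload {

    // Forces loading and initialization of the named classes up front, so a
    // first use later on does not have to allocate for class loading.
    // Returns false if any of the names cannot be resolved.
    static boolean preload(String... classNames) {
        try {
            for (String name : classNames) {
                Class.forName(name, true, Preload.class.getClassLoader());
            }
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        boolean ok = preload("java.lang.InterruptedException");
        System.out.println("preloaded=" + ok);
    }
}
```

This only covers classes that are known in advance; it does nothing for other allocations the VM may perform on the thread's behalf.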


David

I am still unsure about the side effects of the code change and agree with
your thoughts(on memory exhaustion test's reliability).

PS: hotspot dev alias removed from CC.

--
Thanks
kalyan

On 12/19/13 5:08 PM, David Holmes wrote:

Hi Kalyan,

This is not a hotspot issue so I'm moving this to core-libs, please
drop hotspot from any replies.

On 20/12/2013 6:26 AM, srikalyan wrote:

Hi all,  I have been working on the bug JDK-8022321
https://bugs.openjdk.java.net/browse/JDK-8022321 , this is a sporadic
failure and the webrev is available here
http://cr.openjdk.java.net/~srikchan/Regression/JDK-8022321_OOMEInReferenceHandler-webrev/



I'm really not sure what to make of this. We have a test that triggers
an out-of-memory condition but the OOME can actually turn up in the
ReferenceHandler thread causing it to terminate and the test to fail.
We previously accounted for the non-obvious occurrences of OOME due to
the Object.wait and the possible need to load the InterruptedException
class - but still the OOME can appear where we don't want it. So
finally you have just placed the whole for(;;) loop in a
try/catch(OOME) that ignores the OOME. I'm certain that makes the test
happy, but I'm not sure it is really what we want for the
ReferenceHandler thread. If the OOME occurs while cleaning, or
enqueuing then we will fail to clean and/or enqueue but there would be
no indication that has occurred and I think that is a bigger problem
than this test failing.

There may be no way to make this test 100% reliable. In fact I'd
suggest that no memory exhaustion test can be 100% reliable.

David


*
**Root Cause:Still not known*
2 places where there is a possibility for OOME
1) Cleaner.clean()
2) ReferenceQueue.enqueue()

1)  The cleanup code in turn has 2 places where there is potential for
throwing OOME,
 a) thunk Thread which is run from clean() method. This Runnable is
passed to Cleaner and appears in the following classes
 java/nio/DirectByteBuffer.java
 sun/misc/Perf.java
 sun/nio/fs/NativeBuffer.java
 sun/nio/ch/IOVecWrapper.java
 sun/misc/Cleaner/ExitOnThrow.java
However none of the above overridden implementations ever create an
object in the clean() code.
 b) new PrivilegedAction created in try catch Exception block of
clean() method but for this object to be created and to be held
responsible for OOME an Exception(other than OOME) has to be thrown.

2) No new heap objects are created in the enqueue method nor anywhere in
the deep call stack (VM.addFinalRefCount() etc) so this cannot be a
potential cause.

*Experimental change to java.lang.Reference.java* :
- Put one more guard (try catch with OOME block) in the Reference
Handler Thread which may give the Reference Handler a chance to cleanup.
This is fixing the test failure (several 1000 runs with 0 failures)
- Without the above change the test fails atleast 3-5 times for every
1000 run.

*PS*: The code change is to a very critical part of JDK and i am fully
not aware of the consequences of the change, hence seeking expert help
here. Appreciate your time and inputs towards this.