Jeremy,

Closing the loop on this part of the thread. I don't think there are
any more leaks once the fix is applied. Here are the entries I added
to the "public comments" section of the bug. For some reason that bug
is _still_ not showing up on the OpenJDK site.

=== *Public Comments* ========================================================
I ran the AnonLoggerWeakRefLeak test with the following env variables:

TESTJAVA=c:/work/shared/mirrors/jdks-win32/jdk/1.7.0/latest/binaries/windows-i586
TESTCLASSES=C:/cygwin/home/dcubed/Projects/6942989/XXX
TESTVMOPTS=-showversion -client -Xmx32m
TESTSRC=C:/cygwin/home/dcubed/Projects/6942989/XXX

along with a 30 minute duration (1800 seconds) using JDK7-B99 bits
(which does NOT have the fix).

Here are the last 15 lines of each log file:

$ tail -15 AnonLogger.jdk7-B99_30min.log
INFO: instance_cnt = 1032505
INFO: instance_cnt = 1038305
INFO: instance_cnt = 1044005
INFO: instance_cnt = 1049705
INFO: jmap exited with exit code = 1
INFO: The likely reason is that AnonLoggerWeakRefLeak has finished running.
INFO: increasing_cnt = 181
INFO: decreasing_cnt = 0
INFO: The instance count of java.lang.ref.WeakReference objects
INFO: is always increasing.
FAIL: This indicates that there is a memory leak.

real    18m10.296s
user    0m41.570s
sys     0m46.442s

$ tail -15 AnonLoggerWeakRefLeak.jdk7-B99_30min.log
INFO: call count = 1044000
INFO: call count = 1045000
INFO: call count = 1046000
INFO: call count = 1047000
INFO: call count = 1048000
INFO: call count = 1049000
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
        at java.util.Arrays.copyOf(Arrays.java:2239)
        at java.util.Arrays.copyOf(Arrays.java:2213)
        at java.util.ArrayList.grow(ArrayList.java:208)
        at java.util.ArrayList.ensureCapacity(ArrayList.java:182)
        at java.util.ArrayList.add(ArrayList.java:406)
        at java.util.logging.Logger.doSetParent(Logger.java:1401)
        at java.util.logging.Logger.getAnonymousLogger(Logger.java:374)
        at AnonLoggerWeakRefLeak.main(AnonLoggerWeakRefLeak.java:60)


So it took a little over 18 minutes for a Client VM with a 32MB
max heap size to throw an OOME with AnonLoggerWeakRefLeak.
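
For context, the reproducer boils down to a tight loop of
Logger.getAnonymousLogger() calls. Here's a simplified sketch, not the
exact test code in the changeset (which adds duration and reporting
controls):

import java.util.logging.Logger;

public class AnonLoggerLeakSketch {
    public static void main(String[] args) {
        long count = 0;
        while (true) {
            // Each call creates a throwaway Logger. Without the fix, the
            // bookkeeping behind it (a WeakReference added to the parent
            // Logger's child list via doSetParent()) is never pruned, so
            // memory grows until the heap is exhausted.
            Logger.getAnonymousLogger();
            if (++count % 1000 == 0) {
                System.out.println("INFO: call count = " + count);
            }
        }
    }
}

With -client -Xmx32m, that leftover bookkeeping alone is enough to
exhaust the heap, which is exactly the OOME shown above.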


I ran the LoggerWeakRefLeak test with the same environment variables
along with a 30 minute duration (1800 seconds) using JDK7-B99 bits
(which does NOT have the fix).

Here are the last 15 lines of each log file:

$ tail -15 Logger.jdk7-B99_30min.log
INFO: instance_cnt = 935705
INFO: instance_cnt = 938605
INFO: instance_cnt = 941205
INFO: instance_cnt = 943205
INFO: jmap exited with exit code = 1
INFO: The likely reason is that LoggerWeakRefLeak has finished running.
INFO: increasing_cnt = 160
INFO: decreasing_cnt = 0
INFO: The instance count of java.lang.ref.WeakReference objects
INFO: is always increasing.
FAIL: This indicates that there is a memory leak.

real    16m47.690s
user    0m37.910s
sys     0m40.002s

$ tail -15 LoggerWeakRefLeak.jdk7-B99_30min.log
INFO: call count = 881000
INFO: call count = 882000
INFO: call count = 883000
INFO: call count = 884000
INFO: call count = 885000
INFO: call count = 886000
INFO: call count = 887000
INFO: call count = 888000
INFO: call count = 889000
INFO: call count = 890000
INFO: call count = 891000
INFO: call count = 892000
INFO: call count = 893000
INFO: call count = 894000
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space


So it took a little less than 17 minutes for a Client VM with a 32MB
max heap size to throw an OOME with LoggerWeakRefLeak.
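
LoggerWeakRefLeak is the same idea with named Loggers; a simplified
sketch (the real test's naming scheme may differ):

import java.util.logging.Logger;

public class NamedLoggerLeakSketch {
    public static void main(String[] args) {
        long count = 0;
        while (true) {
            // A unique name per iteration. The Logger itself becomes
            // unreachable almost immediately, but without the fix the
            // WeakReference bookkeeping that tracks it is never pruned.
            Logger.getLogger("NamedLoggerLeakSketch-" + count);
            if (++count % 1000 == 0) {
                System.out.println("INFO: call count = " + count);
            }
        }
    }
}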

*** (#1 of 3): 2010-06-28 22:45:03 GMT+00:00 [email protected]
*** Last Edit: 2010-06-28 22:57:39 GMT+00:00 [email protected]

I ran the AnonLoggerWeakRefLeak test with the same environment
variables for a 60 minute duration using the fixed bits from
JPRT. Here are the last 15 lines of each log (much less
interesting info):

$ tail -15 AnonLogger.jdk7-B100+_60min.log
INFO: instance_cnt = 5801
INFO: instance_cnt = 5901
INFO: instance_cnt = 5701
INFO: instance_cnt = 5601
INFO: jmap exited with exit code = 1
INFO: The likely reason is that AnonLoggerWeakRefLeak has finished running.
INFO: increasing_cnt = 227
INFO: decreasing_cnt = 383
INFO: The instance count of java.lang.ref.WeakReference objects
INFO: is both increasing and decreasing.
PASS: This indicates that there is not a memory leak.

real    60m5.099s
user    2m18.882s
sys     2m31.127s

$ tail -15 AnonLoggerWeakRefLeak.jdk7-B100+_60min.log
INFO: call count = 3484000
INFO: call count = 3485000
INFO: call count = 3486000
INFO: call count = 3487000
INFO: call count = 3488000
INFO: call count = 3489000
INFO: call count = 3490000
INFO: call count = 3491000
INFO: call count = 3492000
INFO: call count = 3493000
INFO: call count = 3494000
INFO: call count = 3495000
INFO: call count = 3496000
INFO: call count = 3497000
INFO: final loop count = 3497200


I ran the LoggerWeakRefLeak test with the same environment
variables for a 60 minute duration using the fixed bits from
JPRT. Here are the last 15 lines of each log (much less
interesting info):

$ tail -15 Logger.jdk7-B100+_60min.log
INFO: instance_cnt = 1813
INFO: instance_cnt = 1638
INFO: instance_cnt = 1514
INFO: instance_cnt = 1511
INFO: jmap exited with exit code = 1
INFO: The likely reason is that LoggerWeakRefLeak has finished running.
INFO: increasing_cnt = 293
INFO: decreasing_cnt = 318
INFO: The instance count of java.lang.ref.WeakReference objects
INFO: is both increasing and decreasing.
PASS: This indicates that there is not a memory leak.

real    60m6.783s
user    2m20.592s
sys     2m31.997s

$ tail -15 LoggerWeakRefLeak.jdk7-B100+_60min.log
INFO: call count = 3502000
INFO: call count = 3503000
INFO: call count = 3504000
INFO: call count = 3505000
INFO: call count = 3506000
INFO: call count = 3507000
INFO: call count = 3508000
INFO: call count = 3509000
INFO: call count = 3510000
INFO: call count = 3511000
INFO: call count = 3512000
INFO: call count = 3513000
INFO: call count = 3514000
INFO: call count = 3515000
INFO: final loop count = 3515800

*** (#2 of 3): 2010-06-29 01:46:49 GMT+00:00 [email protected]

I ran the AnonLoggerWeakRefLeak test with the same environment
variables for a 12 hour duration using the fixed bits from JPRT.
For this run I saved the "Total" line from the jmap output of
every 10th sample:

$ diff ../AnonLoggerWeakRefLeak.sh AnonLoggerWeakRefLeak.sh
161a162
> > "$TEST_NAME.totals"
225a227,233
>
> set +e
> mod=`expr "$loop_cnt" % 10`
> set -e
> if [ "$mod" = 0 ]; then
> tail -1 "$TEST_NAME.jmap" >> "$TEST_NAME.totals"
> fi

Here is an analysis of the .totals data:

$ sh analyze_totals < AnonLoggerWeakRefLeak.totals
          #objs      #bytes
first:    30537     2243648
lo:       30537     2243648
hi:       57072     3197904
last:     35676     2882488
avg:      36853     2929982
# samples: 647

The first sample is also the lowest set of values, which isn't a
surprise given that the first sample is taken shortly after the
target program has started running. The hi value occurred in
sample #269 of 647, and the last sample was below the average.
The values both rise and fall over time, which does not point to
any remaining memory leak.
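
The analyze_totals script isn't shown here; it just reports
first/lo/hi/last/avg over the saved jmap "Total" lines. Here's a rough
Java equivalent (a sketch, not the actual script; it assumes each
.totals line looks like 'Total <#objs> <#bytes>' and picks lo/hi by
the #objs column):

import java.io.BufferedReader;
import java.io.InputStreamReader;

public class AnalyzeTotals {
    public static void main(String[] args) throws Exception {
        long[] first = null, lo = null, hi = null, last = null;
        long sumObjs = 0, sumBytes = 0, samples = 0;

        BufferedReader in = new BufferedReader(new InputStreamReader(System.in));
        for (String line; (line = in.readLine()) != null; ) {
            String[] f = line.trim().split("\\s+");
            if (f.length < 3 || !f[0].equals("Total")) {
                continue;                        // skip anything that isn't a "Total" line
            }
            long[] cur = { Long.parseLong(f[1]), Long.parseLong(f[2]) };
            if (first == null) { first = cur; lo = cur; hi = cur; }
            if (cur[0] < lo[0]) lo = cur;        // track lo/hi by the #objs column
            if (cur[0] > hi[0]) hi = cur;
            last = cur;
            sumObjs += cur[0]; sumBytes += cur[1]; samples++;
        }
        if (samples == 0) {
            System.out.println("# samples: 0");
            return;
        }
        System.out.println("          #objs      #bytes");
        System.out.printf("first: %8d %11d%n", first[0], first[1]);
        System.out.printf("lo:    %8d %11d%n", lo[0], lo[1]);
        System.out.printf("hi:    %8d %11d%n", hi[0], hi[1]);
        System.out.printf("last:  %8d %11d%n", last[0], last[1]);
        System.out.printf("avg:   %8d %11d%n", sumObjs / samples, sumBytes / samples);
        System.out.println("# samples: " + samples);
    }
}

Run it as, e.g., 'java AnalyzeTotals < AnonLoggerWeakRefLeak.totals'.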



I also ran the LoggerWeakRefLeak test with the same environment
variables for a 12 hour duration using the fixed bits from JPRT.
For this run I saved the "Total" line from the jmap output of
every 10th sample:

$ diff ../LoggerWeakRefLeak.sh LoggerWeakRefLeak.sh
161a162
> > "$TEST_NAME.totals"
225a227,233
>
> set +e
> mod=`expr "$loop_cnt" % 10`
> set -e
> if [ "$mod" = 0 ]; then
> tail -1 "$TEST_NAME.jmap" >> "$TEST_NAME.totals"
> fi

Here is an analysis of the .totals data:

$ sh analyze_totals < LoggerWeakRefLeak.totals
          #objs      #bytes
first:    48957     2832648
lo:       48957     2832648
hi:      148299     6299016
last:    133173     5451032
avg:     137578     5608654
# samples: 647

The first sample is also the lowest set of values, which isn't a
surprise given that the first sample is taken shortly after the
target program has started running. The hi value occurred in
sample #76 of 647, and the last sample was below the average.
The values both rise and fall over time, which does not point to
any remaining memory leak.

*** (#3 of 3): 2010-06-29 15:59:39 GMT+00:00 [email protected]


Please let me know if the fix is not working for you in your
environment at Google.

Dan


On 6/23/2010 6:38 PM, Daniel D. Daugherty wrote:
On 6/23/2010 12:38 PM, Jeremy Manson wrote:
Hi Daniel,

I'm sorry I missed this (I heavily filter these lists, and check rarely).

This time I specifically left you on the "To" list rather than
editing down to just the list aliases.


My main feeling is that you are missing a good bet by not
reconstructing the Hashtable in LogManager and the ArrayList in Logger
every so often when you remove the loggers.  In a test case where
there are a LOT of short-lived loggers, the backing array for the
Hashtable can get very big.  It is permanent, and doesn't go anywhere.
 You can end up with a lot of extra memory lying around that way.

It's possible that is a good bet. However, that wasn't mentioned
as one of the issues in either bug report (I think) so I didn't
write a test for that.


Specifically, when I didn't reconstruct those data structures, the
test case listed in the bug (where it just creates lots and lots of
anonymous loggers) killed the Java instance with an OOM, even if I
*did* clean up the weakreferences to the loggers.

I'll create a variant of the anon logger test that I put in the
changeset and check whether I can kill the Java instance with an OOM.
I'll keep you posted.


I'm assuming you have a customer waiting for this - if that is similar
to their usage pattern, this fix may not fix their problem.

Thanks for the heads up. The JDK6 version of the fix hasn't been tested
by the customer yet so you might be right.


You obviously don't want to rebuild those structures every time,
though.  What I did in my change was to reconstruct the backing data
structures every time ~as many loggers were collected as were present
in the data structure.

Yup. I caught that part of the rebuild algorithm. It's just that the
reason for doing the rebuild didn't jump out at me.
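
For anyone else following along, that rebuild heuristic boils down to
something like this (a rough sketch with made-up names, not the actual
code from either change):

import java.util.Hashtable;

class LoggerTableSketch {
    // Once roughly as many entries have been removed as are still live,
    // copy the live entries into a fresh Hashtable so the backing array
    // shrinks back down from its peak size.
    private Hashtable<String, Object> loggers = new Hashtable<String, Object>();
    private int removedSinceRebuild = 0;

    synchronized void add(String name, Object logger) {
        loggers.put(name, logger);
    }

    synchronized void removeStale(String name) {
        if (loggers.remove(name) != null) {
            removedSinceRebuild++;
            if (removedSinceRebuild >= loggers.size()) {
                // The copy constructor sizes the new table for the live
                // entries, dropping the oversized array built up at peak load.
                loggers = new Hashtable<String, Object>(loggers);
                removedSinceRebuild = 0;
            }
        }
    }
}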

Dan


Jeremy

On Fri, Jun 18, 2010 at 12:25 PM, Daniel D. Daugherty
<[email protected]> wrote:
Greetings,

I have a new version of my fix for the WeakReference leak in the
Logging API done. This version uses ReferenceQueues; thanks to Eamonn
McManus, Jeremy Manson and Tony Printezis for their insights on using
ReferenceQueues. Here's a pointer to Tony's paper for background info:

   http://java.sun.com/developer/technicalArticles/javase/finalization/

This version also has limits on the number of dead Loggers that are
cleaned up per call; thanks to Alan Bateman for politely pushing me in
that direction.
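
In sketch form, the approach looks something like this (illustrative
names and an arbitrary cap, not the actual java.util.logging code; the
real thing is in the webrev below):

import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.WeakReference;
import java.util.ArrayList;

class LoggerRefCleanupSketch {
    // Each Logger is tracked through a WeakReference registered with a
    // ReferenceQueue; once the Logger is collected, its reference shows up
    // on the queue and the stale bookkeeping entry can be dropped.
    // MAX_PER_CALL caps how much cleanup any one caller pays for.
    private static final int MAX_PER_CALL = 400;   // illustrative cap only
    private final ReferenceQueue<Object> queue = new ReferenceQueue<Object>();
    private final ArrayList<WeakReference<Object>> entries =
            new ArrayList<WeakReference<Object>>();

    synchronized void register(Object logger) {
        drainSomeStaleEntries();                   // piggyback cleanup on normal calls
        entries.add(new WeakReference<Object>(logger, queue));
    }

    private void drainSomeStaleEntries() {
        for (int i = 0; i < MAX_PER_CALL; i++) {
            Reference<?> ref = queue.poll();       // non-blocking
            if (ref == null) {
                break;                             // no more collected Loggers right now
            }
            entries.remove(ref);                   // drop the dead entry
        }
    }
}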

The webrev is again relative to OpenJDK7, but the bug is escalated so
the fix will be backported to the JDK6-Update train. So again, I'll
need a minimum of two code reviewers.

Here is the URL for the webrev:

   http://cr.openjdk.java.net/~dcubed/6942989-webrev/1/

Thanks, in advance, for any reviews.

Dan


