Hi Eric,
Thanks for the reply. Yeah, we're experimenting with reducing the
initiating heap occupancy to 55%, hopefully letting the concurrent
marking complete before the allocation and promotion rate results in a
Full GC. In particular, I'm not sure how to interpret the gc log
snippets I posted showing a young evac with an empty Young gen, and
then followed by a Full GC due to heap expansion failure for a 32 byte
allocation. I'm likely missing something in my interpretation.
I think you're also on point about the max tenuring threshold being
too low - need to see how that value was arrived at.
Thanks
On Wed, Aug 24, 2016 at 5:20 PM, Eric Caspole <eric.casp...@oracle.com
<mailto:eric.casp...@oracle.com>> wrote:
I have not used G1 in JDK 8 that much but the two trouble spots to
me are:
-XX:InitiatingHeapOccupancyPercent=75
-XX:MaxTenuringThreshold=2
So this will tenure very quickly, filling up old gen and start the
marking relatively late at 75%. This looks like it is pretty
likely to end up in a STW full GC.
Since you do have a huge amount of garbage getting collected in
the full gc maybe try letting more of it die off in young gen with
higher tenuring threshold and also start marking earlier than 75%.
good luck,
Eric
On 08/24/2016 02:43 PM, Vitaly Davidovich wrote:
Hi guys,
Hoping someone could shed some light on G1 behavior (as seen
from the gc
log) that I'm having a hard time understanding. The root
problem is G1
enters a Full GC that takes many tens of seconds, and need
some advice
on what could be causing it.
First, some basic info:
Java HotSpot(TM) 64-Bit Server VM (25.91-b14) for linux-amd64 JRE
(1.8.0_91-b14), built on Apr 1 2016 00:57:21 by "java_re"
with gcc
4.3.0 20080428 (Red Hat 4.3.0-8)
Memory: 4k page, physical 264115728k(108464820k free), swap
0k(0k free)
CommandLine flags: -XX:G1HeapWastePercent=5
-XX:G1MixedGCCountTarget=4
-XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=<some_path>
-XX:InitialCodeCacheSize=104857600
-XX:InitialHeapSize=103079215104
-XX:InitialTenuringThreshold=2
-XX:InitiatingHeapOccupancyPercent=75
-XX:+ManagementServer -XX:MaxGCPauseMillis=300
-XX:MaxHeapSize=103079215104 -XX:MaxNewSize=32212254720
-XX:MaxTenuringThreshold=2 -XX:NewSize=32212254720
-XX:+ParallelRefProcEnabled -XX:+PrintAdaptiveSizePolicy
-XX:+PrintCommandLineFlags -XX:+PrintCompilation
-XX:PrintFLSStatistics=1 -XX:+PrintGC
-XX:+PrintGCApplicationStoppedTime
-XX:+PrintGCDateStamps -XX:+PrintGCDetails -XX:+PrintGCTimeStamps
-XX:+PrintPromotionFailure -XX:+PrintReferenceGC
-XX:+PrintSafepointStatistics -XX:PrintSafepointStatisticsCount=1
-XX:+PrintTenuringDistribution -XX:ReservedCodeCacheSize=104857600
-XX:SurvivorRatio=9 -XX:-UseAdaptiveSizePolicy -XX:+UseG1GC
Swap is disabled. THP is disabled.
First issue I have a question about:
2016-08-24T15:29:12.302+0000: 17776.029: [GC pause (G1
Evacuation Pause)
(young)
Desired survivor size 1795162112 bytes, new threshold 2 (max 2)
17776.029: [G1Ergonomics (CSet Construction) start choosing CSet,
_pending_cards: 0, predicted base time: 14.07 ms, remaining
time: 285.93
ms, target pause time: 300.00 ms]
17776.029: [G1Ergonomics (CSet Construction) add young
regions to CSet,
eden: 0 regions, survivors: 0 regions, predicted young region
time: 0.00 ms]
17776.029: [G1Ergonomics (CSet Construction) finish choosing
CSet,
eden: 0 regions, survivors: 0 regions, old: 0 regions,
predicted pause
time: 14.07 ms, target pause time: 300.00 ms]
2016-08-24T15:29:12.305+0000: 17776.033: [SoftReference, 0 refs,
0.0012417 secs]2016-08-24T15:29:12.307+0000: 17776.034:
[WeakReference,
0 refs, 0.0007101 secs]2016-08-24T15:29:12.307+0000: 17776.035:
[FinalReference, 0 refs, 0.0007027
secs]2016-08-24T15:29:12.308+0000:
17776.035: [PhantomReference, 0 refs, 0 refs, 0.0013585
secs]2016-08-24T15:29:12.309+0000: 17776.037: [JNI Weak Reference,
0.0000118 secs], 0.0089758 secs]
[Parallel Time: 3.1 ms, GC Workers: 23]
[GC Worker Start (ms): Min: 17776029.2, Avg: 17776029.3,
Max:
17776029.4, Diff: 0.2]
[Ext Root Scanning (ms): Min: 0.8, Avg: 1.1, Max: 2.8,
Diff: 1.9,
Sum: 24.2]
[Update RS (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff:
0.0, Sum: 0.1]
[Processed Buffers: Min: 0, Avg: 0.1, Max: 1, Diff:
1, Sum: 2]
[Scan RS (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0,
Sum: 0.0]
[Code Root Scanning (ms): Min: 0.0, Avg: 0.0, Max: 0.0,
Diff: 0.0,
Sum: 0.0]
[Object Copy (ms): Min: 0.0, Avg: 0.1, Max: 0.1, Diff:
0.1, Sum: 1.2]
[Termination (ms): Min: 0.0, Avg: 1.6, Max: 1.8, Diff:
1.8, Sum: 37.9]
[Termination Attempts: Min: 1, Avg: 1.0, Max: 1,
Diff: 0, Sum: 23]
[GC Worker Other (ms): Min: 0.0, Avg: 0.0, Max: 0.0,
Diff: 0.0,
Sum: 0.2]
[GC Worker Total (ms): Min: 2.7, Avg: 2.8, Max: 2.9,
Diff: 0.2,
Sum: 63.8]
[GC Worker End (ms): Min: 17776032.0, Avg: 17776032.0, Max:
17776032.1, Diff: 0.0]
[Code Root Fixup: 0.2 ms]
[Code Root Purge: 0.0 ms]
[Clear CT: 0.4 ms]
[Other: 5.3 ms]
[Choose CSet: 0.0 ms]
[Ref Proc: 4.4 ms]
[Ref Enq: 0.3 ms]
[Redirty Cards: 0.3 ms]
[Humongous Register: 0.1 ms]
[Humongous Reclaim: 0.1 ms]
[Free CSet: 0.0 ms]
*[Eden: 0.0B(30.0G)->0.0B(30.0G) Survivors: 0.0B->0.0B Heap:
95.2G(96.0G)->95.2G(96.0G)]*
[Times: user=0.08 sys=0.00, real=0.01 secs]
2016-08-24T15:29:12.311+0000: 17776.038: Total time for which
application threads were stopped: 0.0103002 seconds, Stopping
threads
took: 0.0000566 seconds
17776.039: [G1Ergonomics (Heap Sizing) attempt heap
expansion, reason:
allocation request failed, allocation request: 32 bytes]
17776.039: [G1Ergonomics (Heap Sizing) expand the heap, requested
expansion amount: 33554432 bytes, attempted expansion amount:
33554432
bytes]
17776.039: [G1Ergonomics (Heap Sizing) did not expand the
heap, reason:
heap already fully expanded]
2016-08-24T15:29:12.312+0000: 17776.039: [Full GC (Allocation
Failure)
2016-08-24T15:29:40.727+0000: 17804.454: [SoftReference, 5504
refs,
0.0012432 secs]2016-08-24T15:29:40.728+0000: 17804.456:
[WeakReference,
1964 refs, 0.0003012 secs]2016-08-24T15:29:40.728+0000: 17804.456:
[FinalReference, 3270 refs, 0.0033290
secs]2016-08-24T15:29:40.732+0000:
17804.459: [PhantomReference, 0 refs, 75 refs, 0.0000257
secs]2016-08-24T15:29:40.732+0000: 17804.459: [JNI Weak Reference,
0.0000172 secs] 95G->38G(96G), 95.5305034 secs]
[Eden: 0.0B(30.0G)->0.0B(30.0G) Survivors: 0.0B->0.0B Heap:
95.2G(96.0G)->38.9G(96.0G)], [Metaspace:
104180K->103365K(106496K)]
* [Times: user=157.02 sys=0.28, real=95.54 secs] *
*
*
So here we have a lengthy full GC pause that collects quite a
bit of old
gen (expected). Right before this is a young evac pause.
Why is the heap sizing (bolded) reported after the evac pause
showing
empty Eden+Survivor?
Why is ergonomic info reporting 0 regions selected (i.e. what's
evacuated then)?
Right before the Full GC, ergonomics report a failure to
expand the heap
due to an allocation request of 32 bytes. Is this implying that a
mutator tried to allocate 32 bytes but couldn't? How do I
reconcile that
with Eden+Survivor occupancy reported right above that?
Young gen is sized to 30GB, total heap is 96GB. Allocation
rate of the
application is roughly 1GB/s. Am I correct in assuming that
allocation
is outpacing concurrent marking, based on the above? What
tunable(s)
would you advise to tweak to get G1 to keep up with the
allocation rate?
I'm ok taking some throughput hit to mitigate 90s+ pauses.
Let me know if any additional info is needed (I have the full
GC log,
and can attach that if desired).
Thanks
_______________________________________________
hotspot-gc-use mailing list
hotspot-gc-use@openjdk.java.net
<mailto:hotspot-gc-use@openjdk.java.net>
http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use
<http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use>
_______________________________________________
hotspot-gc-use mailing list
hotspot-gc-use@openjdk.java.net
<mailto:hotspot-gc-use@openjdk.java.net>
http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use
<http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use>
_______________________________________________
hotspot-gc-use mailing list
hotspot-gc-use@openjdk.java.net
http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use