I think that the cause and effect are backwards here... The application 
caused this behavior directly, and what it does is "normal" and "typical". 
Given the Stop-the-world nature of G1's newgen collector, you actually "got 
off easy", and saw a pause of "only" 1.8 seconds...

The significant thing about the first collection with the lower region 
count [the 2nd collection in your post] is the fact that a large amount of 
recently allocated objects are alive at the time of collection. 
Specifically, ~1421MB [9920MB - (33.2GB - 24.9GB)] out of the 9920MB 
allocated in the ~10 seconds preceding the 1.8 second pause, of which 
1280MB was relocated to survivors, and the rest (~140MB) was presumably 
promoted. This is dramatically different behavior compared to the previous 
collection cycles (where only ~64MB out of ~117GB were alive at the time of 
collection). You can see this application behavior continue through the 
next ~130 seconds (looking at the log file link): the next cycle after the 
"different" behavior started covered an allocation period of ~30 seconds, 
during which 8760MB were newly allocated, and of which ~1158MB stayed alive 
[8760MB - (33.4GB - 26.0GB)]. The next cycle after that covered an 
allocation period of ~100 seconds, during which 8736MB were newly 
allocated, and of which ~646MB stayed alive [8736MB - (34.5GB - 26.6GB)].

We are looking at a period of time in the application execution that spans 
~140 seconds, during which a total of ~27416MB were newly allocated, and of 
which ~3225MB stuck around long enough to be copied around and promoted. 
This ~3.15GB of new [non ephemerally dying] state is very real, it's just 
not "typical" of other times in the log for the same application.
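The bracketed arithmetic above can be reproduced directly from the [Eden: ...] 
and [Heap: ...] lines of the log. A minimal sketch (Python, just for the 
arithmetic): the surviving amount is what was newly allocated minus what the 
heap shrank by during the pause.

```python
GB = 1024  # MB per GB

def survived_mb(allocated_mb, heap_before_gb, heap_after_gb):
    # Survivors + promotions = newly allocated - amount reclaimed by the pause
    reclaimed_mb = (heap_before_gb - heap_after_gb) * GB
    return allocated_mb - reclaimed_mb

# (eden allocated MB, heap before GB, heap after GB), from the log:
cycles = [
    (9920, 33.2, 24.9),  # the 1.8 sec pause
    (8760, 33.4, 26.0),  # next cycle (~30 sec of allocation)
    (8736, 34.5, 26.6),  # cycle after that (~100 sec of allocation)
]
for alloc, before, after in cycles:
    print(f"{survived_mb(alloc, before, after):.0f}MB survived out of {alloc}MB allocated")
```

(The log's heap figures are rounded to one decimal GB, so the results can be 
off by a few tens of MB from the exact numbers.)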

The explanation for this is usually quite simple: it suggests a 
phase-change in application behavior, with a large amount of new state that 
is being kept for a while. This is not "wrong" in any way. In fact, it is 
extremely common and typical to see such phase change behaviors across many 
application types.  E.g. a refresh of a cache, a reloading of a product 
catalog, a reindexing, a loading of data for an analytics computation, a 
rejoining of a cluster, the reaction to SSTable compaction, or nodes 
coming and going in Cassandra, Elastic, Zookeeper, etc. etc. etc.

The earlier GC trigger with the smaller region count may very well be a 
reaction to a dramatic change in allocation rate, rather than the cause of 
the large pause. You actually "got lucky" here. In cases where such a phase 
change occurs but the allocation rate change is not "violent enough" to 
heuristically trigger an early newgen collection, the newgen pause on the 
phase change would have been much bigger [this is a common problem with 
large STW newgens, since some phase changes only affect data/state but may 
not change allocation rate dramatically]. In fact, the allocation rate 
going into the 2nd and 3rd collections in this ~140 second period is much 
more relaxed than the one leading to the first collection, and may not 
have been high enough to trigger more aggressive collections on their own. 
If the collector had waited for the whole ~140 seconds before triggering a 
collection in this case, the pause may have taken >2x longer, because it 
would have had to move ~3.15GB around in a single pause rather than "just" 
~1.4GB. 

At this point I often get questions from people that go something like 
this: "Isn't G1 supposed to break up these large pauses into smaller 
"increment" pauses?". While G1 does incremental STW compaction in *oldgen* 
(spreading the pause across increments), its newgen collector is 
effectively monolithic: the whole of newgen must be collected in a single 
STW pause. Newgen pauses are shorter primarily (only?) because they do not 
have to trace all live objects in the heap, and focus on tracing only the newgen 
objects and all roots to them. But this also means that the collector has 
no way to determine the amount of live matter that will be found in the 
newgen until after a monolithic cycle has completed; and to complete it, 
the set of regions in the newgen must fundamentally be selected without 
that knowledge. If the "guess" is right (e.g. "they are mostly 
dead" and "last time only X MB stayed alive"), you get a very efficient 
newgen with a relatively small pause. But if the "guess" was wrong (1421MB 
found alive in 9.9GB of allocation, even though last time only 64MB was 
found alive in 117GB of allocation, as was the case here), it's too late. 
You're in for a big pause.
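To make this concrete, here is a toy model (assumed constants, not G1's 
actual ergonomics code; the fixed overhead and per-MB copy cost are picked 
only to roughly mirror the log above) of a collector that predicts the next 
pause from the live fraction observed in prior cycles:

```python
def predicted_pause_ms(newgen_mb, live_fraction, base_ms=90.0, copy_ms_per_mb=1.2):
    # Assumed cost model: fixed overhead + copy cost proportional to live bytes
    return base_ms + newgen_mb * live_fraction * copy_ms_per_mb

# "Guess" based on history: ~64MB found alive out of ~117GB allocated
steady_fraction = 64 / (117 * 1024)
# What the phase change actually delivered: ~1421MB alive out of 9920MB
phase_fraction = 1421 / 9920

guess = predicted_pause_ms(9920, steady_fraction)   # in the ~0.1 sec range
actual = predicted_pause_ms(9920, phase_fraction)   # in the ~1.8 sec range
print(f"predicted ~{guess:.0f}ms, actual ~{actual:.0f}ms")
```

The prediction is off by more than an order of magnitude, and nothing in the 
collection history could have warned the collector. That is exactly the point.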

The problem/dilemma that you see here is inherent in all monolithic 
stop-the-world relocating newgen collectors. For all such 
collectors [at least currently], the only way to reduce the newgen pause in 
the presence of a high amount of newly allocated surviving objects is to 
significantly reduce the newgen size (e.g. trigger at ~1GB of allocation 
rather than after 10GB, in the hope that only 1/10th of the stuff would 
need to be relocated in a single pause). Unfortunately the collector has no 
idea how much newgen material will survive before it starts the collection, 
and its guesses based on prior collections are dramatically and inherently 
wrong at a phase change. It is stuck with a choice between running very 
inefficient collections (keep newgen at a few hundred MB, even though you have a 
196GB heap, just in case a large amount of relatively-long-lasting state is 
allocated during a phase change) and accepting a huge pause on a phase 
change. The only ways out of this dilemma are to change how the collector 
deals with newgen: either break the newgen into several (more than 1 or 2) 
incremental-STW sub-parts (something no JVM has done to date, AFAIK), or 
use a concurrent relocating newgen (which by now is a well-proven 
technique).
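The inefficiency side of that trade-off is easy to put numbers on. A 
back-of-envelope sketch with assumed costs (the same toy 
fixed-overhead-plus-copy-cost model as above, not measured values), applied 
to the ~27416MB allocated / ~3225MB surviving phase in this log:

```python
def pause_profile(total_alloc_mb, newgen_mb, live_fraction,
                  base_ms=90.0, copy_ms_per_mb=1.2):
    # Assumed cost model: fixed per-pause overhead + copy cost per live MB
    per_pause_ms = base_ms + newgen_mb * live_fraction * copy_ms_per_mb
    pauses = total_alloc_mb / newgen_mb
    return per_pause_ms, pauses * per_pause_ms

live_fraction = 3225 / 27416  # surviving fraction during the phase change
for newgen_mb in (1024, 10240):  # "trigger at ~1GB" vs. "after 10GB"
    worst, total = pause_profile(27416, newgen_mb, live_fraction)
    print(f"newgen={newgen_mb}MB: worst pause ~{worst:.0f}ms, "
          f"total pause time ~{total:.0f}ms")
```

The small newgen caps the worst pause but spends noticeably more total time 
paused (all those fixed per-pause overheads); the big newgen is cheaper 
overall but hands you the multi-second pause. That's the dilemma in two 
lines of arithmetic.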

Note that even during this different phase of operation the weak 
generational hypothesis is holding (85%+ of newly allocated objects have 
died). It's just that outside of such phase shifts, most applications 
demonstrate the weak generational hypothesis in a much more dramatic way 
(with 99%+ death rates, and with the rate growing with larger and larger 
newgens). So the newgen efficiency is still there (compared to dealing with 
the same workload with a non-generational or oldgen collection), but now it 
comes with huge newgen pauses...

To summarize your situation: Basically, the large amount of copying (either 
to survivors within newgen or via promotion to oldgen) is a result of 
program behavior. The collector is just reacting to that behavior. And 
unfortunately for all monolithic STW relocating newgen collectors, there 
isn't much you can do about that. Since several 
newly-allocated-but-surviving GBs need to be relocated in a relatively 
short period of time (10s of seconds), you are going to see pause times 
that amount to ~1-2 sec per relocated GB over that period of time. This 
means that you will see significant pause artifacts in your run no matter 
what (as long as you are using a STW newgen).
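As a sanity check, the 1.8 sec pause in the log matches that rule of thumb, 
using the ~1421MB that was actually copied (1280MB to survivors plus the 
~140MB presumably promoted, per the earlier arithmetic):

```python
copied_gb = 1421 / 1024   # ~1280MB to survivors + ~140MB promoted
pause_sec = 1.80          # from "[Times: ... real=1.80 secs]" in the log
print(f"~{pause_sec / copied_gb:.1f} sec per relocated GB")
```

That comes out to ~1.3 sec per relocated GB, squarely in the 1-2 sec range.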

The actual bottom line is very simple: You need a concurrent newgen to 
handle this workload, with these phase changes, without any of those 
"hiccups". We can discuss that in other posts if you want ;-).

On Friday, June 2, 2017 at 11:39:01 AM UTC-7, Alexandr Nikitin wrote:
>
> Hi all,
>
> I encountered a strange behavior of JVM G1 GC. After a couple of hours of 
> normal work it shrinks the number of regions by 10 times. That causes a lot 
> of copying and promotions, GC times jump dramatically. After a dozen of 
> those unsuccessful collections it brings the number of regions back to the 
> previous normal number. Please help me understand its motivation.
>
> The full GC log: 
> https://gist.githubusercontent.com/alexandrnikitin/6e214dff1ba0f9c4930400c00ecfbdf9/raw/da7c941799e9a359c8ae3f612b503f8afae2dff0/full-strange-gc.log
> The hiccup and healing part only: 
> https://gist.githubusercontent.com/alexandrnikitin/6e214dff1ba0f9c4930400c00ecfbdf9/raw/da7c941799e9a359c8ae3f612b503f8afae2dff0/strange-gc.log
>
> Here's the transition part only. As you can see the first collection went 
> almost as usual  - 180ms (usually 40-50ms) Then G1 decides to reduce number 
> of regions from 3762 to 310. GC took 1.80 secs after that.
> 2017-06-02T18:46:56.010+0200: 10178.264: [GC pause (G1 Evacuation Pause) 
> (young)
> Desired survivor size 7902068736 bytes, new threshold 15 (max 15)
> - age   1:    3862296 bytes,    3862296 total
> - age   2:     459784 bytes,    4322080 total
> - age   3:     587472 bytes,    4909552 total
> - age   4:     383864 bytes,    5293416 total
> - age   5:     399064 bytes,    5692480 total
> - age   6:     363896 bytes,    6056376 total
> - age   7:     323632 bytes,    6380008 total
> - age   8:     357584 bytes,    6737592 total
> - age   9:     323312 bytes,    7060904 total
> - age  10:     328672 bytes,    7389576 total
> - age  11:     986184 bytes,    8375760 total
> - age  12:     222176 bytes,    8597936 total
> - age  13:     188536 bytes,    8786472 total
> - age  14:     207232 bytes,    8993704 total
> - age  15:     120168 bytes,    9113872 total
>  10178.275: [G1Ergonomics (CSet Construction) start choosing CSet, 
> _pending_cards: 834913, predicted base time: 89.27 ms, remaining time: 
> 110.73 ms, target pause time: 200.00 ms]
>  10178.275: [G1Ergonomics (CSet Construction) add young regions to CSet, 
> eden: 3762 regions, survivors: 1 regions, predicted young region time: 8.28 
> ms]
>  10178.275: [G1Ergonomics (CSet Construction) finish choosing CSet, eden: 
> 3762 regions, survivors: 1 regions, old: 0 regions, predicted pause time: 
> 97.55 ms, target pause time: 200.00 ms]
> 2017-06-02T18:46:56.121+0200: 10178.375: [SoftReference, 0 refs, 0.0011474 
> secs]2017-06-02T18:46:56.122+0200: 10178.376: [WeakReference, 62 refs, 
> 0.0005960 secs]2017-06-02T18:46:56.123+0200: 10178.377: [FinalReference, 
> 536 refs, 0.0023264 secs]2017-06-02T18:46:56.125+0200: 10178.379: 
> [PhantomReference, 0 refs, 0 refs, 0.0014977 
> secs]2017-06-02T18:46:56.127+0200: 10178.381: [JNI Weak Reference, 
> 0.0000471 secs], 0.1337364 secs]
>    [Parallel Time: 97.7 ms, GC Workers: 28]
>       [GC Worker Start (ms): Min: 10178276.8, Avg: 10178277.6, Max: 
> 10178278.2, Diff: 1.4]
>       [Ext Root Scanning (ms): Min: 0.0, Avg: 0.8, Max: 6.4, Diff: 6.4, 
> Sum: 23.4]
>       [Update RS (ms): Min: 86.6, Avg: 91.4, Max: 91.7, Diff: 5.2, Sum: 
> 2559.4]
>          [Processed Buffers: Min: 104, Avg: 118.4, Max: 143, Diff: 39, 
> Sum: 3315]
>       [Scan RS (ms): Min: 0.7, Avg: 1.0, Max: 1.1, Diff: 0.3, Sum: 26.9]
>       [Code Root Scanning (ms): Min: 0.0, Avg: 0.0, Max: 0.3, Diff: 0.3, 
> Sum: 0.7]
>       [Object Copy (ms): Min: 2.3, Avg: 2.5, Max: 3.0, Diff: 0.6, Sum: 
> 71.2]
>       [Termination (ms): Min: 0.0, Avg: 0.4, Max: 0.6, Diff: 0.6, Sum: 
> 12.1]
>          [Termination Attempts: Min: 1, Avg: 1.0, Max: 1, Diff: 0, Sum: 28]
>       [GC Worker Other (ms): Min: 0.0, Avg: 0.3, Max: 0.5, Diff: 0.5, Sum: 
> 7.6]
>       [GC Worker Total (ms): Min: 95.7, Avg: 96.5, Max: 97.3, Diff: 1.6, 
> Sum: 2701.4]
>       [GC Worker End (ms): Min: 10178373.9, Avg: 10178374.1, Max: 
> 10178374.3, Diff: 0.4]
>    [Code Root Fixup: 0.1 ms]
>    [Code Root Purge: 0.0 ms]
>    [Clear CT: 9.0 ms]
>    [Other: 26.9 ms]
>       [Choose CSet: 0.0 ms]
>       [Ref Proc: 6.1 ms]
>       [Ref Enq: 0.2 ms]
>       [Redirty Cards: 0.3 ms]
>       [Humongous Register: 1.7 ms]
>       [Humongous Reclaim: 0.7 ms]
>       [Free CSet: 6.5 ms]
>    [Eden: 117.6G(117.6G)->0.0B(117.5G) Survivors: 32.0M->96.0M Heap: 
> 138.2G(196.0G)->20.6G(196.0G)]
>  [Times: user=2.99 sys=0.02, real=0.13 secs] 
> 2017-06-02T18:46:56.144+0200: 10178.398: Total time for which application 
> threads were stopped: 0.2457828 seconds, Stopping threads took: 0.1110667 
> seconds
> 2017-06-02T18:46:56.145+0200: 10178.399: Total time for which application 
> threads were stopped: 0.0006996 seconds, Stopping threads took: 0.0001565 
> seconds
> 2017-06-02T18:46:56.145+0200: 10178.399: Total time for which application 
> threads were stopped: 0.0003017 seconds, Stopping threads took: 0.0000817 
> seconds
> 2017-06-02T18:46:56.146+0200: 10178.400: Total time for which application 
> threads were stopped: 0.0002389 seconds, Stopping threads took: 0.0000372 
> seconds
> 2017-06-02T18:46:56.146+0200: 10178.400: Total time for which application 
> threads were stopped: 0.0002567 seconds, Stopping threads took: 0.0000476 
> seconds
> 2017-06-02T18:46:59.820+0200: 10182.074: Total time for which application 
> threads were stopped: 0.0006631 seconds, Stopping threads took: 0.0001533 
> seconds
> 2017-06-02T18:46:59.820+0200: 10182.074: Total time for which application 
> threads were stopped: 0.0002674 seconds, Stopping threads took: 0.0000527 
> seconds
> 2017-06-02T18:47:06.354+0200: 10188.608: [GC pause (G1 Evacuation Pause) 
> (young)
> Desired survivor size 671088640 bytes, new threshold 15 (max 15)
> - age   1:   49132824 bytes,   49132824 total
> - age   2:    1745176 bytes,   50878000 total
> - age   3:     417632 bytes,   51295632 total
> - age   4:     411672 bytes,   51707304 total
> - age   5:     371712 bytes,   52079016 total
> - age   6:     377456 bytes,   52456472 total
> - age   7:     344840 bytes,   52801312 total
> - age   8:     317216 bytes,   53118528 total
> - age   9:     331216 bytes,   53449744 total
> - age  10:     302656 bytes,   53752400 total
> - age  11:     282424 bytes,   54034824 total
> - age  12:     733032 bytes,   54767856 total
> - age  13:     221808 bytes,   54989664 total
> - age  14:     170104 bytes,   55159768 total
> - age  15:     179872 bytes,   55339640 total
>  10188.608: [G1Ergonomics (CSet Construction) start choosing CSet, 
> _pending_cards: 898670, predicted base time: 105.76 ms, remaining time: 
> 94.24 ms, target pause time: 200.00 ms]
>  10188.608: [G1Ergonomics (CSet Construction) add young regions to CSet, 
> eden: 310 regions, survivors: 3 regions, predicted young region time: 
> 2434.15 ms]
>  10188.608: [G1Ergonomics (CSet Construction) finish choosing CSet, eden: 
> 310 regions, survivors: 3 regions, old: 0 regions, predicted pause time: 
> 2539.91 ms, target pause time: 200.00 ms]
> 2017-06-02T18:47:08.146+0200: 10190.400: [SoftReference, 0 refs, 0.0010616 
> secs]2017-06-02T18:47:08.147+0200: 10190.401: [WeakReference, 24 refs, 
> 0.0005880 secs]2017-06-02T18:47:08.148+0200: 10190.402: [FinalReference, 49 
> refs, 0.0006509 secs]2017-06-02T18:47:08.149+0200: 10190.402: 
> [PhantomReference, 0 refs, 1 refs, 0.0010624 
> secs]2017-06-02T18:47:08.150+0200: 10190.404: [JNI Weak Reference, 
> 0.0000308 secs], 1.8028711 secs]
>    [Parallel Time: 1790.9 ms, GC Workers: 28]
>       [GC Worker Start (ms): Min: 10188608.9, Avg: 10188609.3, Max: 
> 10188609.6, Diff: 0.7]
>       [Ext Root Scanning (ms): Min: 0.3, Avg: 0.7, Max: 2.6, Diff: 2.3, 
> Sum: 19.7]
>       [Update RS (ms): Min: 127.3, Avg: 128.9, Max: 129.7, Diff: 2.4, Sum: 
> 3608.7]
>          [Processed Buffers: Min: 89, Avg: 127.7, Max: 180, Diff: 91, Sum: 
> 3576]
>       [Scan RS (ms): Min: 582.3, Avg: 582.9, Max: 583.1, Diff: 0.9, Sum: 
> 16322.1]
>       [Code Root Scanning (ms): Min: 0.0, Avg: 0.0, Max: 0.4, Diff: 0.4, 
> Sum: 0.5]
>       [Object Copy (ms): Min: 1073.9, Avg: 1074.5, Max: 1077.6, Diff: 3.7, 
> Sum: 30085.2]
>       [Termination (ms): Min: 0.0, Avg: 3.2, Max: 3.7, Diff: 3.7, Sum: 
> 90.0]
>          [Termination Attempts: Min: 1, Avg: 780.6, Max: 959, Diff: 958, 
> Sum: 21856]
>       [GC Worker Other (ms): Min: 0.0, Avg: 0.1, Max: 0.2, Diff: 0.1, Sum: 
> 2.6]
>       [GC Worker Total (ms): Min: 1789.9, Avg: 1790.3, Max: 1790.7, Diff: 
> 0.7, Sum: 50128.8]
>       [GC Worker End (ms): Min: 10190399.5, Avg: 10190399.6, Max: 
> 10190399.7, Diff: 0.2]
>    [Code Root Fixup: 0.1 ms]
>    [Code Root Purge: 0.0 ms]
>    [Clear CT: 0.7 ms]
>    [Other: 11.2 ms]
>       [Choose CSet: 0.0 ms]
>       [Ref Proc: 3.7 ms]
>       [Ref Enq: 0.2 ms]
>       [Redirty Cards: 4.2 ms]
>       [Humongous Register: 0.4 ms]
>       [Humongous Reclaim: 0.1 ms]
>       [Free CSet: 0.8 ms]
>    [Eden: 9920.0M(9920.0M)->0.0B(8736.0M) Survivors: 96.0M->1280.0M Heap: 
> 33.2G(196.0G)->24.9G(196.0G)]
>  [Times: user=35.11 sys=3.42, real=1.80 secs] 
>
>
> Thank you!
>

-- 
You received this message because you are subscribed to the Google Groups 
"mechanical-sympathy" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
For more options, visit https://groups.google.com/d/optout.
