Hi Prasanth,

On 05.06.20 18:44, Prasanth Mathialagan wrote:
Hi,
We recently switched our Java application from CMS to G1. Since then we observed increased CPU time (user cpu) and latency for the requests.

Observations

  * Count of GC pauses remains the same with CMS and G1 and so does the
    pause time.
  * My initial suspicion was that the application threads were competing
    with GC threads to get CPU cycles. But I don't see any indication of
    increased concurrent time in GC logs.

I suspect that the overhead associated with read/write barriers could be the reason for the increased CPU cycles, but I want to confirm that. *Are there any GC flags that print statistics about read/write barriers? Or is there a way to debug this?*


The significantly larger write barriers (there are almost no read barriers in G1) can have an effect as you describe, although I would not expect a direct impact on latency.

There is no statistics gathering option for the actual impact of write barriers, as the barriers are too small to measure by themselves without huge overhead. Tracing throughput deficiencies back to barriers is mostly done by eliminating all other causes.
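If it helps with that elimination approach: in JDK8 you could enable the diagnostic remembered set summary (a suggestion from my side, not something from your flag list). Much of the extra work caused by the G1 write barrier shows up as card refinement and remembered set update work, so this is at least an indirect proxy:

-XX:+UnlockDiagnosticVMOptions -XX:+G1SummarizeRSetStats -XX:G1SummarizeRSetStatsPeriod=1

That prints remembered set and refinement statistics at every GC. It does not measure the barrier code itself, but unusually large update/refinement numbers are a hint that barrier-related work matters for your workload.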

java -version

openjdk version "1.8.0_222"

In later JDKs the range of applications where G1 improves upon CMS broadens. There will also always be some applications where CMS is very hard to beat in terms of your desired throughput/latency, particularly ones where the application and options were previously tuned for CMS.


OpenJDK Runtime Environment Corretto-8.222.10.1 (build 1.8.0_222-b10)

OpenJDK 64-Bit Server VM Corretto-8.222.10.1 (build 25.222-b10, mixed mode)


These are the command line flags that the application uses, as found in the GC logs.

Thanks. Some thoughts on the options: I am not sure whether you spent time tuning them for G1, but if not, it might be useful to reconsider some of the GC specific ones.

-XX:+UseG1GC
-XX:CICompilerCount=3
-XX:CompressedClassSpaceSize=931135488

-XX:ConcGCThreads=1

Not sure it makes a lot of sense to slow down concurrent operation like that, but it might help eke out the last bit of throughput. Note that in JDK8 the scalability of marking in G1 isn't that great, but that typically only has an impact once you are in the tens of threads.
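One way to check whether a single concurrent thread keeps up is the duration of the marking cycles in the log, i.e. lines roughly of the form

[GC concurrent-mark-end, 1.2345678 secs]

(the number is illustrative). If marking finishes well before the next cycle needs to start, one thread is fine; if it barely finishes, more ConcGCThreads would help.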

-XX:G1HeapRegionSize=4194304

That should be selected automatically given the initial/max heap size.

-XX:InitialCodeCacheSize=402653184
-XX:InitialHeapSize=8589934592
-XX:InitialTenuringThreshold=6
-XX:InitiatingHeapOccupancyPercent=50

If you increase the number of concurrent GC threads, you might be able to raise this one to decrease the frequency of (old gen) collections. 50% seems pretty low on an 8g heap. (That also applies to CMS, I think.)
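Purely as an illustration (the actual values need to be verified against your logs and live set size), the combination could look like

-XX:ConcGCThreads=2 -XX:InitiatingHeapOccupancyPercent=70

i.e. give marking a bit more CPU and start it later, so old gen collections become less frequent.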

-XX:MarkStackSize=4194304

Curious about the reason for that? AFAIK even in JDK8, while G1 reserves a lot of memory for the mark stack, it will not be backed by the OS unless actually used; I think other collectors behave the same.

-XX:MaxGCPauseMillis=200

Default.

-XX:MaxHeapSize=8589934592
-XX:MaxMetaspaceSize=939524096
-XX:MaxNewSize=5150605312
-XX:MaxTenuringThreshold=6

Not sure whether potentially pushing objects into the old gen prematurely like that is a good idea, but I assume you tested it.

-XX:MetaspaceSize=268435456
-XX:MinHeapDeltaBytes=4194304
-XX:+ParallelRefProcEnabled
[...lots of Print options...]
-XX:ReservedCodeCacheSize=402653184
-XX:+ScavengeBeforeFullGC

That last one never had any effect with G1, AFAIR.

-XX:SoftRefLRUPolicyMSPerMB=2048
-XX:StackShadowPages=20
-XX:ThreadStackSize=512
-XX:+TieredCompilation
-XX:+UseBiasedLocking
-XX:+UseCompressedClassPointers
-XX:+UseCompressedOops
-XX:+UseFastAccessorMethods
-XX:+UseLargePages
-XX:+UseTLAB

Given that you set initial and max heap size to the same value and use large pages, I recommend adding -XX:+AlwaysPreTouch.
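I.e. something along the lines of (just a sketch, using the sizes from your flags)

-Xms8g -Xmx8g -XX:+UseLargePages -XX:+AlwaysPreTouch

so the whole heap is touched (and, with large pages, actually backed) at startup instead of causing page faults during the measurement window.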


Let me know if I need to provide any other information.

Sorry for not being a great help.

Thanks,
  Thomas
