rnblough opened a new pull request, #9967: URL: https://github.com/apache/ozone/pull/9967
## What changes were proposed in this pull request?

I propose that -XX:NewRatio be specified by default in the Java options of all Ozone roles whenever -XX:+UseConcMarkSweepGC is set. This solves the long-standing problem of ConcurrentMarkSweep GC always ending up with a tiny Young Generation heap size.

The consequence of the tiny Young Generation is ParNew thrashing and premature object promotion, which pollutes the Old Gen and eventually drives unnecessary full GCs. That part of the problem is straightforward to diagnose with GC logs and heap dumps, and as cluster sizes grew it has been pretty common in Hadoop deployments generally to address it using -XX:NewSize and -XX:MaxNewSize, or -Xmn. The insight here is that there is a consistent underlying driver in the JDK ergonomics, and that it can be trivially compensated for. This primarily impacts larger deployments, particularly where lists of millions of objects such as keys or containerIDs become routine, even through internal reporting mechanisms.

This behavior was introduced deliberately in the JDK ergonomics. The earliest complaints about it that I encountered are from JDK 6: https://bugs.openjdk.org/browse/JDK-6872335

But it looks like it was actually introduced before that, based on this doc describing GC tuning changes for J2SE 5.0: https://docs.oracle.com/javase/1.5.0/docs/guide/vm/gc-ergonomics.html

The choice of -XX:NewRatio=3 instead of the default value of 2 comes down to two things. First, Ozone does not require a young generation that is 1/3 of the total heap (among other things, most Ozone deployments have worked fine even with the artificially tiny value). Second, NewRatio adjusts automatically in tandem with heap size changes, whereas something like -Xmn would need to be recalculated every time the heap is resized, or be left static and require manual adjustment later.

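
As a rough illustration of the sizing arithmetic behind that choice (a sketch only, not part of the patch): NewRatio is the ratio of old to young generation, so the young generation is approximately heap / (NewRatio + 1). The helper name `young_gen_bytes` is hypothetical, and the real JVM applies alignment and other adjustments that this ignores.

```python
GIB = 1024 ** 3


def young_gen_bytes(heap_bytes: int, new_ratio: int) -> int:
    """Approximate young generation size implied by -XX:NewRatio.

    NewRatio is old:young, so young is heap / (NewRatio + 1).
    Assumption: JVM alignment and rounding are ignored here.
    """
    if new_ratio < 1:
        raise ValueError("new_ratio must be >= 1")
    return heap_bytes // (new_ratio + 1)


if __name__ == "__main__":
    # Compare the default ratio (2) with the proposed ratio (3)
    # across a few illustrative heap sizes.
    for heap_gib in (8, 32, 100, 200):
        heap = heap_gib * GIB
        default_young = young_gen_bytes(heap, 2) / GIB
        proposed_young = young_gen_bytes(heap, 3) / GIB
        print(
            f"-Xmx{heap_gib}g: NewRatio=2 -> ~{default_young:.1f} GiB young, "
            f"NewRatio=3 -> ~{proposed_young:.1f} GiB young"
        )
```

Because the result is derived from the heap size, changing -Xmx changes the young generation with it, which is the property that a fixed -Xmn value lacks.
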
Impacts to running clusters: I have observed one occasion where configuring a larger Young Generation heap size did result in ParNew collections taking substantially longer, on an SCM with -Xmx200g. This was noticeable in the logs and detectable in some client interactions, but there were no further impacts. In every prod cluster I have seen where this change has been implemented, from ~100g on down, no negative impacts were observed at all.

## What is the link to the Apache JIRA

https://issues.apache.org/jira/browse/HDDS-14106

## How was this patch tested?

Manual testing, successful deployment in production clusters, build-branch on fork. Two tests failed, but they are not germane to the change and appear to be due to a config issue in the integration (container) setup:

org.apache.hadoop.ozone.container.diskbalancer.TestDefaultContainerChoosingPolicy
org.apache.hadoop.ozone.container.diskbalancer.TestDefaultVolumeChoosingPolicy
