[
https://issues.apache.org/jira/browse/BEAM-13073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17448252#comment-17448252
]
Valentyn Tymofieiev commented on BEAM-13073:
--------------------------------------------
Actually, running from my Linux laptop was easier (fewer credential issues) and
just worked.
This is the job running against current master (no changes):
https://console.cloud.google.com/dataflow/jobs/us-central1/2021-11-23_11_20_07-11082418496147790215?project=apache-beam-testing
Metric: Value:
dataflow_v2_java11_runtime_sec 112.861
dataflow_v2_java11_total_bytes_count 2.0E9
Job with #15739 reverted:
{noformat}
diff --git a/sdks/java/container/boot.go b/sdks/java/container/boot.go
index 429a88a4bb..70335d328a 100644
--- a/sdks/java/container/boot.go
+++ b/sdks/java/container/boot.go
@@ -154,7 +154,6 @@ func main() {
args := []string{
"-Xmx" + strconv.FormatUint(heapSizeLimit(info), 10),
- "-XX:+AlwaysActAsServerClassMachine",
"-XX:-OmitStackTraceInFastThrow",
"-cp", strings.Join(cp, ":"),
}
Metric: Value:
dataflow_v2_java11_runtime_sec 84.425
dataflow_v2_java11_total_bytes_count 2.0E9
{noformat}
Job with #16045 (fix forward)
{noformat}
:beam$ git diff HEAD^
diff --git a/sdks/java/container/boot.go b/sdks/java/container/boot.go
index 429a88a4bb..5d120fc777 100644
--- a/sdks/java/container/boot.go
+++ b/sdks/java/container/boot.go
@@ -154,6 +154,9 @@ func main() {
args := []string{
"-Xmx" + strconv.FormatUint(heapSizeLimit(info), 10),
+ // ParallelGC the most adequate for high throughput and lower CPU utilization
+ // It is the default GC in Java 8, but not on newer versions
+ "-XX:+UseParallelGC",
"-XX:+AlwaysActAsServerClassMachine",
"-XX:-OmitStackTraceInFastThrow",
"-cp", strings.Join(cp, ":"),
Metric: Value:
dataflow_v2_java11_runtime_sec 85.06
dataflow_v2_java11_total_bytes_count 2.0E9
{noformat}
> Unexpected GC when using Java 11
> --------------------------------
>
> Key: BEAM-13073
> URL: https://issues.apache.org/jira/browse/BEAM-13073
> Project: Beam
> Issue Type: Bug
> Components: sdk-java-harness
> Reporter: Luis
> Assignee: Kenneth Knowles
> Priority: P1
> Labels: java11, java9, performance
> Fix For: 2.35.0
>
> Attachments: perf_regression_java_11.png
>
> Time Spent: 50m
> Remaining Estimate: 0h
>
> Beam SDK has been supporting Java 11 for a while (I guess the support was
> introduced here https://issues.apache.org/jira/browse/BEAM-2530).
> Unfortunately, at Spotify we are still experiencing performance issues when
> using Beam SDK 2.32, Google Dataflow and Java 11.
> Thanks to [~emilyye] and [~iht], they confirmed JVM 11 is using SerialGC,
> while Java 8 uses ParallelGC. It sounds like ParallelGC is a good option for
> high throughput / low latency jobs. For Java11 we'd expect to use G1GC or
> ParallelGC.
> This SO question [1] clarifies that JVM chooses SerialGC when it treats the
> machine as a "client". It looks like the Java SDK container could benefit
> from using `-XX:+AlwaysActAsServerClassMachine`. Is that correct?
> Let me know if the ticket needs further context or adjustment. (It is my
> first time creating a ticket here).
> [1]
> [https://stackoverflow.com/questions/52474162/why-is-serialgc-chosen-over-g1gc]