[ 
https://issues.apache.org/jira/browse/BEAM-13073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17448252#comment-17448252
 ] 

Valentyn Tymofieiev edited comment on BEAM-13073 at 11/23/21, 8:27 PM:
-----------------------------------------------------------------------

Actually, running from my Linux laptop was easier (fewer credential issues), and 
it just worked.

This is the job running against current master (no changes): 
https://console.cloud.google.com/dataflow/jobs/us-central1/2021-11-23_11_20_07-11082418496147790215?project=apache-beam-testing

{noformat}
                 Metric:                    Value:
dataflow_v2_java11_runtime_sec                   112.861
dataflow_v2_java11_total_bytes_count                     2.0E9
{noformat}

Job with #15739 reverted:  
https://console.cloud.google.com/dataflow/jobs/us-central1/2021-11-23_12_06_48-5893170135615925417?project=apache-beam-testing

{noformat}
diff --git a/sdks/java/container/boot.go b/sdks/java/container/boot.go
index 429a88a4bb..70335d328a 100644
--- a/sdks/java/container/boot.go
+++ b/sdks/java/container/boot.go
@@ -154,7 +154,6 @@ func main() {
 
        args := []string{
                "-Xmx" + strconv.FormatUint(heapSizeLimit(info), 10),
-               "-XX:+AlwaysActAsServerClassMachine",
                "-XX:-OmitStackTraceInFastThrow",
                "-cp", strings.Join(cp, ":"),
        }

                 Metric:                    Value:
dataflow_v2_java11_runtime_sec                    84.425
dataflow_v2_java11_total_bytes_count                     2.0E9

{noformat}

Job with #16045 (fix forward): 
https://console.cloud.google.com/dataflow/jobs/us-central1/2021-11-23_11_36_08-1288932146243405653?project=apache-beam-testing
       

{noformat}
:beam$ git diff HEAD^
diff --git a/sdks/java/container/boot.go b/sdks/java/container/boot.go
index 429a88a4bb..5d120fc777 100644
--- a/sdks/java/container/boot.go
+++ b/sdks/java/container/boot.go
@@ -154,6 +154,9 @@ func main() {
 
        args := []string{
                "-Xmx" + strconv.FormatUint(heapSizeLimit(info), 10),
+               // ParallelGC the most adequate for high throughput and lower CPU utilization
+               // It is the default GC in Java 8, but not on newer versions
+               "-XX:+UseParallelGC",
                "-XX:+AlwaysActAsServerClassMachine",
                "-XX:-OmitStackTraceInFastThrow",
                "-cp", strings.Join(cp, ":"),

                 Metric:                    Value:
dataflow_v2_java11_runtime_sec                     85.06
dataflow_v2_java11_total_bytes_count                     2.0E9

{noformat}
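
To put the three runtimes above side by side, here is a minimal Go sketch that computes the relative change between them. The helper name `pctChange` is made up for illustration; the only inputs are the `dataflow_v2_java11_runtime_sec` values reported above. Each value comes from a single job run, so run-to-run noise is not accounted for.

```go
package main

import "fmt"

// pctChange returns the relative change from base to other, in percent.
// A negative result means other is smaller (a shorter runtime) than base.
// (Hypothetical helper, not part of Beam.)
func pctChange(base, other float64) float64 {
	return (other - base) / base * 100
}

func main() {
	// Runtimes in seconds, copied from the three job results above.
	const (
		master     = 112.861 // current master, no changes
		reverted   = 84.425  // #15739 reverted
		fixForward = 85.06   // #16045 applied
	)

	fmt.Printf("revert vs master:      %+.1f%%\n", pctChange(master, reverted))
	fmt.Printf("fix forward vs master: %+.1f%%\n", pctChange(master, fixForward))
	fmt.Printf("fix forward vs revert: %+.1f%%\n", pctChange(reverted, fixForward))
}
```

On these numbers, both the revert and the fix forward come out roughly 25% faster than master, and within about 1% of each other.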


> Unexpected GC when using Java 11
> --------------------------------
>
>                 Key: BEAM-13073
>                 URL: https://issues.apache.org/jira/browse/BEAM-13073
>             Project: Beam
>          Issue Type: Bug
>          Components: sdk-java-harness
>            Reporter: Luis
>            Assignee: Kenneth Knowles
>            Priority: P1
>              Labels: java11, java9, performance
>             Fix For: 2.35.0
>
>         Attachments: perf_regression_java_11.png
>
>          Time Spent: 50m
>  Remaining Estimate: 0h
>
> Beam SDK has supported Java 11 for a while (I guess the support was 
> introduced here https://issues.apache.org/jira/browse/BEAM-2530). 
> Unfortunately, at Spotify we are still experiencing performance issues when 
> using Beam SDK 2.32, Google Dataflow and Java 11.
> [~emilyye] and [~iht] confirmed that the Java 11 JVM is using SerialGC, 
> while Java 8 uses ParallelGC. It sounds like ParallelGC is a good option for 
> high throughput / low latency jobs. For Java 11 we'd expect to use G1GC or 
> ParallelGC.
> This SO question [1] clarifies that JVM chooses SerialGC when it treats the 
> machine as a "client". It looks like the Java SDK container could benefit 
> from using `-XX:+AlwaysActAsServerClassMachine`. Is that correct?
> Let me know if the ticket needs further context or adjustment. (It is my 
> first time creating a ticket here).
>  [1] 
> [https://stackoverflow.com/questions/52474162/why-is-serialgc-chosen-over-g1gc]



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
