[ 
https://issues.apache.org/jira/browse/HDDS-10868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

GuoHao updated HDDS-10868:
--------------------------
    Description: 
When we run balancing on a relatively large cluster, SCM often suffers long 
GC pauses, which result in SCM HA leader re-election. For example:

A single datanode holds around 3 million (300w) containers. While balancing 
was running, SCM logged the following GC pauses:
{code:java}
2024-05-16 20:58:56,078 [JvmPauseMonitor0] WARN org.apache.ratis.util.JvmPauseMonitor: JvmPauseMonitor-87cacbd5-f305-495e-8c1c-0645ddc29ca7: Detected pause in JVM or host machine approximately 1.771s with 1.831s GC time.
2024-05-16 21:00:23,701 [JvmPauseMonitor0] WARN org.apache.ratis.util.JvmPauseMonitor: JvmPauseMonitor-87cacbd5-f305-495e-8c1c-0645ddc29ca7: Detected pause in JVM or host machine approximately 24.111s with 24.095s GC time.
2024-05-16 21:00:24,809 [JvmPauseMonitor0] WARN org.apache.ratis.util.JvmPauseMonitor: JvmPauseMonitor-87cacbd5-f305-495e-8c1c-0645ddc29ca7: Detected pause in JVM or host machine approximately 0.108s without any GCs.
2024-05-16 21:01:56,334 [JvmPauseMonitor0] WARN org.apache.ratis.util.JvmPauseMonitor: JvmPauseMonitor-87cacbd5-f305-495e-8c1c-0645ddc29ca7: Detected pause in JVM or host machine approximately 21.511s with 21.500s GC time.
2024-05-16 21:06:30,957 [JvmPauseMonitor0] WARN org.apache.ratis.util.JvmPauseMonitor: JvmPauseMonitor-87cacbd5-f305-495e-8c1c-0645ddc29ca7: Detected pause in JVM or host machine approximately 142.648s with 142.984s GC time.
{code}
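
To quantify pauses like the ones in the log above, the durations can be 
extracted from the log text with a small helper. This is only a sketch: the 
class name PauseLogSummary, its methods, and the 5-second threshold are 
hypothetical and are not part of Ozone or Ratis. It assumes only the 
"approximately N.NNNs" wording seen in the JvmPauseMonitor messages above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper (not part of Ozone or Ratis) for summarizing pause logs. */
public class PauseLogSummary {

    // Matches the total pause duration in messages of the form
    // "Detected pause in JVM or host machine approximately 1.771s ...".
    private static final Pattern PAUSE =
        Pattern.compile("approximately (\\d+(?:\\.\\d+)?)s");

    /** Returns every reported pause in the log text, in milliseconds. */
    static List<Long> pausesMillis(String logText) {
        List<Long> pauses = new ArrayList<>();
        Matcher m = PAUSE.matcher(logText);
        while (m.find()) {
            // Round to whole milliseconds to avoid floating-point noise.
            pauses.add(Math.round(Double.parseDouble(m.group(1)) * 1000));
        }
        return pauses;
    }

    /** Counts the pauses that are strictly longer than the given threshold. */
    static long countOver(List<Long> pauses, long thresholdMillis) {
        return pauses.stream().filter(p -> p > thresholdMillis).count();
    }

    public static void main(String[] args) {
        // Message tails copied from the log above; the prefixes are omitted.
        String log = String.join("\n",
            "Detected pause in JVM or host machine approximately 1.771s with 1.831s GC time.",
            "Detected pause in JVM or host machine approximately 24.111s with 24.095s GC time.",
            "Detected pause in JVM or host machine approximately 0.108s without any GCs.",
            "Detected pause in JVM or host machine approximately 21.511s with 21.500s GC time.",
            "Detected pause in JVM or host machine approximately 142.648s with 142.984s GC time.");

        List<Long> pauses = pausesMillis(log);
        // 5000 ms is an arbitrary illustrative threshold, not a Ratis setting.
        System.out.println("pauses over 5s: " + countOver(pauses, 5000));
        System.out.println("longest pause (ms): "
            + pauses.stream().mapToLong(Long::longValue).max().orElse(0));
    }
}
```

Because the pattern is matched against the whole text rather than line by 
line, it also works on the line-wrapped form of the log, as long as the 
"approximately N.NNNs" fragment itself is not split across lines.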

> Reduce Container Balance memory usage
> -------------------------------------
>
>                 Key: HDDS-10868
>                 URL: https://issues.apache.org/jira/browse/HDDS-10868
>             Project: Apache Ozone
>          Issue Type: Improvement
>            Reporter: GuoHao
>            Priority: Major

