[
https://issues.apache.org/jira/browse/HDDS-8731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17727742#comment-17727742
]
Tsz-wo Sze commented on HDDS-8731:
----------------------------------
Sure. Let's use the JvmPauseMonitor from Ratis. We recently improved the
message slightly in RATIS-1788:
Old message
{quote}Detected pause in JVM or host machine (eg GC): pause of approximately
1393794624ns. No GCs detected.
{quote}
New message
{quote}Detected pause in JVM or host machine approximately 1.393s without any
GCs.
{quote}
The old message with "... (eg GC) ... No GCs detected" confused some people.
They often thought that a pause must be caused by GCs.
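To make the difference concrete, here is a minimal Java sketch that builds both wordings from the same pause value. It is only an illustration, not the Ratis code: the class and method names are hypothetical, and truncating to milliseconds is an assumption inferred from 1393794624ns being shown as 1.393s above.

```java
import java.util.Locale;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration only; this is not the Ratis JvmPauseMonitor.
public class PauseMessageSketch {
  // Formats nanoseconds as seconds with three decimals.
  // Assumption: sub-millisecond digits are truncated, not rounded.
  static String toSeconds(long nanos) {
    final long millis = TimeUnit.NANOSECONDS.toMillis(nanos);
    return String.format(Locale.ROOT, "%d.%03ds", millis / 1000, millis % 1000);
  }

  // Old wording: raw nanoseconds, and "(eg GC)" next to "No GCs detected".
  static String oldMessage(long nanos) {
    return "Detected pause in JVM or host machine (eg GC): pause of approximately "
        + nanos + "ns. No GCs detected.";
  }

  // New wording: readable duration, and GC is mentioned only once.
  static String newMessage(long nanos) {
    return "Detected pause in JVM or host machine approximately "
        + toSeconds(nanos) + " without any GCs.";
  }

  public static void main(String[] args) {
    final long pauseNanos = 1_393_794_624L;
    System.out.println(oldMessage(pauseNanos));
    System.out.println(newMessage(pauseNanos));
  }
}
```

With the pause value from the log above, the second line printed ends with "approximately 1.393s without any GCs.", which no longer suggests that GC is the expected cause.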
> Standardize JVM pause monitor
> -----------------------------
>
> Key: HDDS-8731
> URL: https://issues.apache.org/jira/browse/HDDS-8731
> Project: Apache Ozone
> Issue Type: Improvement
> Reporter: Attila Doroszlai
> Priority: Major
>
> Ozone uses JvmPauseMonitor from both Hadoop and Ratis. I think we should
> standardize on the one from Ratis:
> * It is automatically started for Ratis servers.
> * It has better output:
> ** more compact (single line)
> ** includes ID information
> * Ratis has more frequent releases, so it may be easier to make improvements
> or fixes, if needed, at the source.
> {code:title=Ratis}
> scm_1 | 2023-04-22 01:58:49,875 [JvmPauseMonitor0] INFO
> util.JvmPauseMonitor: JvmPauseMonitor-58db5753-a6a0-4ffc-aac9-e0d1221cc5e9:
> Started
> ...
> scm_1 | 2023-04-22 01:59:26,776 [JvmPauseMonitor0] WARN
> util.JvmPauseMonitor: JvmPauseMonitor-58db5753-a6a0-4ffc-aac9-e0d1221cc5e9:
> Detected pause in JVM or host machine (eg GC): pause of approximately
> 1393794624ns. No GCs detected.
> {code}
> {code:title=Hadoop}
> scm_1 | 2023-04-22 01:58:57,031
> [org.apache.hadoop.util.JvmPauseMonitor$Monitor@45e931a9] INFO
> util.JvmPauseMonitor: Starting JVM pause monitor
> ...
> scm_1 | 2023-04-22 01:59:26,933
> [org.apache.hadoop.util.JvmPauseMonitor$Monitor@45e931a9] INFO
> util.JvmPauseMonitor: Detected pause in JVM or host machine (eg GC): pause of
> approximately 1393ms
> scm_1 | No GCs detected
> {code}
> Proposed changes:
> # Remove usage of Hadoop JvmPauseMonitor
> # Add explicit usage of Ratis JvmPauseMonitor
> #* for services without any Ratis server (HttpFS, Recon, S3 Gateway)
> #* for services that have optional Ratis server if Ratis is disabled (OM, SCM)
> CC [~szetszwo]