[
https://issues.apache.org/jira/browse/HDDS-11240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17869924#comment-17869924
]
Ethan Rose commented on HDDS-11240:
-----------------------------------
Thanks for the info. You can use {{ozone admin \{om,scm\} finalizationstatus}}
to check the status without looking at the version files on disk. Those files
record the metadata layout version, whereas the software layout version is
determined by the version of Ozone that is running. A cluster is pre-finalized
when the software layout version is larger than the metadata layout version.
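
To make the pre-finalized condition concrete, here is a minimal sketch of the comparison. The class and method names are hypothetical and chosen for illustration only; this is not the actual Ozone code, and it assumes layout versions can be treated as plain integers.

```java
/** Hypothetical illustration of the pre-finalized check; not the Ozone implementation. */
final class LayoutVersionSketch {
    private LayoutVersionSketch() { }

    /**
     * Returns true when the running software supports a newer layout version
     * than the one persisted on disk, i.e. the cluster is pre-finalized.
     *
     * @param metadataLayoutVersion version recorded in the on-disk version file (assumed int)
     * @param softwareLayoutVersion version supported by the running binaries (assumed int)
     */
    static boolean needsFinalization(int metadataLayoutVersion, int softwareLayoutVersion) {
        return softwareLayoutVersion > metadataLayoutVersion;
    }

    public static void main(String[] args) {
        // Disk is at layout 3, software supports layout 5: pre-finalized.
        System.out.println(needsFinalization(3, 5)); // prints true
        // Disk and software agree: already finalized.
        System.out.println(needsFinalization(5, 5)); // prints false
    }
}
```
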

I agree we should try to determine whether this is a problem with locks in
general on JDK17, and this is just the first place we noticed it, or whether it
is specific to the finalization manager. If we can reproduce the issue on JDK17
and it does not reproduce with the same Ozone version on JDK8 or JDK11, then we
may have to investigate more generally how JDK17 affects our usage of locks.
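
One possible starting point for the lock-only half of that comparison is a small standalone harness that hammers the read lock of a {{ReentrantReadWriteLock}} from several threads, run unchanged on JDK8, JDK11 and JDK17. This is only a sketch under assumptions: the class name, thread count and duration are made up, it does not involve Ozone at all, and it may well not reproduce the problem, since the reported stack spends its time in {{ThreadLocalMap.expungeStaleEntry}} and the state of each handler thread's thread-local map may matter.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.LongAdder;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical micro-harness; compare the reported throughput across JDK versions. */
final class ReadLockRepro {
    private ReadLockRepro() { }

    /** Runs read lock/unlock pairs on the given number of threads and returns the total count. */
    static long run(int threads, long durationMillis) {
        ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
        LongAdder ops = new LongAdder();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(durationMillis);

        for (int i = 0; i < threads; i++) {
            pool.execute(() -> {
                // Overflow-safe deadline comparison for System.nanoTime().
                while (deadline - System.nanoTime() > 0) {
                    lock.readLock().lock();
                    try {
                        ops.increment();
                    } finally {
                        // Same call as the ReadLock.unlock frame in the reported stack.
                        lock.readLock().unlock();
                    }
                }
            });
        }

        pool.shutdown();
        try {
            if (!pool.awaitTermination(durationMillis + 10_000, TimeUnit.MILLISECONDS)) {
                pool.shutdownNow();
                throw new IllegalStateException("workers did not finish in time");
            }
        } catch (InterruptedException e) {
            pool.shutdownNow();
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        return ops.sum();
    }

    public static void main(String[] args) {
        int threads = args.length > 0 ? Integer.parseInt(args[0]) : 16;
        long ops = run(threads, 5_000);
        System.out.println("java.version=" + System.getProperty("java.version")
            + " threads=" + threads + " readLockOps=" + ops);
    }
}
```

Attaching a profiler to each run and comparing the time spent under {{ReentrantReadWriteLock$Sync.tryReleaseShared}} would show whether the lock alone behaves differently across JDKs, or whether the Ozone code path is needed to trigger it.
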
> High cpu usage on ReadWrite locks in JDK17
> ------------------------------------------
>
> Key: HDDS-11240
> URL: https://issues.apache.org/jira/browse/HDDS-11240
> Project: Apache Ozone
> Issue Type: Bug
> Affects Versions: 1.4.0
> Environment: JDK:
> openjdk 17.0.2 2022-01-18
> OpenJDK Runtime Environment (build 17.0.2+8-86)
> OpenJDK 64-Bit Server VM (build 17.0.2+8-86, mixed mode, sharing)
> Ozone:
> 1.4.0
>
> Reporter: weiming
> Assignee: Tanvi Penumudy
> Priority: Major
> Attachments: flamegraph.profile.html,
> image-2024-07-28-20-17-58-466.png, image-2024-07-30-09-32-16-320.png
>
>
> That will cause threads on the following stack trace to consume a lot of CPU:
> "IPC Server handler 7 on default port 9862" #3994 daemon prio=5 os_prio=0 cpu=5403833.36ms elapsed=653145.54s tid=0x00007fa03fdd2a00 nid=0x921f9 runnable [0x00007fa0ca3fd000]
>    java.lang.Thread.State: RUNNABLE
>         at java.lang.ThreadLocal$ThreadLocalMap.expungeStaleEntry(java.base@17.0.2/ThreadLocal.java:632)
>         at java.lang.ThreadLocal$ThreadLocalMap.remove(java.base@17.0.2/ThreadLocal.java:516)
>         at java.lang.ThreadLocal.remove(java.base@17.0.2/ThreadLocal.java:242)
>         at java.util.concurrent.locks.ReentrantReadWriteLock$Sync.tryReleaseShared(java.base@17.0.2/ReentrantReadWriteLock.java:430)
>         at java.util.concurrent.locks.AbstractQueuedSynchronizer.releaseShared(java.base@17.0.2/AbstractQueuedSynchronizer.java:1094)
>         at java.util.concurrent.locks.ReentrantReadWriteLock$ReadLock.unlock(java.base@17.0.2/ReentrantReadWriteLock.java:897)
>         at org.apache.hadoop.ozone.upgrade.AbstractLayoutVersionManager.needsFinalization(AbstractLayoutVersionManager.java:182)
>         at org.apache.hadoop.ozone.om.request.validation.ValidationCondition$1.shouldApply(ValidationCondition.java:39)
>         at org.apache.hadoop.ozone.om.request.validation.RequestValidations.lambda$0(RequestValidations.java:110)
>         at org.apache.hadoop.ozone.om.request.validation.RequestValidations$$Lambda$839/0x00000008013cda80.test(Unknown Source)
>
> [^flamegraph.profile.html]