[
https://issues.apache.org/jira/browse/HDFS-17523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17863965#comment-17863965
]
ASF GitHub Bot commented on HDFS-17523:
---------------------------------------
slfan1989 commented on code in PR #6890:
URL: https://github.com/apache/hadoop/pull/6890#discussion_r1669463377
##########
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/common/AutoCloseDataSetLock.java:
##########
@@ -30,17 +33,39 @@
* use a try-with-resource syntax.
*/
public class AutoCloseDataSetLock extends AutoCloseableLock {
- private Lock lock;
+
+ private final Lock lock;
+
+ private final DataNodeLockManager<AutoCloseDataSetLock> dataNodeLockManager;
+
+ private final boolean isReadLock;
+
+ private static ThreadLocal<DataSetLockHeldInfo> heldInfoThreadLocal = new
Review Comment:
Sorry for the delayed response. I've been quite busy recently.
Here's the detailed situation:
In certain versions of JDK 17, there is an issue where weak references to
ThreadLocal cannot be reclaimed (the problem occurs with G1, while CMS reclaims
the memory normally). As a result, ThreadLocal objects accumulate a large
amount of data, which causes significant CPU overhead during traversal and
element retrieval. We first noticed this during latency profiling of the
Ozone OM locks, where CPU usage for our OM spiked over a period of time. That
is why I suggested that lock profiling should ideally be an optional feature.
Related OpenJDK bug affecting lower JDK 17 versions:
https://bugs.openjdk.org/browse/JDK-8188055.
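To illustrate what "optional" could look like, here is a minimal sketch, not the code in this PR. All names (OptionalLockProfiling, ProfiledLock, HeldInfo, profilingEnabled) are hypothetical, and where the flag would come from (for example a DataNode configuration key) is left open. The point is only that the ThreadLocal is never touched when profiling is disabled, so no per-thread entries are created on that path. Reentrant acquisition is ignored for brevity.

```java
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical sketch: lock-hold profiling gated behind a flag.
final class OptionalLockProfiling {

  /** Hypothetical per-thread record of the most recent lock hold. */
  static final class HeldInfo {
    long acquiredNanos;
    long lastHeldNanos = -1L;
  }

  static final class ProfiledLock implements AutoCloseable {
    // Shared by all instances; only read or written when profiling is enabled.
    private static final ThreadLocal<HeldInfo> HELD =
        ThreadLocal.withInitial(HeldInfo::new);

    private final Lock lock;
    private final boolean profilingEnabled;

    ProfiledLock(Lock lock, boolean profilingEnabled) {
      this.lock = lock;
      this.profilingEnabled = profilingEnabled;
    }

    /** Acquires the lock and, if enabled, records the acquisition time. */
    ProfiledLock acquire() {
      lock.lock();
      if (profilingEnabled) {
        HELD.get().acquiredNanos = System.nanoTime();
      }
      return this;
    }

    /** Records the hold time (if enabled) and releases the lock. */
    @Override
    public void close() {
      if (profilingEnabled) {
        HeldInfo info = HELD.get();
        info.lastHeldNanos = System.nanoTime() - info.acquiredNanos;
      }
      lock.unlock();
    }

    /** Returns the last hold time for this thread, or -1 if profiling is off. */
    long lastHeldNanos() {
      return profilingEnabled ? HELD.get().lastHeldNanos : -1L;
    }
  }

  public static void main(String[] args) {
    ProfiledLock off = new ProfiledLock(new ReentrantLock(), false);
    try (ProfiledLock ignored = off.acquire()) {
      // critical section
    }
    System.out.println("profiling off, recorded: " + (off.lastHeldNanos() >= 0));

    ProfiledLock on = new ProfiledLock(new ReentrantLock(), true);
    try (ProfiledLock ignored = on.acquire()) {
      // critical section
    }
    System.out.println("profiling on, recorded: " + (on.lastHeldNanos() >= 0));
  }
}
```

With the flag off, acquire() and close() reduce to plain lock() and unlock() plus one boolean check, so deployments on an affected JDK could leave profiling disabled.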
> Add fine-grained locks metrics in DataSetLockManager
> -----------------------------------------------------
>
> Key: HDFS-17523
> URL: https://issues.apache.org/jira/browse/HDFS-17523
> Project: Hadoop HDFS
> Issue Type: Improvement
> Reporter: lei w
> Priority: Major
> Labels: pull-request-available
>
> Currently we use fine-grained locks to manage FsDataSetImpl. But we did not
> add lock-related metrics. In some cases, we actually need lock-holding
> information to understand the time-consuming lock-holding of a certain
> operation. Using this information, we can also optimize some long-term lock
> operations as early as possible.