slfan1989 commented on code in PR #6890:
URL: https://github.com/apache/hadoop/pull/6890#discussion_r1669463377
##########
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/common/AutoCloseDataSetLock.java:
##########
@@ -30,17 +33,39 @@
* use a try-with-resource syntax.
*/
public class AutoCloseDataSetLock extends AutoCloseableLock {
- private Lock lock;
+
+ private final Lock lock;
+
+ private final DataNodeLockManager<AutoCloseDataSetLock> dataNodeLockManager;
+
+ private final boolean isReadLock;
+
+ private static ThreadLocal<DataSetLockHeldInfo> heldInfoThreadLocal = new
Review Comment:
Sorry for the delayed response. I've been quite busy recently.
Here's the detailed situation:
In certain versions of JDK 17, there is an issue where weak references to
ThreadLocal objects cannot be reclaimed (the problem occurs with G1, whereas CMS
reclaims the memory normally). As a result, the ThreadLocal objects accumulate a
large amount of data, which causes significant CPU overhead during traversal and
element retrieval. We first noticed this issue during latency profiling of the
Ozone OM locks, where CPU usage on our OM spiked for a period of time. That is
why I suggested that lock profiling should ideally be an optional feature.
The bug affecting earlier OpenJDK 17 releases: https://bugs.openjdk.org/browse/JDK-8188055
We need to make sure that the JDK 17 version we choose includes the fix for it.
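
To illustrate the "optional feature" suggestion, here is a minimal sketch, not
the code in this PR. The class name OptionalProfilingLock, the profilingEnabled
flag, and getLastHeldNanos() are hypothetical names chosen for the example. The
idea is that the ThreadLocal is only touched when profiling is switched on, and
its entry is removed explicitly on release rather than left for the GC:

```java
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

/**
 * Hypothetical sketch: a try-with-resources lock whose hold-time
 * profiling is opt-in. Not the AutoCloseDataSetLock implementation.
 */
final class OptionalProfilingLock implements AutoCloseable {

  // Per-thread acquire timestamp. Only used when profiling is enabled.
  // Assumption: acquisitions are not nested on the same thread; a nested
  // acquire would overwrite the outer timestamp.
  private static final ThreadLocal<Long> ACQUIRE_NANOS = new ThreadLocal<>();

  private final Lock lock;
  private final boolean profilingEnabled;

  // Duration of the most recent hold, or -1 if nothing was recorded.
  private volatile long lastHeldNanos = -1L;

  OptionalProfilingLock(Lock lock, boolean profilingEnabled) {
    this.lock = lock;
    this.profilingEnabled = profilingEnabled;
  }

  /** Acquires the lock; records the acquire time only if profiling is on. */
  OptionalProfilingLock acquire() {
    lock.lock();
    if (profilingEnabled) {
      ACQUIRE_NANOS.set(System.nanoTime());
    }
    return this;
  }

  /** Releases the lock; when profiling is on, also clears the thread-local entry. */
  @Override
  public void close() {
    try {
      if (profilingEnabled) {
        Long start = ACQUIRE_NANOS.get();
        if (start != null) {
          lastHeldNanos = System.nanoTime() - start;
        }
        // Drop this thread's entry explicitly instead of relying on
        // weak-reference cleanup of stale entries.
        ACQUIRE_NANOS.remove();
      }
    } finally {
      lock.unlock();
    }
  }

  long getLastHeldNanos() {
    return lastHeldNanos;
  }
}
```

With profilingEnabled set to false, the lock path never reads or writes the
ThreadLocal, so the overhead described above would not apply to it. In a real
patch the flag would presumably come from configuration.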
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]