[
https://issues.apache.org/jira/browse/HDFS-10682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15402478#comment-15402478
]
Arpit Agarwal commented on HDFS-10682:
--------------------------------------
Hi [~vagarychen], thanks for taking this up. I recommend splitting the work
into two parts:
# Refactor the code to synchronize on a new ReentrantLock instead of the
FsDatasetImpl object (create a separate Jira for this). The advantage of a
wrapper object for the lock is that callers won't need to add boilerplate code
for instrumentation. Also, we can use try-with-resources instead of having to
release the lock manually.
# In the second patch we can add instrumentation in just the acquire/close
methods and expose it as a metric.
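
To illustrate the two steps above, here is a minimal sketch of what such a wrapper could look like. The class name TimedAutoCloseableLock and its accessor methods are hypothetical, chosen only for illustration; they are not taken from any attached patch, and the counters stand in for whatever metrics plumbing the real change would use. The sketch assumes hold time should be measured from the outermost acquire to the matching close, so reentrant acquisitions by the same thread are not counted twice.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.locks.ReentrantLock;

/**
 * Hypothetical wrapper around a ReentrantLock (illustration only).
 * It is AutoCloseable so callers can use try-with-resources, and all
 * timing lives in acquire()/close() so call sites need no boilerplate.
 */
class TimedAutoCloseableLock implements AutoCloseable {
  private final ReentrantLock lock = new ReentrantLock();
  // Simple counters standing in for a real metrics sink (assumption).
  private final AtomicLong totalHeldNanos = new AtomicLong();
  private final AtomicLong releaseCount = new AtomicLong();
  // Only read or written while the lock is held.
  private long acquiredAtNanos;

  /** Blocks until the lock is held, then returns this for try-with-resources. */
  TimedAutoCloseableLock acquire() {
    lock.lock();
    if (lock.getHoldCount() == 1) {
      // Outermost acquisition by this thread: start the clock.
      acquiredAtNanos = System.nanoTime();
    }
    return this;
  }

  /** Records the hold time on the outermost release, then unlocks. */
  @Override
  public void close() {
    if (lock.getHoldCount() == 1) {
      totalHeldNanos.addAndGet(System.nanoTime() - acquiredAtNanos);
      releaseCount.incrementAndGet();
    }
    // Throws IllegalMonitorStateException if the caller does not hold the lock.
    lock.unlock();
  }

  long getTotalHeldNanos() {
    return totalHeldNanos.get();
  }

  long getReleaseCount() {
    return releaseCount.get();
  }

  boolean isHeldByCurrentThread() {
    return lock.isHeldByCurrentThread();
  }

  public static void main(String[] args) {
    TimedAutoCloseableLock datasetLock = new TimedAutoCloseableLock();
    // Replaces a synchronized block; the lock is released automatically.
    try (TimedAutoCloseableLock ignored = datasetLock.acquire()) {
      System.out.println("held: " + datasetLock.isHeldByCurrentThread());
    }
    System.out.println("releases: " + datasetLock.getReleaseCount()
        + ", total held ns: " + datasetLock.getTotalHeldNanos());
  }
}
```

With this shape, the first patch would only swap synchronized blocks for try-with-resources on the wrapper, and the second patch would only touch acquire() and close() to publish the recorded time as a metric.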
> Add metric to measure lock held time in FSDataSetImpl
> -----------------------------------------------------
>
> Key: HDFS-10682
> URL: https://issues.apache.org/jira/browse/HDFS-10682
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: datanode
> Reporter: Chen Liang
> Assignee: Chen Liang
> Attachments: HDFS-10682.001.patch, HDFS-10682.002.patch,
> HDFS-10682.003.patch, HDFS-10682.004.patch, HDFS-10682.005.patch
>
>
> Add a metric to measure how long the FSDataSetImpl lock is held by a
> thread. The goal is to expose this so users can identify operations that
> lock the dataset for a long time ("long" in some sense) and can
> understand, reason about, and track those operations based on logs.