[ 
https://issues.apache.org/jira/browse/HDFS-10682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15402478#comment-15402478
 ] 

Arpit Agarwal commented on HDFS-10682:
--------------------------------------

Hi [~vagarychen], thanks for taking this up. I recommend splitting the work 
into two parts:
# Refactor the code to synchronize on a new reentrant lock instead of the 
FsDatasetImpl object (create a separate Jira for this). The advantage of 
wrapping the lock in its own object is that callers won't need to add 
boilerplate code for instrumentation. We can also use try-with-resources 
instead of having to release the lock manually.
# In the second patch, add instrumentation in just the acquire/close 
methods and expose it as a metric.
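
A rough sketch of what such a wrapper could look like, purely as an 
illustration of the two steps above. The class name, method names, and the 
choice of "maximum held time" as the recorded value are all assumptions made 
for this example, not taken from any patch on this issue:

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.locks.ReentrantLock;

/**
 * Hypothetical lock wrapper (illustrative names only). Intended use:
 *
 *   try (InstrumentedAutoCloseableLock l = datasetLock.acquire()) {
 *     // critical section; the lock is released when the block exits
 *   }
 */
final class InstrumentedAutoCloseableLock implements AutoCloseable {
  private final ReentrantLock lock = new ReentrantLock();
  // Longest observed hold time; a real patch would feed a metrics system.
  private final AtomicLong maxHeldNanos = new AtomicLong();
  // Written and read only while the lock is held.
  private long acquiredAtNanos;

  /** Blocks until the lock is held, then returns this for try-with-resources. */
  InstrumentedAutoCloseableLock acquire() {
    lock.lock();
    // Start timing only on the outermost acquisition by this thread.
    if (lock.getHoldCount() == 1) {
      acquiredAtNanos = System.nanoTime();
    }
    return this;
  }

  /** Releases one hold; records the held time when the last hold is released. */
  @Override
  public void close() {
    if (lock.getHoldCount() == 1) {
      long heldNanos = System.nanoTime() - acquiredAtNanos;
      maxHeldNanos.accumulateAndGet(heldNanos, Math::max);
    }
    lock.unlock();
  }

  long getMaxHeldNanos() {
    return maxHeldNanos.get();
  }

  boolean isHeldByCurrentThread() {
    return lock.isHeldByCurrentThread();
  }
}
```

With this shape, the timing lives only in acquire() and close(), so callers 
carry no instrumentation code, and reentrant acquisitions by the same thread 
are timed once, from the outermost acquire to the matching close.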

> Add metric to measure lock held time in FSDataSetImpl
> -----------------------------------------------------
>
>                 Key: HDFS-10682
>                 URL: https://issues.apache.org/jira/browse/HDFS-10682
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: datanode
>            Reporter: Chen Liang
>            Assignee: Chen Liang
>         Attachments: HDFS-10682.001.patch, HDFS-10682.002.patch, 
> HDFS-10682.003.patch, HDFS-10682.004.patch, HDFS-10682.005.patch
>
>
> Add a metric to measure how long the FSDataSetImpl lock is held by a 
> thread. The goal is to let users identify operations that hold the dataset 
> lock for a long time ("long" in some sense) and to understand, reason about, 
> and track those operations based on logs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

