[ 
https://issues.apache.org/jira/browse/HDFS-10682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15403061#comment-15403061
 ] 

Arpit Agarwal commented on HDFS-10682:
--------------------------------------

Thanks for the updated patch [~vagarychen]! This is looking good. A couple of 
comments:
# We also need to fix other locations that are synchronizing on the 
FSDatasetImpl object e.g. 
[FsVolumeImpl|https://github.com/apache/hadoop/blob/branch-2/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsVolumeImpl.java#L307],
 
[DirectoryScanner|https://github.com/apache/hadoop/blob/branch-2/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DirectoryScanner.java#L586].
# Let's move the instrumentation changes to a separate Jira. We can repurpose 
this just for splitting out the lock. Comments on the instrumentation changes:
## We don't need a ThreadLocal or a threadID -> timestamp map. We are measuring 
the lock held time, so we can save one timestamp just after acquiring the lock 
and another just before releasing it, compute the difference while the lock is 
still held, and log after releasing the lock. We may need a thread-local 
approach later if we move to a read-write lock, in which case there can be 
multiple concurrent lock holders.
## You don't need the {{if (start == 0 || start2 == 0)}} checks. These values 
can be assumed to be correct now that they are initialized in the lock class.
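
A minimal sketch of the timestamp approach from the instrumentation comments, 
assuming an exclusive {{ReentrantLock}}. The class name {{TimedLock}}, the 
threshold parameter, the {{getLastHeldNanos}} accessor and the stderr logging 
are hypothetical and not taken from the patch:

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical wrapper for illustration; not the class from the patch. */
class TimedLock {
  private final ReentrantLock lock = new ReentrantLock();
  private final long warnThresholdMs;

  // Written only by the thread that currently holds the lock, so no
  // ThreadLocal or per-thread map is needed for an exclusive lock.
  private long acquiredAtNanos;

  // Last measured hold time, kept only so callers can inspect it.
  private volatile long lastHeldNanos = -1L;

  TimedLock(long warnThresholdMs) {
    this.warnThresholdMs = warnThresholdMs;
  }

  void lock() {
    lock.lock();
    // Record the timestamp just after getting the lock. Only the outermost
    // acquisition counts, since ReentrantLock can be re-entered.
    if (lock.getHoldCount() == 1) {
      acquiredAtNanos = System.nanoTime();
    }
  }

  void unlock() {
    // Take the second timestamp and compute the diff while still holding
    // the lock, so another thread cannot overwrite acquiredAtNanos first.
    boolean outermost = lock.getHoldCount() == 1;
    long heldNanos = System.nanoTime() - acquiredAtNanos;
    lock.unlock();

    // Log only after the lock has been released, to keep logging cost
    // out of the critical section.
    if (outermost) {
      lastHeldNanos = heldNanos;
      long heldMs = TimeUnit.NANOSECONDS.toMillis(heldNanos);
      if (heldMs >= warnThresholdMs) {
        System.err.println("Lock held for " + heldMs + " ms by "
            + Thread.currentThread().getName());
      }
    }
  }

  long getLastHeldNanos() {
    return lastHeldNanos;
  }
}
```

Callers would pair {{lock()}} and {{unlock()}} in a try/finally block. With a 
read-write lock, several readers can hold the lock at once, so a single 
{{acquiredAtNanos}} field would no longer be enough.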


> Replace FsDatasetImpl object lock with a separate lock object
> -------------------------------------------------------------
>
>                 Key: HDFS-10682
>                 URL: https://issues.apache.org/jira/browse/HDFS-10682
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: datanode
>            Reporter: Chen Liang
>            Assignee: Chen Liang
>         Attachments: HDFS-10682.001.patch, HDFS-10682.002.patch, 
> HDFS-10682.003.patch, HDFS-10682.004.patch, HDFS-10682.005.patch, 
> HDFS-10682.006.patch
>
>
> Add a metric to measure the time the lock of FSDataSetImpl is held by a 
> thread. The goal is to expose this so that users can identify operations that 
> lock the dataset for a long time ("long" in some sense) and can 
> understand/reason about/track the operation based on logs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
