[
https://issues.apache.org/jira/browse/HDFS-15160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17058908#comment-17058908
]
Stephen O'Donnell commented on HDFS-15160:
------------------------------------------
Thanks for the review [~arp].
By {{validateBlockFile}} I assume you mean this change:
{code}
@@ -2041,9 +2041,7 @@ private boolean isValid(final ExtendedBlock b, final ReplicaState state) {
   ReplicaInfo validateBlockFile(String bpid, long blockId) {
     //Should we check for metadata file too?
     final ReplicaInfo r;
-    try (AutoCloseableLock lock = datasetWriteLock.acquire()) {
-      r = volumeMap.get(bpid, blockId);
-    }
+    r = volumeMap.get(bpid, blockId);
     if (r != null) {
       if (r.blockDataExists()) {
         return r;
{code}
Inside volumeMap.get(), the same lock is acquired again (ReplicaMap is initialized
with the same lock instance as FsDatasetImpl), so the caller ends up acquiring the
same lock twice. I doubt this hurts performance, but it made sense to just remove
the redundant outer locking in this case. This is the code for volumeMap.get():
{code}
ReplicaInfo get(String bpid, long blockId) {
  checkBlockPool(bpid);
  try (AutoCloseableLock l = readLock.acquire()) {
    FoldedTreeSet<ReplicaInfo> set = map.get(bpid);
    if (set == null) {
      return null;
    }
    return set.get(blockId, LONG_AND_BLOCK_COMPARATOR);
  }
}
{code}
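To illustrate why the nested acquire was harmless (just redundant): HDFS's AutoCloseableLock wraps a reentrant lock, so the same thread can acquire it twice and only the hold count changes. Here is a minimal standalone sketch of that behaviour using the JDK's ReentrantLock directly (demo code, not HDFS source):

```java
import java.util.concurrent.locks.ReentrantLock;

// Demo: a reentrant lock acquired twice by the same thread does not
// deadlock; the hold count goes to 2 and back to 0 as each acquire
// is paired with a release. This mirrors the outer/inner acquire in
// validateBlockFile() and volumeMap.get() before the patch.
public class ReentrantDemo {
    public static void main(String[] args) {
        ReentrantLock lock = new ReentrantLock();
        lock.lock();                 // outer acquire (FsDatasetImpl)
        try {
            lock.lock();             // inner acquire (volumeMap.get)
            try {
                System.out.println(lock.getHoldCount()); // prints 2
            } finally {
                lock.unlock();
            }
        } finally {
            lock.unlock();
        }
        System.out.println(lock.getHoldCount()); // prints 0
    }
}
```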
Acquiring and releasing volume references outside of the lock should be OK.
Each volume reference seems to be a counter backed by an AtomicInteger, so each
reference does its own synchronization when incrementing and decrementing the
counter.
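As a rough sketch of that pattern (a hypothetical class, not the actual FsVolumeImpl reference-counting code): an AtomicInteger counter synchronizes its own increments and decrements, so callers do not need to hold any outer dataset lock while taking or releasing a reference.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical reference counter, illustrating why taking/releasing
// volume references outside the dataset lock is safe: AtomicInteger
// performs its own atomic read-modify-write, so no external locking
// is required for correctness of the count itself.
public class VolumeRefCounter {
    private final AtomicInteger refCount = new AtomicInteger(0);

    // Take a reference; returns the new count.
    public int reference() {
        return refCount.incrementAndGet();
    }

    // Release a reference; returns the new count.
    public int unreference() {
        return refCount.decrementAndGet();
    }

    public static void main(String[] args) {
        VolumeRefCounter v = new VolumeRefCounter();
        v.reference();
        v.reference();
        v.unreference();
        System.out.println(v.reference()); // prints 2
    }
}
```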
> ReplicaMap, Disk Balancer, Directory Scanner and various FsDatasetImpl
> methods should use datanode readlock
> -----------------------------------------------------------------------------------------------------------
>
> Key: HDFS-15160
> URL: https://issues.apache.org/jira/browse/HDFS-15160
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: datanode
> Affects Versions: 3.3.0
> Reporter: Stephen O'Donnell
> Assignee: Stephen O'Donnell
> Priority: Major
> Attachments: HDFS-15160.001.patch, HDFS-15160.002.patch,
> HDFS-15160.003.patch
>
>
> Now we have HDFS-15150, we can start to move some DN operations to use the
> read lock rather than the write lock to improve concurrency. The first step
> is to make the changes to ReplicaMap, as many other methods make calls to it.
> This Jira switches read operations against the volume map to use the readLock
> rather than the write lock.
> Additionally, some methods make a call to replicaMap.replicas() (eg
> getBlockReports, getFinalizedBlocks, deepCopyReplica) and only use the result
> in a read only fashion, so they can also be switched to using a readLock.
> Next is the directory scanner and disk balancer, which only require a read
> lock.
> Finally (for this Jira) are various "low hanging fruit" items in BlockSender
> and FsDatasetImpl where it is fairly obvious they only need a read lock.
> For now, I have avoided changing anything which looks too risky, as I think
> it's better to do any larger refactoring or risky changes each in their own
> Jira.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)