[ 
https://issues.apache.org/jira/browse/HDFS-16785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17610966#comment-17610966
 ] 

ASF GitHub Bot commented on HDFS-16785:
---------------------------------------

ZanderXu commented on PR #4945:
URL: https://github.com/apache/hadoop/pull/4945#issuecomment-1262143802

   ```
   try (AutoCloseDataSetLock l = lockManager.readLock(LockLevel.VOLUME, bpid,
       fsVolume.getStorageID())) {
     fsVolume.addBlockPool(bpid, this.conf, this.timer);
     fsVolume.getVolumeMap(bpid, tempVolumeMap, ramDiskReplicaTracker);
   } catch (IOException e) {
     LOG.warn("Caught exception when adding " + fsVolume +
         ". Will throw later.", e);
     exceptions.add(e);
   }
   ```
   Changing the code as above? Emm.. holding the BP read lock for a long time 
would still have a great impact on operations that need to acquire the BP write 
lock, such as: invalidate, recoverAppend, createTemporary.
   
   The current logic uses IOException to handle the conflict case, which I 
think is OK. And there was no lock here before HDFS-15382, which means it was 
OK without one. If we can find a concrete conflict case, we can use IOException 
to fix it.
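   The contention concern above can be demonstrated with a minimal, 
self-contained sketch using plain `java.util.concurrent` (not Hadoop's 
`AutoCloseDataSetLock`; the class and variable names here are illustrative 
only). While a read lock is held, a writer's timed `tryLock` cannot succeed, 
so a long scan under the BP read lock still stalls write-lock operations:

   ```java
   import java.util.concurrent.TimeUnit;
   import java.util.concurrent.locks.ReentrantReadWriteLock;

   // Hypothetical sketch, not Hadoop code. For simplicity one thread plays
   // both roles: ReentrantReadWriteLock forbids upgrading a read lock to a
   // write lock, so the write tryLock fails while the read lock is held,
   // just as a concurrent writer would stall behind a long volume scan.
   public class ReadLockBlocksWriter {
     public static void main(String[] args) throws Exception {
       ReentrantReadWriteLock bpLock = new ReentrantReadWriteLock();

       bpLock.readLock().lock();  // simulate the long volume scan
       boolean duringScan =
           bpLock.writeLock().tryLock(100, TimeUnit.MILLISECONDS);
       System.out.println("writer acquired lock while scan runs: " + duringScan);
       bpLock.readLock().unlock();  // scan finished

       boolean afterScan =
           bpLock.writeLock().tryLock(100, TimeUnit.MILLISECONDS);
       System.out.println("writer acquired lock after scan: " + afterScan);
       if (afterScan) {
         bpLock.writeLock().unlock();
       }
     }
   }
   ```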
   




> DataNode holds BP write lock to scan disk
> -----------------------------------------
>
>                 Key: HDFS-16785
>                 URL: https://issues.apache.org/jira/browse/HDFS-16785
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>            Reporter: ZanderXu
>            Assignee: ZanderXu
>            Priority: Major
>              Labels: pull-request-available
>
> When working on the fine-grained locking of the DataNode, I found that 
> `addVolume` holds the write side of the BP lock while scanning the new volume 
> to collect its blocks. If we try to add a full volume that was previously 
> taken offline and then fixed, it will hold the write lock for a long time.
> The related code is as follows:
> {code:java}
> for (final NamespaceInfo nsInfo : nsInfos) {
>   String bpid = nsInfo.getBlockPoolID();
>   try (AutoCloseDataSetLock l = lockManager.writeLock(LockLevel.BLOCK_POOl, 
> bpid)) {
>     fsVolume.addBlockPool(bpid, this.conf, this.timer);
>     fsVolume.getVolumeMap(bpid, tempVolumeMap, ramDiskReplicaTracker);
>   } catch (IOException e) {
>     LOG.warn("Caught exception when adding " + fsVolume +
>         ". Will throw later.", e);
>     exceptions.add(e);
>   }
> } {code}
> And I noticed that this lock was added by HDFS-15382, which means this logic 
> did not run under a lock before.
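A generic sketch of the direction discussed here (hypothetical names, not 
Hadoop's actual API): do the slow disk scan without any lock into a temporary 
map, then hold the BP write lock only for the quick merge, so other write-lock 
operations are blocked only briefly.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Illustrative sketch only: scanVolume() and replicaMap stand in for
// fsVolume.getVolumeMap() and the DataNode's replica map. The expensive
// scan runs unlocked; the write lock guards only the cheap putAll().
public class ScanOutsideLock {
  static final ReentrantReadWriteLock bpLock = new ReentrantReadWriteLock();
  static final Map<String, String> replicaMap = new HashMap<>();

  // stands in for the slow volume scan; needs no lock of its own
  static Map<String, String> scanVolume() {
    Map<String, String> temp = new HashMap<>();
    temp.put("blk_1001", "/data1/current/blk_1001");
    return temp;
  }

  public static void main(String[] args) {
    Map<String, String> temp = scanVolume();  // long-running, unlocked
    bpLock.writeLock().lock();                // short critical section
    try {
      replicaMap.putAll(temp);
    } finally {
      bpLock.writeLock().unlock();
    }
    System.out.println("replicas added: " + replicaMap.size());
  }
}
```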



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
