[ https://issues.apache.org/jira/browse/HDFS-9754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15131492#comment-15131492 ]

Jing Zhao commented on HDFS-9754:
---------------------------------

Looking into the current code, we have:
# When added into the blocksMap, a block is always associated with a file.
# A reported replica will be invalidated if there is no BlockInfo record for it in the blocksMap.
# A file is removed from the inodeMap only when we do concat/delete/deleteSnapshot/rename/create. During these ops we always use a map to collect the blocks that should be deleted.
# Removing inodes from the INodeMap happens under the write lock, while removing blocks from the blocksMap does not. Thus, when scheduling recovery work, we need to do an extra check to see if the block has already been abandoned (see the sketch after this list).
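
For illustration, a rough sketch of today's pattern (the names follow the real classes, but the signatures here are simplified, so this is not the actual code):

{code:java}
// Rough sketch of the current abandonment check (simplified signatures).
class CurrentCheckSketch {
  interface BlockCollection { }

  interface Namesystem {
    // Resolves a block to its owning file, or null if the file is gone.
    BlockCollection getBlockCollection(long blockId);
  }

  // The extra check is needed because blocks are removed from blocksMap
  // outside the write lock, so the file may already have been deleted by
  // the time recovery work is scheduled. This is the getBlockCollection
  // call from BlockManager into FSNamesystem/FSDirectory we want to avoid.
  static boolean isAbandoned(long blockId, Namesystem ns) {
    return ns.getBlockCollection(blockId) == null;
  }
}
{code}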

Based on the above observations, it looks like when we collect to-be-deleted blocks, we can set each block's BlockCollection id ({{bcId}}) to {{INodeId.INVALID_INODE_ID}}. Later we can check {{bcId}} to see whether the block has already been abandoned. In this way we can avoid some of the {{getBlockCollection}} calls from BlockManager into FSNamesystem/FSDirectory.
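
A minimal sketch of the idea (simplified, not an actual patch; the local {{INVALID_INODE_ID}} constant just stands in for {{INodeId.INVALID_INODE_ID}}):

{code:java}
// Sketch: keep the owning file's inode id in BlockInfo and overwrite it
// when the block is collected for deletion.
class BlockInfo {
  // Stand-in for INodeId.INVALID_INODE_ID.
  static final long INVALID_INODE_ID = -1;

  private long bcId; // inode id of the owning BlockCollection (the file)

  BlockInfo(long bcId) {
    this.bcId = bcId; // set when the block is added to the blocksMap
  }

  // Called while collecting to-be-deleted blocks, under the write lock.
  void delete() {
    bcId = INVALID_INODE_ID;
  }

  // BlockManager can now check abandonment locally, without calling
  // getBlockCollection on FSNamesystem/FSDirectory.
  boolean isDeleted() {
    return bcId == INVALID_INODE_ID;
  }
}
{code}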


> Avoid unnecessary getBlockCollection calls in BlockManager
> ----------------------------------------------------------
>
>                 Key: HDFS-9754
>                 URL: https://issues.apache.org/jira/browse/HDFS-9754
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: namenode
>            Reporter: Jing Zhao
>            Assignee: Jing Zhao
>
> Currently BlockManager calls {{Namesystem#getBlockCollection}} in order to:
> 1. check if the block has already been abandoned
> 2. identify the storage policy of the block
> 3. meta save
> For #1 we can use BlockInfo's internal state instead of checking if the 
> corresponding file still exists.



