[
https://issues.apache.org/jira/browse/HDFS-9754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15131492#comment-15131492
]
Jing Zhao commented on HDFS-9754:
---------------------------------
Looking at the current code, we have the following:
# When a block is added to the blocksMap, it is always associated with a file.
# A reported replica is invalidated if there is no BlockInfo record for it in
the blocksMap.
# A file is removed from the inodeMap only when we do
concat/delete/deleteSnapshot/rename/create. During these operations we always
use a map to collect the blocks that should be deleted.
# Removing inodes from the INodeMap happens under the write lock, while
removing blocks from the blocksMap does not. Thus, when scheduling recovery
work, we need an extra check to see whether the block has already been
abandoned.
Based on the above observations, it looks like when we collect to-be-deleted
blocks we can set each block's BlockCollection id ({{bcId}}) to
{{INodeId.INVALID_INODE_ID}}. Later we can use {{bcId}} to check whether a
block has already been abandoned. In this way we can avoid some
{{getBlockCollection}} calls from BlockManager into FSNamesystem/FSDirectory.
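A minimal sketch of the idea ({{BlockInfoSketch}}, its {{delete}}/{{isDeleted}} methods, and the sentinel constant below are illustrative stand-ins, not the actual BlockInfo/BlockManager code):
{code:java}
/**
 * Minimal, self-contained sketch of the proposed bcId check.
 * BlockInfoSketch stands in for the real BlockInfo, and
 * INVALID_INODE_ID mirrors INodeId.INVALID_INODE_ID; the actual
 * HDFS code will differ.
 */
public class BlockInfoSketch {
  /** Stand-in for INodeId.INVALID_INODE_ID. */
  public static final long INVALID_INODE_ID = -1L;

  /** Id of the owning BlockCollection, i.e. the file. */
  private long bcId;

  public BlockInfoSketch(long bcId) {
    this.bcId = bcId;
  }

  /**
   * Called under the write lock while collecting to-be-deleted blocks:
   * mark the block as no longer owned by any file.
   */
  public void delete() {
    this.bcId = INVALID_INODE_ID;
  }

  /**
   * Cheap local check that BlockManager can use (e.g. when scheduling
   * recovery work) instead of calling getBlockCollection.
   */
  public boolean isDeleted() {
    return bcId == INVALID_INODE_ID;
  }

  public static void main(String[] args) {
    BlockInfoSketch block = new BlockInfoSketch(1001L);
    block.delete(); // happens during file deletion / block collection
    if (block.isDeleted()) {
      System.out.println("block already abandoned; skip recovery work");
    }
  }
}
{code}
The point is that the abandonment check becomes local to the block object, so for case #1 BlockManager no longer has to reach into FSNamesystem/FSDirectory.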
> Avoid unnecessary getBlockCollection calls in BlockManager
> ----------------------------------------------------------
>
> Key: HDFS-9754
> URL: https://issues.apache.org/jira/browse/HDFS-9754
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Components: namenode
> Reporter: Jing Zhao
> Assignee: Jing Zhao
>
> Currently BlockManager calls {{Namesystem#getBlockCollection}} in order to:
> 1. check if the block has already been abandoned
> 2. identify the storage policy of the block
> 3. meta save
> For #1 we can use BlockInfo's internal state instead of checking if the
> corresponding file still exists.