[ https://issues.apache.org/jira/browse/HDFS-512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12739367#action_12739367 ]
dhruba borthakur commented on HDFS-512:
---------------------------------------
There are some advantages to using the generation stamp as part of the unique
identifier for a Block object. It ensures that all code treats blocks with
different generation stamps as different blocks that can have different
contents. This might not be a big deal for NN data structures, especially
because the NN first checks whether a block belongs to a file before inserting
it into the BlocksMap. But for external tools that use a block interface
(e.g. Balancer, fsck), it might be helpful to understand that blocks with
different generation stamps are different blocks. (Do these utilities use the
Block object at all?)
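To make the point concrete, here is a minimal sketch of a key that includes
the generation stamp. GenStampKey and GenStampKeyDemo are hypothetical names
used for illustration only; this is not the actual HDFS Block class.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical stand-in for a block key; NOT the real HDFS Block class.
final class GenStampKey {
    final long blockId;
    final long generationStamp;

    GenStampKey(long blockId, long generationStamp) {
        this.blockId = blockId;
        this.generationStamp = generationStamp;
    }

    // Two keys are equal only if both the id and the generation stamp match.
    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof GenStampKey)) return false;
        GenStampKey other = (GenStampKey) o;
        return blockId == other.blockId
            && generationStamp == other.generationStamp;
    }

    @Override
    public int hashCode() {
        return Objects.hash(blockId, generationStamp);
    }
}

final class GenStampKeyDemo {
    // Builds a map holding two generations of the same (made-up) block id.
    static Map<GenStampKey, String> sampleMap() {
        Map<GenStampKey, String> m = new HashMap<>();
        m.put(new GenStampKey(42L, 1L), "generation 1");
        m.put(new GenStampKey(42L, 2L), "generation 2");
        return m;
    }

    public static void main(String[] args) {
        Map<GenStampKey, String> m = sampleMap();
        System.out.println(m.size()); // prints 2: the generations stay distinct
        System.out.println(m.get(new GenStampKey(42L, 2L))); // prints generation 2
    }
}
```

With this kind of key, any tool that compares or hashes blocks treats the two
generations as separate objects without needing extra checks.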
@Raghu: > This is probably a good time to add Block to ReplicaInfo.
If we follow Raghu's suggestion, then can we continue using the genstamp as
part of the Block key?
There are other cases (especially during block report processing) where we
would have to do wild-card lookups for a block. But the cost of these extra
lookup calls might be minimal because they will occur only in the error code
path.
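A rough sketch of what such a wild-card lookup could look like, assuming a map
keyed by block id plus generation stamp. WildcardLookupSketch and its methods
are hypothetical and do not reflect the actual BlocksMap code; the linear scan
is only one possible way to implement the fallback.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;
import java.util.Optional;

// Hypothetical illustration; NOT the real BlocksMap implementation.
final class WildcardLookupSketch {
    static final class Key {
        final long blockId;
        final long genStamp;

        Key(long blockId, long genStamp) {
            this.blockId = blockId;
            this.genStamp = genStamp;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof Key)) return false;
            Key k = (Key) o;
            return blockId == k.blockId && genStamp == k.genStamp;
        }

        @Override
        public int hashCode() {
            return Objects.hash(blockId, genStamp);
        }
    }

    private final Map<Key, String> blocks = new HashMap<>();

    void put(long blockId, long genStamp, String info) {
        blocks.put(new Key(blockId, genStamp), info);
    }

    // Common path: one exact hash lookup.
    // Fallback (error path only): scan for any generation of the same id.
    Optional<String> find(long blockId, long reportedGenStamp) {
        String exact = blocks.get(new Key(blockId, reportedGenStamp));
        if (exact != null) {
            return Optional.of(exact);
        }
        for (Map.Entry<Key, String> e : blocks.entrySet()) {
            if (e.getKey().blockId == blockId) {
                return Optional.of(e.getValue());
            }
        }
        return Optional.empty();
    }
}
```

The scan is expensive, but it runs only when the reported generation stamp
does not match the stored one, which is the uncommon case described above.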
> Set block id as the key to Block
> --------------------------------
>
> Key: HDFS-512
> URL: https://issues.apache.org/jira/browse/HDFS-512
> Project: Hadoop HDFS
> Issue Type: Improvement
> Affects Versions: Append Branch
> Reporter: Hairong Kuang
> Assignee: Hairong Kuang
> Fix For: Append Branch
>
> Attachments: blockKey.patch
>
>
> Currently the key to Block is block id + generation stamp. I would propose to
> change it to be only block id. This is based on the following properties of
> the dfs cluster:
> 1. On each datanode only one replica of a block exists. Therefore there is
> only one generation of a block.
> 2. NameNode has only one entry for a block in its blocks map.
> With this change, search for a block/replica's meta information is easier
> since most of the time we know a block's id but may not know its generation
> stamp.
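For comparison, a minimal sketch of the proposal in the issue description: the
map is keyed by block id alone and the generation stamp is kept as a field of
the stored entry. IdKeyedBlocksSketch and its members are hypothetical names,
not taken from blockKey.patch.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration of an id-only key; NOT the code in blockKey.patch.
final class IdKeyedBlocksSketch {
    static final class Entry {
        final long generationStamp;
        final String info;

        Entry(long generationStamp, String info) {
            this.generationStamp = generationStamp;
            this.info = info;
        }
    }

    // At most one entry per block id, matching the two properties listed above.
    private final Map<Long, Entry> byId = new HashMap<>();

    // A newer generation replaces the older entry for the same id.
    void put(long blockId, long genStamp, String info) {
        byId.put(blockId, new Entry(genStamp, info));
    }

    // Lookup needs only the block id.
    Entry get(long blockId) {
        return byId.get(blockId);
    }

    // The generation stamp is still checked, but as data rather than as key.
    boolean matches(long blockId, long genStamp) {
        Entry e = byId.get(blockId);
        return e != null && e.generationStamp == genStamp;
    }

    int size() {
        return byId.size();
    }
}
```

Here a caller that knows only the block id gets the entry in one lookup, and
a caller that also knows the generation stamp can verify it explicitly.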