[ https://issues.apache.org/jira/browse/HADOOP-5133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12668214#action_12668214 ]
dhruba borthakur commented on HADOOP-5133:
------------------------------------------
> addStoredBlock can not completely ignore it. It should at least update the
> stored block length and add the replica to the blocksMap.
Agreed.
Suppose two datanodes report inconsistent block lengths in their blockReceived
confirmations of the same block, and both replicas have the same generation
stamp.
1. If the file is not under construction, or the block is not the last block of
the file, then the shorter replica should be treated as corrupt. The longer
replica should be in the blocksMap.
2. If the block is the last block of a file that is under construction, then
keep the longer replica in the blocksMap but do not delete the shorter replica
from the corresponding datanode (i.e. do not treat the shorter replica as
corrupt). Remove the shorter replica from the blocksMap.
Case 1 typically happens when the lazy flush of OS buffers in the datanode
encounters a transient error and one copy of a good replica is truncated on
disk.
Case 2 could occur because a datanode prematurely (because of buggy code
somewhere) sends a blockReceived to the NN. In this case, it is safe not to
treat the replica as corrupt because the existence of the lease indicates that
the NN does not "own" this block. This situation will be fixed when a block
report is processed after the lease is closed.
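To make the two cases above concrete, here is a minimal sketch of the decision
in Java. It is not the actual FSNamesystem code: the class, enum and method
names (ReplicaLengthPolicy, ShorterReplicaAction, decide, lengthToRecord) are
hypothetical and exist only for illustration. It assumes both replicas carry
the same generation stamp, as stated above.

```java
// Hypothetical sketch of the two cases; none of these names exist in Hadoop.
class ReplicaLengthPolicy {

    /** What the NN does with the shorter of two replicas of the same block. */
    enum ShorterReplicaAction {
        /** Case 1: treat the shorter replica as corrupt. */
        MARK_CORRUPT,
        /** Case 2: drop it from the blocksMap but leave it on the datanode. */
        REMOVE_FROM_MAP_KEEP_ON_DATANODE
    }

    /**
     * Decide how to treat the shorter replica.
     *
     * @param fileUnderConstruction true if the file still has an open lease
     * @param isLastBlock           true if the block is the last block of the file
     */
    static ShorterReplicaAction decide(boolean fileUnderConstruction,
                                       boolean isLastBlock) {
        if (fileUnderConstruction && isLastBlock) {
            // Case 2: the NN does not yet "own" this block, so it must not
            // declare the shorter replica corrupt.
            return ShorterReplicaAction.REMOVE_FROM_MAP_KEEP_ON_DATANODE;
        }
        // Case 1: the file is closed, or the block is not the last one.
        return ShorterReplicaAction.MARK_CORRUPT;
    }

    /** In both cases the longer replica is the one kept in the blocksMap. */
    static long lengthToRecord(long lengthA, long lengthB) {
        return Math.max(lengthA, lengthB);
    }

    public static void main(String[] args) {
        System.out.println(decide(false, true));
        System.out.println(decide(true, true));
        System.out.println(lengthToRecord(1024L, 4096L));
    }
}
```

The sketch only classifies the shorter replica. The later cleanup for case 2
(the block report processed after the lease is closed) is left out.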
> FSNameSystem#addStoredBlock does not handle inconsistent block length
> correctly
> -------------------------------------------------------------------------------
>
> Key: HADOOP-5133
> URL: https://issues.apache.org/jira/browse/HADOOP-5133
> Project: Hadoop Core
> Issue Type: Bug
> Components: dfs
> Affects Versions: 0.18.2
> Reporter: Hairong Kuang
> Fix For: 0.19.1
>
>
> Currently NameNode treats either the new replica or existing replicas as
> corrupt if the new replica's length is inconsistent with NN recorded block
> length. The correct behavior should be
> 1. For a block that is not under construction, the new replica should be
> marked as corrupt if its length is inconsistent (no matter shorter or longer)
> with the NN recorded block length;
> 2. For an under construction block, if the new replica's length is shorter
> than the NN recorded block length, the new replica could be marked as
> corrupt; if the new replica's length is longer, NN should update its recorded
> block length. But it should not mark existing replicas as corrupt. This is
> because NN recorded length for an under construction block does not
> accurately match the block length on datanode disk. NN should not judge an
> under construction replica to be corrupt by looking at the inaccurate
> information: its recorded block length.
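The behavior proposed in the issue description can be sketched the same way.
Again this is only an illustration under stated assumptions: the names
(ReportedReplicaCheck, Action, onReportedReplica) are hypothetical, and the
description says a shorter under-construction replica "could" be marked as
corrupt, so that branch is one possible reading rather than a settled rule.

```java
// Hypothetical sketch of the proposed length check; not Hadoop's real API.
class ReportedReplicaCheck {

    enum Action {
        /** Lengths agree: add the replica as usual. */
        ACCEPT,
        /** The newly reported replica is treated as corrupt. */
        MARK_NEW_REPLICA_CORRUPT,
        /** The NN raises its recorded length; existing replicas are untouched. */
        UPDATE_RECORDED_LENGTH
    }

    /**
     * @param underConstruction true if the block is still under construction
     * @param recordedLength    block length currently recorded by the NN
     * @param reportedLength    length of the newly reported replica
     */
    static Action onReportedReplica(boolean underConstruction,
                                    long recordedLength,
                                    long reportedLength) {
        if (reportedLength == recordedLength) {
            return Action.ACCEPT;
        }
        if (!underConstruction) {
            // Rule 1: any mismatch, shorter or longer, marks the new replica.
            return Action.MARK_NEW_REPLICA_CORRUPT;
        }
        // Rule 2: the recorded length is not authoritative yet, so a longer
        // replica updates it and never condemns the existing replicas.
        return reportedLength < recordedLength
                ? Action.MARK_NEW_REPLICA_CORRUPT
                : Action.UPDATE_RECORDED_LENGTH;
    }

    public static void main(String[] args) {
        System.out.println(onReportedReplica(false, 100L, 120L));
        System.out.println(onReportedReplica(true, 100L, 120L));
        System.out.println(onReportedReplica(true, 100L, 80L));
    }
}
```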