[ https://issues.apache.org/jira/browse/HADOOP-5133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12678989#action_12678989 ]
Konstantin Shvachko commented on HADOOP-5133:
---------------------------------------------

This patch definitely reduces the probability of "committing" an incorrect block. Choosing the longest block out of all replicas is better than selecting a shorter one. But it does not answer the question of what happens if (as a result of a software bug or an unfortunate circumstance) the longest block is actually a wrong replica. In general the name-node does not have a definite criterion for judging which replica is right and which is not, except for the generation stamp. And it would be wrong to *silently* make such a decision based on the size (or any other artificial convention). I am coming to the conclusion that the honest way to deal with this case, that is, when this is the last block of a file being written to, is to declare all replicas corrupt. This will be reported in fsck, and an administrator or the user can deal with it. This should happen only as a result of an error in the code, so maybe we should just treat it as corruption.

> FSNameSystem#addStoredBlock does not handle inconsistent block length correctly
> --------------------------------------------------------------------------------
>
>                 Key: HADOOP-5133
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5133
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: dfs
>    Affects Versions: 0.18.2
>            Reporter: Hairong Kuang
>            Assignee: Hairong Kuang
>             Fix For: 0.20.0
>
>         Attachments: inconsistentLen.patch, inconsistentLen1.patch, inconsistentLen2.patch
>
>
> Currently the NameNode treats either the new replica or the existing replicas as corrupt if the new replica's length is inconsistent with the NN-recorded block length. The correct behavior should be:
> 1. For a block that is not under construction, the new replica should be marked as corrupt if its length is inconsistent (whether shorter or longer) with the NN-recorded block length.
> 2. For an under-construction block, if the new replica's length is shorter than the NN-recorded block length, the new replica could be marked as corrupt; if the new replica's length is longer, the NN should update its recorded block length, but it should not mark existing replicas as corrupt. This is because the NN-recorded length of an under-construction block does not accurately match the block length on the datanode disk, and the NN should not judge an under-construction replica to be corrupt by looking at that inaccurate information.
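A minimal, self-contained sketch of the two rules from the quoted description, assuming hypothetical names (BlockLengthReconciler, reconcile, Action, nnRecordedLen, reportedLen); it is not the actual FSNamesystem#addStoredBlock code, whose signature and data structures differ:

{code:java}
// Self-contained sketch; all names here are hypothetical and do not match
// the actual FSNamesystem#addStoredBlock signature or data structures.
public class BlockLengthReconciler {

  enum Action { ACCEPT, MARK_NEW_REPLICA_CORRUPT, UPDATE_RECORDED_LENGTH }

  static Action reconcile(long nnRecordedLen, long reportedLen,
                          boolean underConstruction) {
    if (!underConstruction) {
      // Rule 1: a finalized block has an authoritative length, so any
      // mismatch (shorter or longer) marks the reported replica corrupt.
      return reportedLen == nnRecordedLen
          ? Action.ACCEPT : Action.MARK_NEW_REPLICA_CORRUPT;
    }
    // Rule 2: for an under-construction block the recorded length is only a
    // lower bound; a shorter report marks the new replica corrupt, a longer
    // report advances the recorded length, and existing replicas are never
    // marked corrupt on the basis of this inaccurate information.
    if (reportedLen < nnRecordedLen) {
      return Action.MARK_NEW_REPLICA_CORRUPT;
    }
    return reportedLen > nnRecordedLen
        ? Action.UPDATE_RECORDED_LENGTH : Action.ACCEPT;
  }
}
{code}

Under the policy proposed in the comment above, the under-construction mismatch on the last block of a file being written would instead mark all replicas corrupt, so that fsck reports it and an administrator or the user decides how to recover.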