[ https://issues.apache.org/jira/browse/HDFS-2290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13104830#comment-13104830 ]
Benoy Antony commented on HDFS-2290:
------------------------------------
Even if BlockManager.markBlockAsCorrupt() is fixed in the way mentioned above,
there is still a special case in which the NN will not invalidate the block.
To reproduce this (a rough client-side sketch of the steps follows below):
1. Set the replication factor for the file to 2.
2. Corrupt the block by removing the metafile from one of the DNs. The block will
not be removed, since the number of good replicas is less than the replication factor.
3. Set the replication factor for the file to 1. The block is still not removed.
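For reference, here is a rough sketch of the client-side part of these steps
(steps 1 and 3) against a running cluster. The path is hypothetical, the file is
assumed to already exist with one block, and the metafile removal in step 2 has
to be done by hand in the datanode's data directory:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class Hdfs2290Repro {
  public static void main(String[] args) throws Exception {
    FileSystem fs = FileSystem.get(new Configuration());
    Path file = new Path("/tmp/hdfs-2290-repro"); // hypothetical test file

    // Step 1: set the replication factor for the file to 2.
    fs.setReplication(file, (short) 2);

    // Step 2 (manual, on a datanode): delete the replica's .meta file so
    // that the replica is reported as corrupt.

    // Step 3: drop the replication factor to 1. The corrupt replica should
    // now be invalidated, but with this bug it never is.
    fs.setReplication(file, (short) 1);
  }
}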
This bug is caused by the following statement in BlockManager.invalidateBlock():

  // Check how many copies we have of the block. If we have at least one
  // copy on a live node, then we can delete it.
  int count = countNodes(blk).liveReplicas();
  if (count > 1) {
    addToInvalidates(blk, dn);
    ...
  }

The bug can be fixed by changing the condition to (count >= 1).
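In other words, the change would look roughly like this; only the condition
changes, and the rest of the method stays the same:

  // Check how many copies we have of the block. If we have at least one
  // copy on a live node, then we can delete it.
  int count = countNodes(blk).liveReplicas();
  if (count >= 1) {   // previously: count > 1
    addToInvalidates(blk, dn);
    ...
  }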
> Block with corrupt replica is not getting replicated
> ----------------------------------------------------
>
> Key: HDFS-2290
> URL: https://issues.apache.org/jira/browse/HDFS-2290
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: name-node
> Affects Versions: 0.22.0
> Reporter: Konstantin Shvachko
> Priority: Blocker
> Fix For: 0.22.0
>
>
> A block has one replica marked as corrupt and two good ones. countNodes()
> correctly detects that there are only 2 live replicas, and fsck reports the
> block as under-replicated. But ReplicationMonitor never schedules replication
> of good replicas.