[ https://issues.apache.org/jira/browse/HADOOP-4643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12647453#action_12647453 ]
Raghu Angadi commented on HADOOP-4643:
--------------------------------------
+1. Patch looks good.
> NameNode should exclude excessive replicas when counting live replicas for a
> block
> ----------------------------------------------------------------------------------
>
> Key: HADOOP-4643
> URL: https://issues.apache.org/jira/browse/HADOOP-4643
> Project: Hadoop Core
> Issue Type: Bug
> Components: dfs
> Reporter: Hairong Kuang
> Assignee: Hairong Kuang
> Fix For: 0.18.3
>
> Attachments: nodeCount.patch
>
>
> Currently the NameNode keeps excessive replicas in blocksMap and counts them
> as live replicas. Although excessive replicas have been marked as invalid,
> their deletion may be scheduled late, and a datanode does not send deletion
> confirmation until its next block report. As a result, excessive replicas may
> stay in blocksMap for quite a while, which may cause under-replicated blocks
> to go undetected by the NameNode.
> For example, assume that block b is at datanodes d1, d2, and d3. We have the
> following scenario:
> 1. d1 loses heartbeat. NN replicates b to another datanode, say d4.
> 2. d1 comes back. NN finds that b is over-replicated, so it chooses one
> replica, say d4, as an excessive replica and marks it as invalid. Now b
> has 3 valid replicas d1, d2, d3 and 1 excessive (invalid) replica d4, all in
> blocksMap.
> 3. d2 loses heartbeat and gets removed from blocksMap. Block b has 2 valid
> replicas d1 and d3, and 1 excessive invalid replica d4. Block b is
> under-replicated, but NN still counts 3 live replicas for block b, so it does
> not take any action to replicate it.
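
The scenario above can be replayed with a small standalone model. This is only a sketch: it is not the contents of nodeCount.patch and not the real NameNode code. The class ReplicaCountSketch, its two maps, and every method name below are hypothetical stand-ins chosen for illustration. It contrasts a count that includes excessive replicas with one that excludes them.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical model of the bookkeeping described in the issue.
// None of these names are taken from the Hadoop source.
class ReplicaCountSketch {
    // block id -> datanodes holding a replica (stand-in for blocksMap)
    final Map<String, Set<String>> blocksMap = new HashMap<>();
    // datanode -> blocks on that node marked excessive, deletion pending
    final Map<String, Set<String>> excessReplicas = new HashMap<>();

    void addReplica(String block, String node) {
        blocksMap.computeIfAbsent(block, k -> new HashSet<>()).add(node);
    }

    void markExcess(String block, String node) {
        excessReplicas.computeIfAbsent(node, k -> new HashSet<>()).add(block);
    }

    // A datanode that loses heartbeat is dropped from all bookkeeping.
    void removeNode(String node) {
        for (Set<String> nodes : blocksMap.values()) {
            nodes.remove(node);
        }
        excessReplicas.remove(node);
    }

    // Behavior described as the bug: every replica in blocksMap counts as live.
    int countAllReplicas(String block) {
        return blocksMap.getOrDefault(block, new HashSet<>()).size();
    }

    // Proposed behavior: skip replicas already marked as excessive.
    int countLiveReplicas(String block) {
        int live = 0;
        for (String node : blocksMap.getOrDefault(block, new HashSet<>())) {
            Set<String> excess = excessReplicas.get(node);
            if (excess == null || !excess.contains(block)) {
                live++;
            }
        }
        return live;
    }

    boolean isUnderReplicated(String block, int targetReplication) {
        return countLiveReplicas(block) < targetReplication;
    }

    public static void main(String[] args) {
        ReplicaCountSketch nn = new ReplicaCountSketch();
        nn.addReplica("b", "d1");
        nn.addReplica("b", "d2");
        nn.addReplica("b", "d3");

        nn.removeNode("d1");        // step 1: d1 loses heartbeat
        nn.addReplica("b", "d4");   //         b is replicated to d4
        nn.addReplica("b", "d1");   // step 2: d1 comes back
        nn.markExcess("b", "d4");   //         d4 is chosen as the excessive replica
        nn.removeNode("d2");        // step 3: d2 loses heartbeat

        System.out.println("count including excessive: " + nn.countAllReplicas("b"));
        System.out.println("count excluding excessive: " + nn.countLiveReplicas("b"));
        System.out.println("under-replicated at target 3: " + nn.isUnderReplicated("b", 3));
    }
}
```

With a target replication of 3, the count that includes d4 hides the shortfall, while the count that excludes it reports only d1 and d3 as live, so the block would be queued for replication.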