NameNode should exclude excessive replicas when counting live replicas for a
block
----------------------------------------------------------------------------------
Key: HADOOP-4643
URL: https://issues.apache.org/jira/browse/HADOOP-4643
Project: Hadoop Core
Issue Type: Bug
Components: dfs
Reporter: Hairong Kuang
Assignee: Hairong Kuang
Fix For: 0.18.3
Currently the NameNode keeps excessive replicas in blocksMap and counts them
as live replicas. Although excessive replicas have been marked as invalid,
scheduling their deletion may be delayed, and a datanode does not send a
deletion confirmation until its next block report. As a result, excessive
replicas may stay in blocksMap for quite a while, which may cause
under-replicated blocks to go undetected by the NameNode.
For example, assume that block b is on datanodes d1, d2, and d3. We have the
following scenario:
1. d1 loses its heartbeat. The NN replicates b to another datanode; assume it
is d4.
2. d1 comes back. The NN finds that b is over-replicated, so it chooses one
replica, say the one on d4, as an excessive replica and marks it as invalid.
Now b has 3 valid replicas (d1, d2, d3) and 1 excessive (invalid) replica
(d4), all in blocksMap.
3. d2 loses its heartbeat and gets removed from blocksMap. Block b now has 2
valid replicas (d1 and d3) and 1 excessive invalid replica (d4). Block b is
under-replicated, but the NN still counts 3 live replicas for it and so does
not take any action to replicate it.
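The scenario above can be replayed with a small, self-contained Java sketch.
It is only an illustration under stated assumptions, not the actual NameNode
code: the class ReplicaCountSketch, its two maps, and all method names are
hypothetical stand-ins for blocksMap and the excess-replica bookkeeping. It
also assumes that d1 is removed from blocksMap in step 1, just as d2 is in
step 3. The sketch contrasts a count over everything in blocksMap with a count
that skips replicas marked as excessive.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical, simplified model of the bookkeeping; not the real NameNode classes. */
public class ReplicaCountSketch {
    // block id -> datanodes that hold a replica (stand-in for blocksMap)
    private final Map<String, Set<String>> blocksMap = new HashMap<>();
    // datanode -> blocks whose replica on that node was marked excessive (assumed structure)
    private final Map<String, Set<String>> excessReplicas = new HashMap<>();

    public void addReplica(String block, String node) {
        blocksMap.computeIfAbsent(block, k -> new HashSet<>()).add(node);
    }

    public void removeReplica(String block, String node) {
        Set<String> nodes = blocksMap.get(block);
        if (nodes != null) {
            nodes.remove(node);
        }
    }

    public void markExcess(String block, String node) {
        excessReplicas.computeIfAbsent(node, k -> new HashSet<>()).add(block);
    }

    /** Counts every replica still listed in blocksMap, excessive or not. */
    public int countAllReplicas(String block) {
        return blocksMap.getOrDefault(block, Collections.<String>emptySet()).size();
    }

    /** Counts only replicas that have not been marked as excessive. */
    public int countLiveReplicas(String block) {
        int live = 0;
        for (String node : blocksMap.getOrDefault(block, Collections.<String>emptySet())) {
            Set<String> excess = excessReplicas.get(node);
            if (excess == null || !excess.contains(block)) {
                live++;
            }
        }
        return live;
    }

    /** Replays steps 1 to 3 of the scenario for block "b". */
    public static ReplicaCountSketch runScenario() {
        ReplicaCountSketch nn = new ReplicaCountSketch();
        nn.addReplica("b", "d1");
        nn.addReplica("b", "d2");
        nn.addReplica("b", "d3");

        // Step 1: d1 loses its heartbeat; b is re-replicated to d4.
        nn.removeReplica("b", "d1");
        nn.addReplica("b", "d4");

        // Step 2: d1 comes back; the replica on d4 is marked excessive
        // but stays in blocksMap until its deletion is confirmed.
        nn.addReplica("b", "d1");
        nn.markExcess("b", "d4");

        // Step 3: d2 loses its heartbeat and is removed from blocksMap.
        nn.removeReplica("b", "d2");
        return nn;
    }

    public static void main(String[] args) {
        ReplicaCountSketch nn = runScenario();
        System.out.println("replicas in blocksMap:    " + nn.countAllReplicas("b"));
        System.out.println("live, excluding excess:   " + nn.countLiveReplicas("b"));
    }
}
```

With the scenario's three replicas as the target, the first count hides the
problem while the second one exposes b as under-replicated.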