NameNode should exclude excessive replicas when counting live replicas for a 
block
----------------------------------------------------------------------------------

                 Key: HADOOP-4643
                 URL: https://issues.apache.org/jira/browse/HADOOP-4643
             Project: Hadoop Core
          Issue Type: Bug
          Components: dfs
            Reporter: Hairong Kuang
            Assignee: Hairong Kuang
             Fix For: 0.18.3


Currently the NameNode includes excessive replicas in blocksMap and counts them as 
live replicas. Although excessive replicas have been marked as invalid, scheduling 
their deletion may be delayed, and a datanode does not send deletion confirmation 
until its next block report. As a result, excessive replicas may stay in 
blocksMap for quite a while. This may cause under-replicated blocks to go 
undetected by the NameNode. 

For example, assume that block b is at datanodes d1, d2, and d3. We have the 
following scenario:
1. d1 loses heartbeat. NN replicates b to another datanode; assume it is 
d4. 
2. d1 comes back. NN finds that b is over-replicated, so it chooses one 
replica, say d4, as an excessive replica and marks it as invalid. Now b 
has 3 valid replicas (d1, d2, d3) and 1 excessive (invalid) replica (d4), all in 
blocksMap.
3. d2 loses heartbeat and gets removed from blocksMap. Block b now has 2 valid 
replicas, d1 and d3, and 1 excessive invalid replica, d4. Block b is 
under-replicated, but NN still counts 3 live replicas for block b and so does 
not take any action to replicate it.
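
The scenario above can be replayed with a small model. This is only a sketch 
under stated assumptions: the class ReplicaCountSketch, its two sets, and the 
methods countAll, countLive, and replayScenario are hypothetical names made up 
for illustration and are not the actual NameNode data structures or the patch 
for this issue. It contrasts counting every blocksMap entry with counting only 
entries that are not marked as excessive.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical model for illustration only; not the real NameNode classes.
class ReplicaCountSketch {
    // Datanodes that blocksMap currently lists as holding block b.
    final Set<String> blocksMap = new HashSet<>();
    // Datanodes whose replica of b was chosen as excessive and marked invalid.
    final Set<String> excess = new HashSet<>();

    // Behavior described in the report: every blocksMap entry counts as live.
    int countAll() {
        return blocksMap.size();
    }

    // Proposed behavior: skip replicas already marked as excessive.
    int countLive() {
        int live = 0;
        for (String node : blocksMap) {
            if (!excess.contains(node)) {
                live++;
            }
        }
        return live;
    }

    // Replays the three steps of the scenario for block b.
    static ReplicaCountSketch replayScenario() {
        ReplicaCountSketch b = new ReplicaCountSketch();
        b.blocksMap.add("d1");
        b.blocksMap.add("d2");
        b.blocksMap.add("d3");
        // Step 1: d1 loses heartbeat; b is re-replicated to d4.
        b.blocksMap.remove("d1");
        b.blocksMap.add("d4");
        // Step 2: d1 comes back; d4 is chosen as the excessive replica.
        b.blocksMap.add("d1");
        b.excess.add("d4");
        // Step 3: d2 loses heartbeat and is removed from blocksMap.
        b.blocksMap.remove("d2");
        return b;
    }

    public static void main(String[] args) {
        ReplicaCountSketch b = replayScenario();
        System.out.println("all entries:   " + b.countAll());  // 3 (d1, d3, d4)
        System.out.println("live replicas: " + b.countLive()); // 2 (d1, d3)
    }
}
```

With a target replication of 3, the first count hides the problem while the 
second one shows that block b needs another replica.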
