[jira] Commented: (HADOOP-4643) NameNode should exclude excessive replicas when counting live replicas for a block

Raghu Angadi (JIRA) Thu, 13 Nov 2008 15:09:38 -0800

    [ https://issues.apache.org/jira/browse/HADOOP-4643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12647453#action_12647453 ]


Raghu Angadi commented on HADOOP-4643:
--------------------------------------

+1. Patch looks good.

> NameNode should exclude excessive replicas when counting live replicas for a 
> block
> ----------------------------------------------------------------------------------
>
>                 Key: HADOOP-4643
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4643
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: dfs
>            Reporter: Hairong Kuang
>            Assignee: Hairong Kuang
>             Fix For: 0.18.3
>
>         Attachments: nodeCount.patch
>
>
> Currently the NameNode includes excessive replicas in blocksMap and counts 
> them as live replicas. Although excessive replicas have been marked as 
> invalid, scheduling their deletion may be delayed, and the datanode does not 
> send a deletion confirmation until the next block report. As a result, 
> excessive replicas may stay in blocksMap for quite a while. This may cause 
> under-replicated blocks to go undetected by the NameNode. 
> For example, assume that block b is at datanodes d1, d2, and d3. We have the 
> following scenario:
> 1. d1 loses heartbeat. NN replicates b to another datanode, say d4. 
> 2. d1 comes back. NN finds that b is over-replicated, so it chooses one 
> replica, say the one on d4, as an excessive replica and marks it as invalid. 
> Now b has 3 valid replicas (d1, d2, d3) and 1 excessive (invalid) replica 
> (d4), all in blocksMap.
> 3. d2 loses heartbeat and gets removed from blocksMap. Block b now has 2 
> valid replicas, d1 and d3, and 1 excessive invalid replica, d4. Block b is 
> under-replicated, but NN still counts 3 live replicas for block b, so it does 
> not take any action to replicate it.
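
For readers following the scenario above, here is a minimal Java sketch of the miscount and of a count that skips excess replicas. It is not the code in nodeCount.patch, which is not shown in this thread. The class ReplicaCountSketch, its two maps, and all method names are hypothetical stand-ins for the NameNode's bookkeeping.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical model of the NameNode's replica bookkeeping; not Hadoop code.
public class ReplicaCountSketch {
    // block id -> datanodes holding a replica (stand-in for blocksMap)
    private final Map<String, Set<String>> blocksMap = new HashMap<>();
    // datanode -> blocks on that node that were marked excess (invalid)
    private final Map<String, Set<String>> excessReplicas = new HashMap<>();

    public void addReplica(String block, String node) {
        blocksMap.computeIfAbsent(block, k -> new HashSet<>()).add(node);
    }

    // A datanode that loses heartbeat is dropped from every block's replica set.
    public void removeNode(String node) {
        for (Set<String> nodes : blocksMap.values()) {
            nodes.remove(node);
        }
        excessReplicas.remove(node);
    }

    // Mark a replica as excess. It stays in blocksMap until deletion is confirmed.
    public void markExcess(String block, String node) {
        excessReplicas.computeIfAbsent(node, k -> new HashSet<>()).add(block);
    }

    // Count described as the current behavior: every replica in blocksMap is counted.
    public int countAllReplicas(String block) {
        Set<String> nodes = blocksMap.get(block);
        return nodes == null ? 0 : nodes.size();
    }

    // Count that excludes replicas already marked excess.
    public int countLiveReplicas(String block) {
        int live = 0;
        Set<String> nodes = blocksMap.get(block);
        if (nodes == null) {
            return 0;
        }
        for (String node : nodes) {
            Set<String> excess = excessReplicas.get(node);
            if (excess == null || !excess.contains(block)) {
                live++;
            }
        }
        return live;
    }

    public static void main(String[] args) {
        ReplicaCountSketch nn = new ReplicaCountSketch();
        nn.addReplica("b", "d1");
        nn.addReplica("b", "d2");
        nn.addReplica("b", "d3");

        nn.removeNode("d1");        // step 1: d1 loses heartbeat
        nn.addReplica("b", "d4");   //         b is re-replicated to d4
        nn.addReplica("b", "d1");   // step 2: d1 comes back
        nn.markExcess("b", "d4");   //         d4's replica is chosen as excess
        nn.removeNode("d2");        // step 3: d2 loses heartbeat

        System.out.println("replicas in blocksMap: " + nn.countAllReplicas("b"));
        System.out.println("live replicas: " + nn.countLiveReplicas("b"));
    }
}
```

With a target replication of 3, the first count (3) hides the problem, while the second (2) would flag block b as under-replicated.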

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
