[ https://issues.apache.org/jira/browse/HADOOP-940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12469923 ]

Hairong Kuang commented on HADOOP-940:
--------------------------------------

If a block is in neededReplication, it needs more copies and chooseTarget 
still has to be run for it. If a block is in pendingReplication, chooseTarget 
has already been run and a replication command has been sent to the 
datanodes. 

> pendingReplications of FSNamesystem is not informative
> ------------------------------------------------------
>
>                 Key: HADOOP-940
>                 URL: https://issues.apache.org/jira/browse/HADOOP-940
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: dfs
>    Affects Versions: 0.10.1
>            Reporter: Hairong Kuang
>
> Currently, when a neededReplication block is scheduled for replication, it is 
> put into the pendingReplications queue. When it is no longer under-replicated, 
> it is removed from the pendingReplications queue. But the queue does not 
> record how many targets have been chosen or which data nodes those targets 
> are, and pendingReplications is not consulted when deciding whether a block 
> is under-replicated. This may cause a block to be over-replicated, or its 
> replication priority to be estimated inaccurately.
> For example, suppose a block has 1 replica but its replication factor is 2. A 
> data node is chosen to replicate this block, and the block is put in the 
> pendingReplications queue. If the block's replication factor is changed to 
> 3 before the block replication notification (the next block report) comes 
> in, the block is put into the neededReplications queue again under the 
> assumption that it needs 2 more targets instead of 1. So the block will 
> end up with 4 replicas.
> I propose that we change pendingReplications to be a map from a block to the 
> chosen data nodes. Data nodes in both pendingReplications and blockMap are 
> counted when deciding the total number of replicas that a block has. When the 
> name node is notified that the block has been replicated to a chosen data 
> node, that data node is moved from pendingReplications to blockMap.
> Each chosen target is also associated with a timer indicating how long the 
> name node expects to wait for the block replication notification. The 
> pendingReplications queue needs to be scanned periodically to remove data 
> nodes whose timers have expired.
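
To make the proposal in the quoted description concrete, here is a minimal 
sketch of a pending-replication map with per-target timers. It is not the 
FSNamesystem code: the class name, method names, the use of a long block id 
and a String datanode name, and the millisecond deadlines are all assumptions 
made for illustration.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

// Hypothetical sketch only; none of these names come from FSNamesystem.
class PendingReplicationSketch {
    // block id -> (target datanode name -> deadline in ms for its notification)
    private final Map<Long, Map<String, Long>> pending = new HashMap<>();

    // Record that a replication command for `block` was sent to `target`.
    void addPending(long block, String target, long nowMs, long timeoutMs) {
        pending.computeIfAbsent(block, b -> new HashMap<>())
               .put(target, nowMs + timeoutMs);
    }

    // The name node was notified that `target` now holds `block`. Returns true
    // if the target was pending; the caller would then add it to blockMap.
    boolean confirm(long block, String target) {
        Map<String, Long> targets = pending.get(block);
        if (targets == null || targets.remove(target) == null) {
            return false;
        }
        if (targets.isEmpty()) {
            pending.remove(block);
        }
        return true;
    }

    // Number of targets that were chosen but have not yet been confirmed.
    int pendingCount(long block) {
        Map<String, Long> targets = pending.get(block);
        return targets == null ? 0 : targets.size();
    }

    // Targets still to be chosen, counting both confirmed replicas (blockMap)
    // and pending targets toward the replication factor.
    int neededTargets(long block, int confirmedReplicas, int replicationFactor) {
        int total = confirmedReplicas + pendingCount(block);
        return Math.max(0, replicationFactor - total);
    }

    // Periodic scan: drop targets whose deadline has passed and return how
    // many were dropped.
    int expire(long nowMs) {
        int removed = 0;
        Iterator<Map.Entry<Long, Map<String, Long>>> it =
            pending.entrySet().iterator();
        while (it.hasNext()) {
            Map<String, Long> targets = it.next().getValue();
            int before = targets.size();
            targets.values().removeIf(deadline -> deadline <= nowMs);
            removed += before - targets.size();
            if (targets.isEmpty()) {
                it.remove();
            }
        }
        return removed;
    }
}
```

With the example from the description (1 confirmed replica, 1 pending target, 
replication factor raised from 2 to 3), neededTargets counts the pending 
target and asks for 1 more data node rather than 2, so the block would end up 
with 3 replicas instead of 4. If the pending target times out and is dropped 
by expire, the same call asks for 2.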
