[ https://issues.apache.org/jira/browse/HADOOP-940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12469877 ]
dhruba borthakur commented on HADOOP-940:
-----------------------------------------

If a block is in pendingReplication, it means that this block needs more copies. It does not say anything about how many more copies are to be made. When a heartbeat arrives at the namenode, the heartbeat processing code picks up a block from neededReplication, determines how many new replicas are to be made, and then tells the datanode to start replication.

As part of HADOOP-923, I am changing the fact that heartbeat processing does all the heavy lifting. A new thread inside FSNamesystem will periodically pick up blocks from neededReplication and compute targets for the new replicas.

But I agree that the pendingReplication queue should be periodically scanned and old items dealt with appropriately.

> pendingReplications of FSNamesystem is not informative
> ------------------------------------------------------
>
>                 Key: HADOOP-940
>                 URL: https://issues.apache.org/jira/browse/HADOOP-940
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: dfs
>    Affects Versions: 0.10.1
>            Reporter: Hairong Kuang
>             Fix For: 0.11.0
>
>
> Currently when a neededReplication block is scheduled to be replicated, it is
> put into the pendingReplications queue. When it is no longer under-replicated,
> it is pulled out of the pendingReplications queue. But the queue does not
> provide any information such as how many targets have been chosen or which
> data nodes those targets are. PendingReplications is not used when deciding
> whether a block is under-replicated. This may cause a block to be
> over-replicated or its replication priority to be estimated inaccurately.
> For example, when a block has 1 replica but its replication factor is 2, a
> data node is chosen to replicate this block and the block is put in the
> pendingReplications queue.
> If the block's replication factor is changed to 3 before the block
> replication notification, which is the next block report, comes in, the
> block will be put into the neededReplications queue again under the
> assumption that 2 targets need to be chosen instead of 1. So the block will
> end up with 4 replicas.
> I propose that we change pendingReplications to be a map from a block to the
> chosen data nodes. Data nodes in both pendingReplications and blockMap are
> counted when deciding the total number of replicas that a block has. When the
> name node is notified that the block has been replicated to a chosen data
> node, that data node is moved from pendingReplications to blockMap.
> Each chosen target is also associated with a timer indicating how long the
> name node expects to wait for the block replication notification. The
> pendingReplications queue needs to be periodically scanned to remove those
> data nodes whose timers have expired.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
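
To make the proposal in the issue description concrete, here is a minimal sketch of pendingReplications as a map from a block to its chosen data nodes, each with an expiry deadline. It is only an illustration under stated assumptions, not the FSNamesystem code: the class name and the methods add, confirm, pendingCount, neededTargets and expire are all hypothetical, blocks and data nodes are plain strings, and timestamps are passed in explicitly instead of being read from a clock. It needs Java 8 or later.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

/** Hypothetical sketch of the proposed structure; not the actual Hadoop code. */
class PendingReplicationsSketch {
    // block id -> (chosen data node -> deadline in milliseconds)
    private final Map<String, Map<String, Long>> pending = new HashMap<>();
    private final long timeoutMs;

    PendingReplicationsSketch(long timeoutMs) {
        this.timeoutMs = timeoutMs;
    }

    /** Record that a data node was chosen as a replication target at time nowMs. */
    void add(String block, String datanode, long nowMs) {
        pending.computeIfAbsent(block, b -> new HashMap<>())
               .put(datanode, nowMs + timeoutMs);
    }

    /**
     * Called when the name node is notified that the replica arrived.
     * Returns true if the data node was pending; the caller would then
     * record it in blockMap (not modelled here).
     */
    boolean confirm(String block, String datanode) {
        Map<String, Long> targets = pending.get(block);
        if (targets == null || targets.remove(datanode) == null) {
            return false;
        }
        if (targets.isEmpty()) {
            pending.remove(block);
        }
        return true;
    }

    /** Number of targets already chosen but not yet confirmed. */
    int pendingCount(String block) {
        Map<String, Long> targets = pending.get(block);
        return targets == null ? 0 : targets.size();
    }

    /** New targets still to choose: factor minus confirmed minus pending, never negative. */
    int neededTargets(String block, int confirmedReplicas, int replicationFactor) {
        return Math.max(0, replicationFactor - confirmedReplicas - pendingCount(block));
    }

    /** Periodic scan: drop targets whose deadline has passed; returns how many were dropped. */
    int expire(long nowMs) {
        int dropped = 0;
        Iterator<Map<String, Long>> blocks = pending.values().iterator();
        while (blocks.hasNext()) {
            Map<String, Long> targets = blocks.next();
            int before = targets.size();
            targets.values().removeIf(deadline -> deadline <= nowMs);
            dropped += before - targets.size();
            if (targets.isEmpty()) {
                blocks.remove();
            }
        }
        return dropped;
    }

    public static void main(String[] args) {
        PendingReplicationsSketch p = new PendingReplicationsSketch(1000L);
        // One replica exists, factor is 2: one target is chosen and recorded as pending.
        p.add("blk_1", "datanodeB", 0L);
        // Factor is raised to 3 before the notification arrives.
        System.out.println(p.neededTargets("blk_1", 1, 3)); // prints 1
    }
}
```

With the pending target counted, the example from the description asks for 1 more target rather than 2, so the block ends up with 3 replicas instead of 4. If the notification never arrives, expire drops the stale target and the block counts as under-replicated again on the next check.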