[ https://issues.apache.org/jira/browse/HADOOP-1117?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13033619#comment-13033619 ]
Eli Collins commented on HADOOP-1117:
-------------------------------------

Doesn't this break the rack policy? We should only remove a block from neededReplications when it has sufficient replicas both in terms of replica count and rack count.

{noformat}
+      if (numCurrentReplica >= fileReplication) {
+        neededReplications.remove(block);
+      } else
{noformat}

> DFS Scalability: When the namenode is restarted it consumes 80% CPU
> -------------------------------------------------------------------
>
>                 Key: HADOOP-1117
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1117
>             Project: Hadoop Common
>          Issue Type: Bug
>    Affects Versions: 0.12.0
>            Reporter: dhruba borthakur
>            Assignee: dhruba borthakur
>            Priority: Blocker
>             Fix For: 0.12.1
>
>         Attachments: CpuPendingTransfer3.patch
>
>
> When the namenode is restarted, the datanodes register and each block is
> inserted into neededReplication. When the namenode exits safemode, it
> starts processing neededReplication. It picks up a block from
> neededReplication, sees that it already has the required number of
> replicas, and continues to the next block in neededReplication. The blocks
> remain in neededReplication permanently, causing the namenode worker thread
> to scan this huge list of blocks once every 3 seconds. This consumes plenty
> of CPU on the namenode.
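For reference, a minimal sketch of the combined check Eli is asking for, not the attached patch: numCurrentReplica and fileReplication are the variables from the snippet quoted above, while numRacks(block) and totalRacks() are hypothetical helpers standing in for whatever rack bookkeeping the namenode keeps.

{noformat}
// Illustration only -- not CpuPendingTransfer3.patch. A block should leave
// neededReplications only when BOTH conditions hold: enough replicas and
// enough racks. numRacks(block) and totalRacks() are hypothetical helpers.
boolean enoughReplicas = numCurrentReplica >= fileReplication;

// Default placement policy: with replication > 1, replicas should span at
// least two racks, unless the cluster itself has only one rack.
int minRacks = (fileReplication > 1) ? Math.min(2, totalRacks()) : 1;
boolean enoughRacks = numRacks(block) >= minRacks;

if (enoughReplicas && enoughRacks) {
  neededReplications.remove(block);
} else {
  // Keep the block queued so the replication monitor can schedule work
  // for it, whether the shortfall is in replica count or in rack spread.
}
{noformat}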