[ https://issues.apache.org/jira/browse/HDFS-1147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12866474#action_12866474 ]
dhruba borthakur commented on HDFS-1147:
----------------------------------------

One proposal I have is that the NN can short-circuit the processing of a block report if it knows that this is the very first block report from a datanode. This is the typical case when the NN restarts. My experiments indicate that more than 50% of the processing time is spent generating the diff (node.reportDiff()) between the incoming block report and what is in the blocksMap. We can avoid this overhead if the NN knows that this is the first ever block report from that datanode: instead of producing a diff, it can directly invoke addStoredBlock() on all the blocks in the incoming block report. This will effectively delay the deletion of blocks that do not belong to the namespace until the next block report, but that might be an acceptable tradeoff.

If somebody can explain how this approach can be made to work in the presence of corrupt replicas, that would be great.

> Reduce NN startup time by reducing the processing time of block reports
> -----------------------------------------------------------------------
>
>                 Key: HDFS-1147
>                 URL: https://issues.apache.org/jira/browse/HDFS-1147
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: name-node
>            Reporter: dhruba borthakur
>            Assignee: dhruba borthakur
>
> The NameNode restart times are impacted to a large extent by the processing
> time of block reports. For a cluster with 150 million blocks, the block
> report processing in the NN can take up to 20 minutes or so. The NN is open
> for business only after it has processed the block reports from most of the
> datanodes. If we can reduce the processing time of a block report, that will
> directly reduce the restart time of the NN.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
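
To make the first-block-report shortcut proposed in the comment above concrete, here is a minimal sketch in Java. It is a toy model, not the actual NameNode code: only the names reportDiff, addStoredBlock and blocksMap come from the comment, while the classes, fields and method signatures (NameNodeModel, processReport, replicas, diffsComputed) are hypothetical and chosen for illustration. Blocks are plain long IDs and datanodes are plain strings. The sketch does not address the open question about corrupt replicas.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Toy model of the proposal; all types and signatures here are assumptions. */
class FirstReportSketch {

    /** Hypothetical stand-in for the NameNode's block bookkeeping. */
    static class NameNodeModel {
        // Block IDs that belong to the namespace.
        private final Set<Long> namespaceBlocks;
        // blocksMap: block ID -> datanodes known to hold a replica.
        private final Map<Long, Set<String>> blocksMap = new HashMap<>();
        // Datanodes that have already sent at least one block report.
        private final Set<String> reportedNodes = new HashSet<>();
        // How many reports went through the (expensive) diff path.
        int diffsComputed = 0;

        NameNodeModel(Set<Long> namespaceBlocks) {
            this.namespaceBlocks = new HashSet<>(namespaceBlocks);
        }

        /** Records a replica; blocks outside the namespace are ignored here. */
        void addStoredBlock(String node, long blockId) {
            if (!namespaceBlocks.contains(blockId)) {
                return; // stays on the datanode until a later report is diffed
            }
            blocksMap.computeIfAbsent(blockId, k -> new HashSet<>()).add(node);
        }

        /** Processes one report; returns block IDs to delete on that datanode. */
        List<Long> processReport(String node, Set<Long> report) {
            List<Long> toInvalidate = new ArrayList<>();
            if (reportedNodes.add(node)) {
                // First report from this datanode: skip the diff entirely.
                for (long blockId : report) {
                    addStoredBlock(node, blockId);
                }
                return toInvalidate; // empty: deletions wait for the next report
            }
            // Later reports: simplified stand-in for the reportDiff() path.
            diffsComputed++;
            for (long blockId : report) {
                if (namespaceBlocks.contains(blockId)) {
                    addStoredBlock(node, blockId);
                } else {
                    toInvalidate.add(blockId);
                }
            }
            // Drop replicas that were recorded earlier but are no longer reported.
            for (Map.Entry<Long, Set<String>> e : blocksMap.entrySet()) {
                if (!report.contains(e.getKey())) {
                    e.getValue().remove(node);
                }
            }
            return toInvalidate;
        }

        Set<String> replicas(long blockId) {
            return blocksMap.getOrDefault(blockId, Collections.emptySet());
        }
    }

    public static void main(String[] args) {
        NameNodeModel nn = new NameNodeModel(new HashSet<>(Arrays.asList(1L, 2L, 3L)));
        // Block 99 does not belong to the namespace.
        Set<Long> report = new HashSet<>(Arrays.asList(1L, 2L, 99L));
        System.out.println(nn.processReport("dn1", report)); // prints []
        System.out.println(nn.processReport("dn1", report)); // prints [99]
    }
}
```

In this model the first report from dn1 schedules no deletions, and block 99 is only marked for deletion when the second report goes through the diff path. That is the delayed-deletion tradeoff the comment describes.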