[
https://issues.apache.org/jira/browse/HDFS-7980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14381848#comment-14381848
]
Tsz Wo Nicholas Sze commented on HDFS-7980:
-------------------------------------------
A missing delHint seems to be a minor issue.
For an RBW report that arrives after a FINALIZED one, we have to accept it anyway
(and the code already takes care of it), since a full block report could be
delayed, as described in the code:
{code}
//BlockManager.checkReplicaCorrupt(..)
if (reportedState == ReplicaState.RBW) {
  // If it's a RBW report for a COMPLETE block, it may just be that
  // the block report got a little bit delayed after the pipeline
  // closed. So, ignore this report, assuming we will get a
  // FINALIZED replica later. See HDFS-2791
  LOG.info("Received an RBW replica for " + storedBlock +
      " on " + dn + ": ignoring it, since it is " +
      "complete with the same genstamp");
  return null;
} else ...
{code}
In conclusion, the NN could safely ignore the incremental block report when
storageInfo.numBlocks() == 0. What do you think?
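
To make the proposal concrete, here is a minimal, self-contained sketch of the check. It is not the actual BlockManager code: the class IncrementalReportSketch, its nested StorageInfo, and the methods processIncrementalReport() and processFullReport() are hypothetical stand-ins, and blocks are reduced to plain IDs. Only the numBlocks() == 0 condition comes from the comment above.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class IncrementalReportSketch {

    /** Hypothetical stand-in for the NN's per-storage state (not the real HDFS class). */
    static final class StorageInfo {
        private final Set<Long> blockIds = new HashSet<>();

        int numBlocks() {
            return blockIds.size();
        }

        void addBlock(long blockId) {
            blockIds.add(blockId);
        }
    }

    /**
     * Applies an incremental report unless the storage has no blocks yet.
     * Returns true if the report was applied, false if it was skipped.
     */
    static boolean processIncrementalReport(StorageInfo storage, List<Long> receivedBlockIds) {
        if (storage.numBlocks() == 0) {
            // Proposed shortcut: nothing is known about this storage yet, so the
            // upcoming full block report is expected to carry these blocks anyway.
            return false;
        }
        for (long id : receivedBlockIds) {
            storage.addBlock(id);
        }
        return true;
    }

    /** Simplified full block report: records every reported block ID. */
    static void processFullReport(StorageInfo storage, List<Long> blockIds) {
        for (long id : blockIds) {
            storage.addBlock(id);
        }
    }

    public static void main(String[] args) {
        StorageInfo storage = new StorageInfo();

        // Incremental report arrives before any full report: skipped.
        boolean appliedEarly = processIncrementalReport(storage, Arrays.asList(1L, 2L));
        System.out.println("applied before full report: " + appliedEarly);

        // The full report populates the storage; later incrementals are applied.
        processFullReport(storage, Arrays.asList(1L, 2L, 3L));
        boolean appliedLater = processIncrementalReport(storage, Arrays.asList(4L));
        System.out.println("applied after full report: " + appliedLater);
        System.out.println("blocks known: " + storage.numBlocks());
    }
}
```

The sketch uses numBlocks() == 0 only as the proxy suggested above for "no full block report processed yet"; whether that proxy is sufficient in every case is part of the question.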
> Incremental BlockReport will dramatically slow down the startup of a namenode
> ------------------------------------------------------------------------------
>
> Key: HDFS-7980
> URL: https://issues.apache.org/jira/browse/HDFS-7980
> Project: Hadoop HDFS
> Issue Type: Bug
> Reporter: Hui Zheng
> Assignee: Walter Su
> Attachments: HDFS-7980.001.patch
>
>
> In the current implementation the datanode calls the
> reportReceivedDeletedBlocks() method, which sends an IncrementalBlockReport,
> before calling the bpNamenode.blockReport() method. So in a large (several
> thousand datanodes) and busy cluster it will slow down (more than one hour)
> the startup of the namenode.
> {code}
> List<DatanodeCommand> blockReport() throws IOException {
>   // send block report if timer has expired.
>   final long startTime = now();
>   if (startTime - lastBlockReport <= dnConf.blockReportInterval) {
>     return null;
>   }
>   final ArrayList<DatanodeCommand> cmds = new ArrayList<DatanodeCommand>();
>   // Flush any block information that precedes the block report. Otherwise
>   // we have a chance that we will miss the delHint information
>   // or we will report an RBW replica after the BlockReport already reports
>   // a FINALIZED one.
>   reportReceivedDeletedBlocks();
>   lastDeletedReport = startTime;
>   .........
>   // Send the reports to the NN.
>   int numReportsSent = 0;
>   int numRPCs = 0;
>   boolean success = false;
>   long brSendStartTime = now();
>   try {
>     if (totalBlockCount < dnConf.blockReportSplitThreshold) {
>       // Below split threshold, send all reports in a single message.
>       DatanodeCommand cmd = bpNamenode.blockReport(
>           bpRegistration, bpos.getBlockPoolId(), reports);
> {code}