[ https://issues.apache.org/jira/browse/HDFS-15901 ]
Yanlei Yu deleted comment on HDFS-15901:
----------------------------------
was (Author: JIRAUSER294151):
This looks like a bug; we hit a similar error. After restarting the NameNode,
the NameNode log showed that some per-disk full block reports (FBRs) from a
DataNode were rejected with an invalid lease, because the NameNode treated
them as a second (non-initial) report.

In processReport, after processFirstBlockReport is called,
storageInfo.receivedBlockReport() runs and increments the counter
(blockReportCount++). However, processFirstBlockReport only puts the work on a
queue; the report has not actually been processed at that point. If an error
then occurs, the first report is likely to be mistaken for a second one: the
check storageInfo.getBlockReportCount() > 0 succeeds and
blockReportLeaseManager.removeLease(node) is called, so subsequent block
reports from that DataNode are rejected.
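
The sequence described above can be sketched with a small standalone model. This
is not the Hadoop source: the class BlockReportRaceSketch, its fields, and the
string DataNode id are hypothetical stand-ins, and the retry of the same disk's
report is an assumed trigger. It only illustrates how counting a report before
it is actually applied can cost the DataNode its lease.

```java
import java.util.ArrayDeque;
import java.util.HashSet;
import java.util.Queue;
import java.util.Set;

// Simplified, hypothetical model of the behaviour described in the comment.
class BlockReportRaceSketch {
    // Stand-in for the per-storage (per-disk) state kept on the NameNode.
    static class StorageInfo {
        int blockReportCount = 0;      // incremented when a report is "received"
        boolean blocksApplied = false; // set only when the queued work runs
    }

    final Set<String> validLeases = new HashSet<>();   // DataNodes holding a lease
    final Queue<Runnable> pending = new ArrayDeque<>(); // deferred report processing

    // Returns true if the report is accepted, false if it is rejected.
    boolean processReport(String dn, StorageInfo storage) {
        if (!validLeases.contains(dn)) {
            return false; // lease is not valid for this DataNode
        }
        if (storage.blockReportCount > 0) {
            // Treated as a non-initial report: the whole DataNode loses its lease.
            validLeases.remove(dn);
            return false;
        }
        // First report: the real work is only queued, not performed yet...
        pending.add(() -> storage.blocksApplied = true);
        // ...but the counter is bumped immediately.
        storage.blockReportCount++;
        return true;
    }

    public static void main(String[] args) {
        BlockReportRaceSketch nn = new BlockReportRaceSketch();
        nn.validLeases.add("dn1");
        StorageInfo diskA = new StorageInfo();
        StorageInfo diskB = new StorageInfo();

        System.out.println("diskA first report accepted: " + nn.processReport("dn1", diskA));
        // Assumed trigger: the same report for diskA arrives again before the queue drains.
        System.out.println("diskA retry accepted:        " + nn.processReport("dn1", diskA));
        // diskB has never reported, yet it is rejected because the lease is gone.
        System.out.println("diskB first report accepted: " + nn.processReport("dn1", diskB));
        System.out.println("diskA blocks applied:        " + diskA.blocksApplied);
    }
}
```

In this model the retry for diskA is rejected and removes the lease, so the
genuinely first report for diskB is rejected as well, while diskA's queued work
has still not run. Whether the real code path fails in exactly this way would
need to be confirmed against the BlockManager source.
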
> Solve the problem of DN repeated block reports occupying too many RPCs during
> Safemode
> --------------------------------------------------------------------------------------
>
> Key: HDFS-15901
> URL: https://issues.apache.org/jira/browse/HDFS-15901
> Project: Hadoop HDFS
> Issue Type: Improvement
> Reporter: JiangHua Zhu
> Assignee: JiangHua Zhu
> Priority: Major
> Labels: pull-request-available
> Time Spent: 1h
> Remaining Estimate: 0h
>
> In a cluster with thousands of nodes, when we restart the NameNode service,
> all DataNodes send a full block report to the NameNode. During Safe Mode,
> some DataNodes may send block reports to the NameNode multiple times, which
> consumes too many RPCs. This is unnecessary.
> In this case, some block report leases will fail or time out, and in extreme
> cases the NameNode will stay in Safe Mode indefinitely.
> 2021-03-14 08:16:25,873 [78438700] - INFO [Block report
> processor:BlockManager@2158] - BLOCK* processReport 0xexxxxxxxx: discarded
> non-initial block report from DatanodeRegistration(xxxxxxxx:port,
> datanodeUuid=xxxxxxxx, infoPort=xxxxxxxx, infoSecurePort=xxxxxxxx,
> ipcPort=xxxxxxxx, storageInfo=lv=xxxxxxxx;nsid=xxxxxxxx;c=0) because namenode
> still in startup phase
> 2021-03-14 08:16:31,521 [78444348] - INFO [Block report
> processor:BlockManager@2158] - BLOCK* processReport 0xexxxxxxxx: discarded
> non-initial block report from DatanodeRegistration(xxxxxxxx,
> datanodeUuid=xxxxxxxx, infoPort=xxxxxxxx, infoSecurePort=xxxxxxxx,
> ipcPort=xxxxxxxx, storageInfo=lv=xxxxxxxx;nsid=xxxxxxxx;c=0) because namenode
> still in startup phase
> 2021-03-13 18:35:38,200 [29191027] - WARN [Block report
> processor:BlockReportLeaseManager@311] - BR lease 0xxxxxxxxx is not valid for
> DN xxxxxxxx, because the DN is not in the pending set.
> 2021-03-13 18:36:08,143 [29220970] - WARN [Block report
> processor:BlockReportLeaseManager@311] - BR lease 0xxxxxxxxx is not valid for
> DN xxxxxxxx, because the DN is not in the pending set.
> 2021-03-13 18:36:08,143 [29220970] - WARN [Block report
> processor:BlockReportLeaseManager@317] - BR lease 0xxxxxxxxx is not valid for
> DN xxxxxxxx, because the lease has expired.
> 2021-03-13 18:36:08,145 [29220972] - WARN [Block report
> processor:BlockReportLeaseManager@317] - BR lease 0xxxxxxxxx is not valid for
> DN xxxxxxxx, because the lease has expired.