[
https://issues.apache.org/jira/browse/HDFS-14186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16739493#comment-16739493
]
Kihwal Lee commented on HDFS-14186:
-----------------------------------
I think the safe mode extension was intended for this. Instead of setting a
fixed amount of time, we could incorporate your idea into it.
Back when we were running HDFS clusters with 100-150M blocks, a 10-30 second
extension was enough to absorb the end of a start-up full block report stream.
I guess it was a quick and dirty way to do it.
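(Editor's note, not part of the original comment: the extension Kihwal refers to is the `dfs.namenode.safemode.extension` setting, which controls how long the namenode stays in safe mode after the block threshold is first reached. A minimal hdfs-site.xml fragment with the shipped default of 30 seconds:)

```xml
<!-- hdfs-site.xml: time (in ms) the namenode remains in safe mode after
     the reported-block threshold is reached. Default is 30000 (30s). -->
<property>
  <name>dfs.namenode.safemode.extension</name>
  <value>30000</value>
</property>
```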
> blockreport storm slow down namenode restart seriously in large cluster
> -----------------------------------------------------------------------
>
> Key: HDFS-14186
> URL: https://issues.apache.org/jira/browse/HDFS-14186
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: namenode
> Reporter: He Xiaoqiao
> Assignee: He Xiaoqiao
> Priority: Major
> Attachments: HDFS-14186.001.patch
>
>
> In the current implementation, a datanode sends a full block report
> immediately after registering with the namenode on restart, and the
> resulting blockreport storm puts the namenode under high load while
> processing them. One consequence is that some received RPCs have to be
> skipped because their queue time exceeds the timeout. If a datanode's
> heartbeat RPCs are continually skipped for long enough (the default
> heartbeatExpireInterval is 630s), the datanode is marked DEAD; it then has
> to re-register and send its blockreport again, which aggravates the
> blockreport storm and traps the cluster in a vicious circle, seriously
> slowing namenode startup (to more than an hour, or even longer), especially
> in a large (several thousand datanodes) and busy cluster. Although there
> has been much work to optimize namenode startup, the issue still exists.
> I propose to postpone the dead datanode check until the namenode has
> finished startup.
> Any comments and suggestions are welcome.
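(Editor's note: for readers unfamiliar with where the 630s figure in the description comes from, the sketch below shows how the namenode's heartbeat expiry interval is derived from the two relevant config defaults. The class name is illustrative; the config keys and default values follow Hadoop's documentation.)

```java
// Sketch: how the default heartbeatExpireInterval of 630s is derived.
// A datanode is declared DEAD when no heartbeat has been processed for
// 2 * recheck-interval + 10 * heartbeat-interval.
public class HeartbeatExpiry {
    public static void main(String[] args) {
        // dfs.namenode.heartbeat.recheck-interval, default 5 minutes
        long heartbeatRecheckIntervalMs = 5 * 60 * 1000;
        // dfs.heartbeat.interval, default 3 seconds
        long heartbeatIntervalSec = 3;

        long heartbeatExpireIntervalMs =
            2 * heartbeatRecheckIntervalMs + 10 * 1000 * heartbeatIntervalSec;

        System.out.println(heartbeatExpireIntervalMs / 1000 + "s"); // 630s
    }
}
```

This is why skipping heartbeat RPCs for ~10.5 minutes during a report storm is enough to mark a live datanode dead and trigger the re-registration loop described above.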
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)