[ 
https://issues.apache.org/jira/browse/HDFS-14576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16865718#comment-16865718
 ] 

He Xiaoqiao edited comment on HDFS-14576 at 6/17/19 3:56 PM:
-------------------------------------------------------------

Thanks [~jojochuang],[~dineshchitlangia] for your quick response.
{quote}I suppose this can still be an issue at 10k node scale.{quote}
Correct. In my experience, the problem becomes more serious as the cluster 
grows. I will submit a draft patch later, and discussion is welcome.
As for loading the fsimage, some solutions have already been proposed, for 
instance loading in parallel (HDFS-7784). I think it is time to reevaluate 
that solution. FYI.



> Avoid block report retry and slow down namenode startup
> -------------------------------------------------------
>
>                 Key: HDFS-14576
>                 URL: https://issues.apache.org/jira/browse/HDFS-14576
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: namenode
>            Reporter: He Xiaoqiao
>            Assignee: He Xiaoqiao
>            Priority: Major
>
> During namenode startup, the load is very high because the namenode has to 
> process every datanode's block report one by one. If hundreds of datanode 
> block reports are pending, the issue becomes more serious, even though 
> #processFirstBlockReport is processed much more efficiently than ordinary 
> block reports. Some datanodes will then retry their block reports, which 
> lengthens the restart time. I think we should filter out block report 
> requests (sent as datanode block report retries) that have already been 
> processed and return directly, which would shorten the restart time. Note 
> that the benefit of this proposal may be noticeable only on large clusters.
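
To make the proposal above concrete, here is a minimal sketch of the filtering 
idea. It is not the HDFS-14576 patch and does not use the real namenode 
classes: the class BlockReportFilter, its methods, and the use of a string 
datanode ID plus a numeric report ID are all assumptions made for 
illustration. It only assumes that a retried block report carries the same 
report ID as the attempt it repeats.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/**
 * Hypothetical sketch only: not the actual NameNode code or the HDFS-14576 patch.
 * Remembers the last fully processed block report per datanode so that a retry
 * of the same report can be answered without processing it a second time.
 */
final class BlockReportFilter {
    // Assumed key: a datanode identifier. Assumed value: ID of the last processed report.
    private final Map<String, Long> lastProcessedReportId = new ConcurrentHashMap<>();

    /** Returns true if this exact report was already processed for this datanode. */
    boolean isRetry(String datanodeId, long reportId) {
        Long last = lastProcessedReportId.get(datanodeId);
        return last != null && last == reportId;
    }

    /**
     * Runs the (expensive) processing step unless the report is a retry.
     * Returns true if the report was processed, false if it was skipped.
     */
    boolean handle(String datanodeId, long reportId, Runnable process) {
        if (isRetry(datanodeId, reportId)) {
            return false; // already processed: return directly
        }
        process.run();
        // Record the report only after processing has completed without an exception.
        lastProcessedReportId.put(datanodeId, reportId);
        return true;
    }
}
```

In this sketch a report is recorded only after processing finishes, so a 
retry that arrives while the first attempt is still in progress is not 
filtered. Whether that case also needs handling is a design question that the 
sketch leaves open.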



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
