[jira] [Comment Edited] (HDFS-14576) Avoid block report retry and slow down namenode startup

He Xiaoqiao (JIRA) Thu, 18 Jul 2019 21:34:45 -0700


    [ 
https://issues.apache.org/jira/browse/HDFS-14576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16888159#comment-16888159
 ]


He Xiaoqiao edited comment on HDFS-14576 at 7/19/19 4:33 AM:
-------------------------------------------------------------

Thanks [~zhangchen] for the further discussion.
I agree that lifeline and the block report lease are both good solutions, and 
they resolve part of the namenode startup issue very well, but not all of it.
One remaining issue is the safemode auto-leave mechanism: it is currently based 
on the number of blocks rather than the number of replicas, so it could run 
into unexpected exceptions. Another case: if the namenode leaves safemode 
automatically during startup and then switches to active state, it triggers 
block replication because blocks do not yet have the replicas they need. When 
the remaining replicas are reported later, the redundant replicas are removed 
again. This flood of data occupies bandwidth unnecessarily.
Thanks for mentioning HDFS-14657, it is a very interesting idea. I would like 
to watch it closely. Thanks again.
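
To make the safemode concern above concrete, here is a minimal sketch of the arithmetic. It is not NameNode code: the class and method names are hypothetical, and the values used (a minimum of 1 reported replica per block, an expected replication of 3) are assumptions chosen for illustration.

```java
// Hypothetical illustration only; not taken from the HDFS code base.
class SafeModeSketch {
    /** Fraction of blocks with at least minReplicas reported replicas. */
    static double safeBlockFraction(int[] reportedReplicas, int minReplicas) {
        if (reportedReplicas.length == 0) {
            return 1.0;
        }
        int safe = 0;
        for (int r : reportedReplicas) {
            if (r >= minReplicas) {
                safe++;
            }
        }
        return (double) safe / reportedReplicas.length;
    }

    /** Number of blocks whose reported replicas are below the expected count. */
    static int underReplicatedBlocks(int[] reportedReplicas, int expectedReplicas) {
        int count = 0;
        for (int r : reportedReplicas) {
            if (r < expectedReplicas) {
                count++;
            }
        }
        return count;
    }
}
```

With reported replica counts of {1, 1, 1, 3}, every block counts as safe when one replica is enough, so a block-based check could leave safemode, yet three of the four blocks would still look under-replicated against an expected replication of 3 and could be scheduled for re-replication before the remaining reports arrive.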


was (Author: hexiaoqiao):
Thanks [~zhangchen] for the further discussion.
I agree that lifeline and the block report lease are both good solutions, and 
they resolve part of the namenode startup issue very well, but not all of it.
One remaining issue is the safemode auto-leave mechanism: it is currently based 
on the number of blocks rather than the number of replicas, so it could run 
into unexpected exceptions. Another case: if the namenode leaves safemode 
automatically during startup and then switches to active state, it triggers 
block replication because blocks do not yet have the replicas they need. When 
the remaining replicas are reported later, the redundant replicas are removed 
again. This flood of data occupies bandwidth unnecessarily.
Thanks for mentioning HDFS-14757, it is a very interesting idea. I would like 
to watch it closely. Thanks again.

> Avoid block report retry and slow down namenode startup
> -------------------------------------------------------
>
>                 Key: HDFS-14576
>                 URL: https://issues.apache.org/jira/browse/HDFS-14576
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: namenode
>            Reporter: He Xiaoqiao
>            Assignee: He Xiaoqiao
>            Priority: Major
>
> During namenode startup, the load is very high since the namenode has to 
> process every datanode's block report one by one. If hundreds of datanode 
> block reports are pending, the issue becomes more serious, even though 
> #processFirstBlockReport is a lot more efficient than processing ordinary 
> block reports. Some datanodes will then retry their block reports, which 
> lengthens the restart time. I think we should filter out block report 
> requests (datanode block report retries) that have already been processed 
> and return directly, which shortens the restart time. I want to note that 
> the benefit of this proposal may be obvious only for large clusters.
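
The filtering idea in the description above can be sketched roughly as follows. This is only an illustration and not the patch for this issue: the class `BlockReportRetryFilter`, its methods, and the use of a per-report id to recognize a retry are all assumptions made for the example.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: remembers, per datanode, the id of the last block
// report that was fully processed, so a retry of that report can be skipped.
class BlockReportRetryFilter {
    // datanode UUID -> id of the last block report already processed
    private final Map<String, Long> lastProcessed = new ConcurrentHashMap<>();

    /** Returns false when this report id was already processed for the datanode. */
    boolean shouldProcess(String datanodeUuid, long reportId) {
        Long previous = lastProcessed.get(datanodeUuid);
        return previous == null || previous.longValue() != reportId;
    }

    /** Records that the given report was processed successfully. */
    void markProcessed(String datanodeUuid, long reportId) {
        lastProcessed.put(datanodeUuid, reportId);
    }
}
```

A caller would check `shouldProcess` before doing the expensive work and call `markProcessed` only after the report has been applied, so that a retry arriving after a timeout returns immediately instead of being processed a second time.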



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

