He Xiaoqiao created HDFS-14559:
----------------------------------
Summary: Optimizing safemode leave mechanism
Key: HDFS-14559
URL: https://issues.apache.org/jira/browse/HDFS-14559
Project: Hadoop HDFS
Issue Type: Sub-task
Components: namenode
Reporter: He Xiaoqiao
Assignee: He Xiaoqiao
As mentioned in HDFS-14186, in the last stage of namenode startup the namenode
leaves safemode once the number of reported blocks reaches the threshold.
However, the current condition is based entirely on total blocks rather than
total replicas. So for a large cluster, after all blocks have been reported by
the datanodes, a large number of block replicas are still pending report and
the namenode load stays high for a long time. In some extreme cases, between
the time the namenode leaves safemode and the time it finishes processing
block reports, it cannot provide normal service, and some datanodes may be
marked dead and then register and send block reports again and again.
In short, we need to improve the safemode leave mechanism so that large
clusters restart smoothly.
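
To illustrate the gap described above, here is a minimal sketch that compares
a block-based leave condition with a replica-based one. It is not the namenode
code and not the proposed patch: the class name, method names, and counters
are hypothetical, and the replica-based check is only one possible reading of
"based on total replicas".

```java
// Hypothetical sketch only; none of these names come from the HDFS code base.
final class SafeModeLeaveSketch {

    // Block-based condition: enough distinct blocks have been reported at
    // least once, regardless of how many replicas of each are still pending.
    static boolean blockThresholdReached(long reportedBlocks,
                                         long totalBlocks,
                                         double threshold) {
        long required = (long) (totalBlocks * threshold);
        return reportedBlocks >= required;
    }

    // Replica-based condition (assumed alternative): enough of the expected
    // replicas have been reported, so most block report work is already done.
    static boolean replicaThresholdReached(long reportedReplicas,
                                           long expectedReplicas,
                                           double threshold) {
        long required = (long) (expectedReplicas * threshold);
        return reportedReplicas >= required;
    }

    public static void main(String[] args) {
        long totalBlocks = 1_000_000L;
        int replication = 3;                      // assumed replication factor
        long expectedReplicas = totalBlocks * replication;
        double threshold = 0.999;                 // assumed threshold fraction

        // Assumed state: almost every block has been seen once, but well
        // under half of all replicas have been reported so far.
        long reportedBlocks = 999_500L;
        long reportedReplicas = 1_200_000L;

        System.out.println("block-based leave:   "
            + blockThresholdReached(reportedBlocks, totalBlocks, threshold));
        System.out.println("replica-based leave: "
            + replicaThresholdReached(reportedReplicas, expectedReplicas,
                                      threshold));
    }
}
```

With these assumed numbers the block-based check already allows leaving
safemode while most replicas are still unreported, which is the window in
which the namenode stays heavily loaded.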
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)