ChenFolin created HDFS-10214:
--------------------------------

             Summary: Checkpoint cannot be done by the Standby NameNode, because a checkpoint may cause DataNode blockReport, blockReceivedAndDeleted, and heartbeat RPC timeouts when the object count is > 100000000.
                 Key: HDFS-10214
                 URL: https://issues.apache.org/jira/browse/HDFS-10214
             Project: Hadoop HDFS
          Issue Type: New Feature
          Components: ha, namenode
    Affects Versions: 2.6.4, 2.5.0
         Environment: 500 DataNodes.
137407265 files and directories, 129614074 blocks = 267021339 total filesystem object(s)
            Reporter: ChenFolin


Current cluster status: 137407265 files and directories, 129614074 blocks = 267021339 total filesystem object(s).

Saving the namespace during a checkpoint takes more than 5 minutes. Meanwhile, DataNode RPCs time out, and the Standby NameNode skips those DataNode RPC requests (because the DataNode RPC timed out, the DataNode closed the socket channel). There are many corrupt files after a failover.

So the checkpoint could be done by another component instead of the Standby NameNode.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)