[ https://issues.apache.org/jira/browse/HDFS-10214?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
ChenFolin resolved HDFS-10214.
------------------------------
       Resolution: Duplicate
    Fix Version/s: 2.7.2

> Checkpoint Can not be done by StandbyNameNode.Because checkpoint may cause DataNode blockReport.blockReceivedAndDeleted.heartbeat rpc timeout when the object num > 100000000.
> ------------------------------------------------------------------------------
>
>                 Key: HDFS-10214
>                 URL: https://issues.apache.org/jira/browse/HDFS-10214
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: ha, namenode
>    Affects Versions: 2.5.0, 2.6.4
>         Environment: 500 DataNodes.
> 137407265 files and directories, 129614074 blocks = 267021339 total filesystem object(s)
>            Reporter: ChenFolin
>             Fix For: 2.7.2
>
>   Original Estimate: 672h
>  Remaining Estimate: 672h
>
> The current cluster status:
> 137407265 files and directories, 129614074 blocks = 267021339 total filesystem object(s).
> Saving the namespace during a checkpoint takes more than 5 minutes.
> DataNode RPCs time out.
> The Standby NameNode skips the DataNode RPC requests (because the DataNode RPC times out, the DataNode closes the socket channel).
> Many files are corrupt after a failover.
> So the checkpoint could be done by another component instead of the Standby NameNode.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)