[ https://issues.apache.org/jira/browse/HDFS-10214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15259655#comment-15259655 ]
Brahma Reddy Battula commented on HDFS-10214:
---------------------------------------------
Linked to the duplicate JIRA (HDFS-7097).
> Checkpoint Can not be done by StandbyNameNode.Because checkpoint may cause
> DataNode blockReport.blockReceivedAndDeleted.heartbeat rpc timeout when the
> object num > 100000000.
> ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
> Key: HDFS-10214
> URL: https://issues.apache.org/jira/browse/HDFS-10214
> Project: Hadoop HDFS
> Issue Type: New Feature
> Components: ha, namenode
> Affects Versions: 2.5.0, 2.6.4
> Environment: 500 DataNode.
> 137407265 files and directories, 129614074 blocks = 267021339 total
> filesystem object(s)
> Reporter: ChenFolin
> Fix For: 2.7.2
>
> Original Estimate: 672h
> Remaining Estimate: 672h
>
> Current cluster status:
> 137407265 files and directories, 129614074 blocks = 267021339 total
> filesystem object(s).
> Saving the namespace during a checkpoint takes more than 5 minutes.
> DataNode RPCs time out.
> The Standby NameNode skips the DataNode RPC requests (because the DataNode
> RPC timed out, the DataNode closed the socket channel).
> There are many corrupt files after failover.
> So the checkpoint could be done by another component instead of the
> Standby NameNode.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)