[ https://issues.apache.org/jira/browse/HDFS-15562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Aihua Xu updated HDFS-15562:
----------------------------
    Status: Patch Available  (was: Open)

patch-1: The Standby NameNode performs the checkpoint and uploads the image to the Active and Observer NameNodes. Currently, if any remote NameNode is down and an upload fails, the Standby NameNode immediately performs the checkpoint again and retries the upload. With multiple Observer NameNodes, not all Observers need to be running. The patch throws an exception when the checkpoint itself fails, but not on upload failures.

> StandbyCheckpointer will do checkpoint repeatedly while connecting observer/active namenode failed
> --------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-15562
>                 URL: https://issues.apache.org/jira/browse/HDFS-15562
>             Project: Hadoop HDFS
>          Issue Type: Bug
>            Reporter: SunHao
>            Assignee: Aihua Xu
>            Priority: Major
>         Attachments: HDFS-15562.patch
>
>
> We find that the standby NameNode performs checkpoints over and over when connecting to the observer/active NameNode fails.
> StandbyCheckpointer does not update “lastCheckpointTime” when uploading the new fsimage to the other NameNode fails, so the standby NameNode keeps checkpointing repeatedly.
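
The behavior described in the issue can be illustrated with a small, self-contained Java sketch. This is not the Hadoop StandbyCheckpointer code: the class CheckpointSketch, the Checkpointer type, the tick and simulate methods, the tolerateUploadFailures flag, the target names, and the timing values are all hypothetical. The sketch only models the bookkeeping of lastCheckpointTime as the report describes it, assuming that a checkpoint is due whenever the configured period has elapsed since lastCheckpointTime.

```java
import java.util.List;
import java.util.function.Predicate;

public class CheckpointSketch {

    /** Hypothetical stand-in for the standby's checkpoint bookkeeping. */
    static final class Checkpointer {
        private final boolean tolerateUploadFailures;
        private long lastCheckpointTime = 0;
        private int checkpointsDone = 0;

        Checkpointer(boolean tolerateUploadFailures) {
            this.tolerateUploadFailures = tolerateUploadFailures;
        }

        /** One pass of the checkpointer loop at time 'now' (milliseconds). */
        void tick(long now, long periodMs, List<String> targets,
                  Predicate<String> upload) {
            if (now - lastCheckpointTime < periodMs) {
                return; // checkpoint not due yet
            }
            checkpointsDone++; // the local checkpoint itself succeeds

            // Try every target; one failure must not skip the others.
            boolean allUploaded = true;
            for (String target : targets) {
                if (!upload.test(target)) {
                    allUploaded = false;
                }
            }

            // Reported behavior: the timestamp advances only if every upload
            // succeeded. Assumed patched behavior: it advances regardless.
            if (allUploaded || tolerateUploadFailures) {
                lastCheckpointTime = now;
            }
        }
    }

    /** Counts checkpoints over ten ticks while one Observer is down. */
    public static int simulate(boolean tolerateUploadFailures) {
        Checkpointer checkpointer = new Checkpointer(tolerateUploadFailures);
        List<String> targets = List.of("active", "observer-1");
        Predicate<String> upload = t -> !t.equals("observer-1"); // observer-1 is down
        for (long now = 100; now <= 1000; now += 100) {
            checkpointer.tick(now, 500, targets, upload);
        }
        return checkpointer.checkpointsDone;
    }

    public static void main(String[] args) {
        System.out.println("reported behavior: " + simulate(false) + " checkpoints");
        System.out.println("patched behavior:  " + simulate(true) + " checkpoints");
    }
}
```

With a period of 500 ms and one Observer unreachable, the first mode checkpoints on every tick once the period has elapsed (6 checkpoints in this run), while the second returns to one checkpoint per period (2 checkpoints).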