[ https://issues.apache.org/jira/browse/HDFS-4238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13508495#comment-13508495 ]
Aaron T. Myers commented on HDFS-4238:
--------------------------------------
Aha, yep, got it. Thanks a lot for the explanation. I agree with your analysis.
> [HA] Standby namenode should not do purging of shared storage edits.
> --------------------------------------------------------------------
>
> Key: HDFS-4238
> URL: https://issues.apache.org/jira/browse/HDFS-4238
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: ha
> Affects Versions: 3.0.0, 2.0.2-alpha
> Reporter: Vinay
>
> This happened in our cluster:
> >> The Standby NN kept checkpointing every hour, but uploading the checkpoint
> >> to the Active NN failed continuously due to a Kerberos issue. Nobody
> >> noticed, since the Active NN was serving requests properly.
> >> The Active NN had been up for a long time with an fsimage at a very old transaction.
> >> The Standby NN saved the checkpoint in its name dir and purged the txns >
> >> 1000000 from shared storage (including edits that are not present in the
> >> Active NN's fsimage).
> >> After some time, the Active NN was restarted and the Standby NN switched to Active.
> Now the current Standby is not able to load any edits from shared storage, because
> the edits it expects are no longer present there. It just keeps running idle.
> So {{editLog.purgeLogsOlderThan(purgeLogsFrom);}} should always be called
> from the Active NameNode.
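
The guard proposed in the description could be sketched roughly as follows. This is only an illustration under stated assumptions: PurgeGuardSketch, HaState, SharedEdits and purgeIfActive are hypothetical names invented for the example and are not taken from the Hadoop source; only the method name purgeLogsOlderThan comes from the description above. Shared storage is modelled as a plain in-memory set of transaction IDs.

```java
import java.util.TreeSet;

// Hypothetical sketch; none of these types exist in Hadoop under these names.
class PurgeGuardSketch {
    enum HaState { ACTIVE, STANDBY }

    // Stand-in for the shared edits directory: the txids still stored there.
    static final class SharedEdits {
        private final TreeSet<Long> txIds = new TreeSet<>();

        void append(long txId) {
            txIds.add(txId);
        }

        // Drops every transaction whose ID is strictly below minTxIdToKeep.
        void purgeLogsOlderThan(long minTxIdToKeep) {
            txIds.headSet(minTxIdToKeep).clear();
        }

        boolean contains(long txId) {
            return txIds.contains(txId);
        }
    }

    // Purges shared storage only when the caller is the Active node.
    // Returns true if a purge was actually performed.
    static boolean purgeIfActive(HaState state, SharedEdits shared, long minTxIdToKeep) {
        if (state != HaState.ACTIVE) {
            return false; // a Standby leaves the shared edits untouched
        }
        shared.purgeLogsOlderThan(minTxIdToKeep);
        return true;
    }

    public static void main(String[] args) {
        SharedEdits shared = new SharedEdits();
        for (long txId = 1; txId <= 10; txId++) {
            shared.append(txId);
        }
        // The Standby has checkpointed up to txid 8, but the Active's fsimage
        // is assumed to be far behind, so txids below 8 are still needed.
        boolean purged = purgeIfActive(HaState.STANDBY, shared, 8);
        System.out.println("purged=" + purged + ", txid 3 present=" + shared.contains(3));
    }
}
```

With a guard of this shape, a Standby that checkpoints locally but cannot upload its image never removes edits that the Active's older fsimage still depends on.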