Vinay created HDFS-4238:
---------------------------

             Summary: [HA] Standby namenode should not do purging of shared 
storage edits.
                 Key: HDFS-4238
                 URL: https://issues.apache.org/jira/browse/HDFS-4238
             Project: Hadoop HDFS
          Issue Type: Bug
          Components: ha
    Affects Versions: 2.0.2-alpha, 3.0.0
            Reporter: Vinay


This happened in our cluster:

>> The Standby NN kept doing a checkpoint every hour, but uploading it to the 
>> Active NN failed continuously due to a Kerberos issue. Nobody noticed this, 
>> since the Active NN was serving requests properly.

>> The Active NN had been up for a long time, with an fsimage covering very 
>> few transactions.

>> The Standby NN saved the checkpoint in its name dir and purged the txns > 
>> 1000000 from shared storage (including edits which are not present in the 
>> Active NN's fsimage).

>> After some time the Active NN was restarted and the Standby NN switched to 
>> Active.

Now the current Standby is not able to load any edits from shared storage, 
because the edits it expects are no longer present there. It just keeps 
running idle.


So {{editLog.purgeLogsOlderThan(purgeLogsFrom);}} should only ever be called 
from the Active NameNode.
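
A minimal sketch of the proposed guard follows. It is not the actual HDFS code: the class PurgeGuardSketch, the HAState enum, the SharedEdits stand-in and the purgeIfActive helper are all hypothetical names used for illustration, and only the purgeLogsOlderThan(purgeLogsFrom) call shape is taken from this report. The idea is simply that a node in standby state skips the purge of shared edits.

```java
import java.util.TreeSet;

// Hypothetical, self-contained model of the guard; not real HDFS classes.
class PurgeGuardSketch {

    // Assumed two-state HA model for illustration.
    enum HAState { ACTIVE, STANDBY }

    // Stand-in for the shared edits storage: holds the first txid of each segment.
    static class SharedEdits {
        final TreeSet<Long> segments = new TreeSet<>();

        // Drops every segment whose first txid is below minTxIdToKeep.
        void purgeLogsOlderThan(long minTxIdToKeep) {
            segments.headSet(minTxIdToKeep).clear();
        }
    }

    // Purges shared edits only when this node is Active; returns whether it purged.
    static boolean purgeIfActive(HAState state, SharedEdits edits, long purgeLogsFrom) {
        if (state != HAState.ACTIVE) {
            return false; // a Standby must leave shared storage untouched
        }
        edits.purgeLogsOlderThan(purgeLogsFrom);
        return true;
    }
}
```

With a guard of this kind, a Standby whose checkpoint uploads are failing can still purge its own local name dir, while the shared edits stay available for whichever node later needs to replay them.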

