Todd Lipcon created HDFS-3967:
---------------------------------
Summary: NN should bail out earlier when logs to load have a gap
Key: HDFS-3967
URL: https://issues.apache.org/jira/browse/HDFS-3967
Project: Hadoop HDFS
Issue Type: Improvement
Components: name-node
Affects Versions: 2.0.1-alpha, 3.0.0
Reporter: Todd Lipcon
Priority: Minor
I was testing an HA setup with a lowered edit log retention period, and ended
up in a state where one of the two NNs had fallen too far behind, such that it
couldn't start up again (due to the too-low retention period). When I started
the NN, I got the following:
12/09/21 13:03:20 INFO namenode.FSImage: Loaded image for txid 45781083 from
/tmp/name1-name/current/fsimage_0000000000045781083
12/09/21 13:03:20 INFO namenode.FSImage: Reading
org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream@239a0feb
expecting start txid #45781084
12/09/21 13:03:20 INFO namenode.EditLogInputStream: Fast-forwarding stream
'http://localhost:13081/getJournal?jid=myjournal&segmentTxId=45928954&storageInfo=-40%3A292785232%3A0%3ACID-0553884b-f3ea-46a3-9154-200d4f84304b,
http://localhost:13082/getJournal?jid=myjournal&segmentTxId=45928954&storageInfo=-40%3A292785232%3A0%3ACID-0553884b-f3ea-46a3-9154-200d4f84304b,
http://localhost:13083/getJournal?jid=myjournal&segmentTxId=45928954&storageInfo=-40%3A292785232%3A0%3ACID-0553884b-f3ea-46a3-9154-200d4f84304b'
to transaction ID 45781084
12/09/21 13:03:20 INFO namenode.EditLogInputStream: Fast-forwarding stream
'http://localhost:13081/getJournal?jid=myjournal&segmentTxId=45928954&storageInfo=-40%3A292785232%3A0%3ACID-0553884b-f3ea-46a3-9154-200d4f84304b'
to transaction ID 45781084
12/09/21 13:03:20 FATAL namenode.NameNode: Exception in namenode join
java.io.IOException: There appears to be a gap in the edit log. We expected
txid 45781084, but got txid 45928954.
Rather than trying to 'fast forward' the stream to a transaction which is
actually prior to the stream's first txid, the NN should bail out earlier with
a nicer error.
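
As a rough illustration of the kind of early check being suggested, here is a minimal standalone sketch. It is not Hadoop code: the class EditLogGapCheck and the method checkNoLeadingGap are hypothetical names made up for this example, and it assumes the NN already knows the starting txid of each available segment before it opens any stream. The only idea it shows is comparing the expected start txid against the earliest available segment and failing before any fast-forward is attempted.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Hypothetical sketch; none of these names exist in Hadoop. */
public class EditLogGapCheck {

    /**
     * Fails fast if the earliest available segment starts after the txid
     * we need to resume from, i.e. the logs to load begin with a gap.
     *
     * @param expectedStartTxId first txid needed (last image txid + 1)
     * @param segmentStartTxIds starting txids of the available segments
     */
    public static void checkNoLeadingGap(long expectedStartTxId,
                                         List<Long> segmentStartTxIds)
            throws IOException {
        if (segmentStartTxIds.isEmpty()) {
            // No segments to load, so there is nothing to compare against.
            return;
        }
        long firstAvailable = Collections.min(segmentStartTxIds);
        if (firstAvailable > expectedStartTxId) {
            throw new IOException(String.format(
                "Gap in edit log: expected to start at txid %d, but the "
                + "earliest available segment starts at txid %d",
                expectedStartTxId, firstAvailable));
        }
    }

    public static void main(String[] args) {
        // Values taken from the log output in this report.
        try {
            checkNoLeadingGap(45781084L, Arrays.asList(45928954L));
            System.out.println("No leading gap detected");
        } catch (IOException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

With the txids from the log above, the check throws before any stream is opened, so the operator sees the gap directly instead of a misleading fast-forward message followed by a FATAL.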