Karthik Palanisamy created HDFS-16950:
-----------------------------------------
Summary: Gap in edits after -initializeSharedEdits
Key: HDFS-16950
URL: https://issues.apache.org/jira/browse/HDFS-16950
Project: Hadoop HDFS
Issue Type: Bug
Components: journal-node, namenode
Reporter: Karthik Palanisamy
The NameNode failed to start in a production cluster after the JournalNode (JN) role was migrated.
{code:java}
2023-03-15 00:27:11,173 ERROR org.apache.hadoop.hdfs.server.namenode.NameNode:
Failed to start namenode.
java.io.IOException: There appears to be a gap in the edit log. We expected
txid xxxxxx, but got txid xxxxxx. {code}
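-initializeSharedEdits was issued as part of the role migration step. Note that
no checkpoint had been performed in the preceding few hours.
-initializeSharedEdits created a new log segment starting at the
edits_inprogress transaction and deleted all older transactions. Because the
latest fsimage predates that segment, the NameNode hit the gap shown above.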
My ask is to delete only the edit transactions older than the latest fsimage
transaction. Currently, all transactions are deleted and no such check is
enforced in JNStorage#format().
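
The requested check could look roughly like the sketch below. It is only an illustration of the retention rule, not the JNStorage code: the Segment class, the purgeable method, and the use of Long.MAX_VALUE to mark an in-progress segment are all hypothetical. It assumes that the fsimage covers every transaction up to and including its own txid, so a segment is safe to delete only if its last txid is not greater than the fsimage txid.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class EditSegmentRetention {

    /** Hypothetical model of one edit log segment; not a Hadoop class. */
    public static final class Segment {
        public final long firstTxId;
        // Long.MAX_VALUE stands in for an in-progress segment (assumption).
        public final long lastTxId;

        public Segment(long firstTxId, long lastTxId) {
            this.firstTxId = firstTxId;
            this.lastTxId = lastTxId;
        }
    }

    /**
     * Returns the segments that are fully covered by the fsimage and can
     * therefore be deleted without creating a gap on NameNode startup.
     */
    public static List<Segment> purgeable(List<Segment> segments, long fsimageTxId) {
        List<Segment> result = new ArrayList<>();
        for (Segment s : segments) {
            // Keep any segment holding a transaction newer than the fsimage.
            if (s.lastTxId <= fsimageTxId) {
                result.add(s);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Segment> segments = Arrays.asList(
                new Segment(1, 100),
                new Segment(101, 200),
                new Segment(201, Long.MAX_VALUE)); // in-progress segment
        long fsimageTxId = 150; // last checkpoint, taken mid-way through segment 101-200

        for (Segment s : purgeable(segments, fsimageTxId)) {
            System.out.println("safe to delete: " + s.firstTxId + "-" + s.lastTxId);
        }
    }
}
```

With a checkpoint at txid 150, only the first segment is selected. The segment 101-200 is kept because it still holds transactions 151-200, which the NameNode needs to replay on top of the fsimage.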