Steve Vaughan created HDFS-16690:
------------------------------------
Summary: Automatically format new unformatted JournalNodes using
JournalNodeSyncer
Key: HDFS-16690
URL: https://issues.apache.org/jira/browse/HDFS-16690
Project: Hadoop HDFS
Issue Type: Improvement
Components: journal-node
Environment: Demonstrated in a Kubernetes environment running Java 11.
# Start new cluster, but short 1 JN (minimum quorum, and the missing JN won’t
resolve). VERIFY:
- NN formats the 2 existing JN and stabilizes. NOTE: Formatting using just a
quorum will be a separate submission
- Messages show sync between JN-0 and JN-1, and NN -> JN.
# Scale JN stateful set to add missing JN. VERIFY:
- New JN starts
- All other JNs and all NNs report the IP address change (IP address
resolution). NOTE: requires HADOOP-18365 and HDFS-16688
- Messages show sync between all JN, and NN -> JN
- New JN is formatted at least once (possibly by multiple other JN)
- New JN storage directory is formatted only once
- New JN joins cluster (lastWriterEpoch is non-zero)
Reporter: Steve Vaughan
If an unformatted JournalNode is added to an existing JournalNode set,
instances of the JournalNodeSyncer are unable to sync with the new node. When
a sync attempt receives a JournalNotFormattedException, the syncer can
initiate a format operation on that node and then retry the synchronization.
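The format-then-retry flow described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the proposed patch: `PeerJournal`, `listEditLogSegments`, `format`, `syncWithFormatRetry`, and `FakePeer` are hypothetical names, and the exception class is a local stand-in for Hadoop's JournalNotFormattedException so that the example compiles on its own.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

public class SyncFormatSketch {
    /** Local stand-in for Hadoop's JournalNotFormattedException. */
    static class JournalNotFormattedException extends IOException {
        JournalNotFormattedException(String msg) { super(msg); }
    }

    /** Hypothetical view of a peer JournalNode; NOT the real InterQJournalProtocol. */
    interface PeerJournal {
        List<String> listEditLogSegments(String journalId) throws IOException;
        void format(String journalId) throws IOException;
    }

    /** Try to sync; if the peer reports it is unformatted, format it and retry once. */
    static List<String> syncWithFormatRetry(PeerJournal peer, String journalId)
            throws IOException {
        try {
            return peer.listEditLogSegments(journalId);
        } catch (JournalNotFormattedException e) {
            peer.format(journalId);                      // initiate the format
            return peer.listEditLogSegments(journalId);  // retry the sync
        }
    }

    /** In-memory fake peer that starts out unformatted (for illustration only). */
    static class FakePeer implements PeerJournal {
        boolean formatted = false;
        int formatCalls = 0;

        @Override
        public List<String> listEditLogSegments(String journalId) throws IOException {
            if (!formatted) {
                throw new JournalNotFormattedException(
                        "Journal " + journalId + " is not formatted");
            }
            return new ArrayList<>();  // a freshly formatted journal has no segments
        }

        @Override
        public void format(String journalId) {
            formatted = true;
            formatCalls++;
        }
    }

    public static void main(String[] args) throws IOException {
        FakePeer newJournalNode = new FakePeer();
        List<String> segments = syncWithFormatRetry(newJournalNode, "example-journal");
        System.out.println("formatCalls=" + newJournalNode.formatCalls
                + " segments=" + segments);
    }
}
```

The sketch retries only once after formatting; a second failure propagates to the caller, which in a periodic syncer would presumably just try again on the next cycle.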
Conceptually this means that the JournalNodes and their data can be managed
independently from the rest of the system, as the JournalNodes will incorporate
new JournalNode instances. Once the new JournalNode is formatted, it can
participate in shared edits from the NameNodes.
I've been testing an update to the InterQJournalProtocol that adds a format
call like the one used by the NameNode. Current tests start an HA cluster from
scratch with only 2 JournalNode instances. Once the cluster is up, I add the
3rd JournalNode (which is unformatted); the other 2 JournalNodes eventually
attempt to sync with it, which triggers a format followed by a sync.
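Because both existing JournalNodes may request a format of the new node, while the verification steps expect its storage directory to be formatted only once, the receiving side presumably needs some kind of guard. A minimal sketch of one way to do that is shown here; `ReceivingJournal`, `formatIfNeeded`, and `storageFormatCount` are hypothetical names, and the real change may enforce this differently.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class FormatOnceSketch {
    /** Hypothetical receiving-side journal; NOT the real JournalNode code. */
    static class ReceivingJournal {
        private boolean formatted = false;
        private final AtomicInteger storageFormats = new AtomicInteger();

        /**
         * Formats the storage only if it has not been formatted yet.
         * Returns true if this call performed the format, false otherwise.
         */
        synchronized boolean formatIfNeeded() {
            if (formatted) {
                return false;  // a previous request already formatted the storage
            }
            storageFormats.incrementAndGet();  // stands in for the real on-disk format
            formatted = true;
            return true;
        }

        int storageFormatCount() {
            return storageFormats.get();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        ReceivingJournal newJournalNode = new ReceivingJournal();

        // Two existing JournalNodes race to format the new one.
        Thread first = new Thread(newJournalNode::formatIfNeeded, "JN-0-syncer");
        Thread second = new Thread(newJournalNode::formatIfNeeded, "JN-1-syncer");
        first.start();
        second.start();
        first.join();
        second.join();

        System.out.println("storage formats: " + newJournalNode.storageFormatCount());
    }
}
```

Making the check and the format a single synchronized step means that a second, concurrent request becomes a no-op rather than wiping a directory that the first request has just initialized.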