xinglin commented on code in PR #6183:
URL: https://github.com/apache/hadoop/pull/6183#discussion_r1392945651
##########
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/qjournal/client/QuorumJournalManager.java:
##########
@@ -667,6 +700,9 @@ AsyncLoggerSet getLoggerSetForTests() {
@Override
public void doPreUpgrade() throws IOException {
+ if (isEnableJnMaintenance()) {
+ throw new IOException("doPreUpgrade() does not support enabling jn
maintenance mode");
Review Comment:
nit: doPreUpgrade() does not support enabling jn maintenance mode ->
doPreUpgrade() is not support while in jn maintenance mode
##########
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/qjournal/client/QuorumJournalManager.java:
##########
@@ -684,6 +720,9 @@ public void doPreUpgrade() throws IOException {
@Override
public void doUpgrade(Storage storage) throws IOException {
+ if (isEnableJnMaintenance()) {
+ throw new IOException("doUpgrade() does not support enabling jn
maintenance mode");
Review Comment:
same here.
##########
hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml:
##########
@@ -6333,6 +6333,20 @@
</description>
</property>
+ <property>
+ <name>dfs.journalnode.maintenance.nodes</name>
+ <value></value>
+ <description>
+ In the case of one out of three journal nodes being down, theoretically
the service can still
+ continue. However, in reality, the downed node may not recover quickly.
If the Namenode needs
+ to be restarted, it will try the downed journal node through the lengthy
RPC retry mechanism,
+ resulting in a long initialization time for the Namenode to provide
services. By adding the
+ downed journal node to the maintenance nodes, the initialization time of
the Namenode in such
+ scenarios can be accelerated.
Review Comment:
nit
->
In the case that one out of three journal nodes being down, theoretically
HDFS can still
function. However, in reality, the unavailable journal node may not
recover quickly. During this period, when we need to restart an Namenode, the
Namenode will try to connect to the unavailable journal node through the
lengthy RPC retry mechanism,
resulting in a long initialization time for the Namenode. By adding
these
unavailable journal nodes to the maintenance nodes, we will skip these
unavailable journal nodes during Namenode initialization and thus reduce
namenode startup time.
1-node example values: <>
2-node example values: <>
##########
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSUtil.java:
##########
@@ -1137,4 +1139,24 @@ public void testAddTransferRateMetricForInvalidValue() {
DFSUtil.addTransferRateMetric(mockMetrics, 100, 0);
verify(mockMetrics, times(0)).addReadTransferRate(anyLong());
}
+
+ @Test
+ public void testGetHostSet() {
+ String[] testAddrs = new String[] {NS1_NN_ADDR, NS1_NN1_ADDR};
Review Comment:
this test case is a bit confusing. Can we just use
"unreachable-host1.com:9000" instead?
##########
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/qjournal/client/TestQuorumJournalManager.java:
##########
@@ -1171,6 +1176,21 @@ public void testSelectViaRpcAfterJNRestart() throws
Exception {
}
}
+ /**
+ * Tests to throw an exception if the jn maintenance nodes exceeds half of
the journalnode number.
+ */
+ @Test
+ public void testJNMaintenanceListViaRpcTwoJNsError() throws Exception {
+ StringJoiner maintenanceListBuff = new StringJoiner(",");
+ for (int i = 0; i < 2; i++) {
+ maintenanceListBuff.add(
+
NetUtils.getHostPortString(cluster.getJournalNode(i).getBoundIpcAddress()));
+ }
+ this.conf.set(DFS_JOURNALNODE_MAINTENANCE_NODES_KEY,
maintenanceListBuff.toString());
+ assertThrows(IllegalArgumentException.class, () -> createSpyingQJM());
+ }
+
+
Review Comment:
I assume we also need a test case to to verify maintenance journal nodes are
indeed excluded? total JN=3, with 1 is excluded. after we initialize, verify
only 2 JNs are included.
##########
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/qjournal/client/QuorumJournalManager.java:
##########
@@ -701,6 +740,9 @@ public void doUpgrade(Storage storage) throws IOException {
@Override
public void doFinalize() throws IOException {
+ if (isEnableJnMaintenance()) {
+ throw new IOException("doFinalize() does not support enabling jn
maintenance mode");
Review Comment:
same here. and for others functions.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]