[
https://issues.apache.org/jira/browse/HDFS-17396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17820250#comment-17820250
]
ASF GitHub Bot commented on HDFS-17396:
---------------------------------------
goiri commented on code in PR #6583:
URL: https://github.com/apache/hadoop/pull/6583#discussion_r1501301861
##########
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/FSImageTestUtil.java:
##########
@@ -519,6 +519,23 @@ public static void assertNNHasCheckpoints(MiniDFSCluster
cluster,
}
}
+ public static void assertNNHasRollbackCheckpoints(MiniDFSCluster cluster,
+ int nnIdx, List<Integer> txids) {
+
+ for (File nameDir : getNameNodeCurrentDirs(cluster, nnIdx)) {
+ LOG.info("examining name dir with files: " +
+ Joiner.on(",").join(nameDir.listFiles()));
+ // Should have fsimage_N for the three checkpoints
+ LOG.info("Examining storage dir " + nameDir + " with contents: "
+ + StringUtils.join(nameDir.listFiles(), ", "));
+ for (long checkpointTxId : txids) {
+ File image = new File(nameDir,
+ NNStorage.getRollbackImageFileName(checkpointTxId));
+ assertTrue("Expected non-empty " + image, image.length() > 0);
Review Comment:
OK, I see this pattern already exists.
##########
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/ha/TestBootstrapStandby.java:
##########
@@ -208,12 +210,21 @@ public void testRollingUpgradeBootstrapStandby() throws
Exception {
// BootstrapStandby should fail if the node has a future version
// and the cluster isn't in rolling upgrade
- bs.setConf(cluster.getConfiguration(1));
+ bs.setConf(cluster.getConfiguration(2));
assertEquals("BootstrapStandby should return ERR_CODE_INVALID_VERSION",
ERR_CODE_INVALID_VERSION, bs.run(new String[]{"-force"}));
// Start rolling upgrade
fs.rollingUpgrade(RollingUpgradeAction.PREPARE);
+ RollingUpgradeInfo info = fs.rollingUpgrade(RollingUpgradeAction.QUERY);
+ while (!info.createdRollbackImages()) {
Review Comment:
This is unbounded. Can you use LambdaTestUtils#wait or similar?
##########
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/ha/TestBootstrapStandby.java:
##########
@@ -253,6 +269,13 @@ public void testRollingUpgradeBootstrapStandby() throws
Exception {
// We should now be able to start the standby successfully
restartNameNodesFromIndex(1, "-rollingUpgrade", "started");
+ for (int i = 1; i < maxNNCount; i++) {
+ assertTrue("NameNodes should all have the rollback FSImage",
+ cluster.getNameNode(i).getFSImage().hasRollbackFSImage());
Review Comment:
Easier to read if you extract the `nn = cluster.getNameNode(i);`
##########
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/FSImageTestUtil.java:
##########
@@ -519,6 +519,23 @@ public static void assertNNHasCheckpoints(MiniDFSCluster
cluster,
}
}
+ public static void assertNNHasRollbackCheckpoints(MiniDFSCluster cluster,
+ int nnIdx, List<Integer> txids) {
+
+ for (File nameDir : getNameNodeCurrentDirs(cluster, nnIdx)) {
+ LOG.info("examining name dir with files: " +
Review Comment:
Can you use the logger format?
##########
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/FSImageTestUtil.java:
##########
@@ -519,6 +519,23 @@ public static void assertNNHasCheckpoints(MiniDFSCluster
cluster,
}
}
+ public static void assertNNHasRollbackCheckpoints(MiniDFSCluster cluster,
+ int nnIdx, List<Integer> txids) {
+
+ for (File nameDir : getNameNodeCurrentDirs(cluster, nnIdx)) {
+ LOG.info("examining name dir with files: " +
+ Joiner.on(",").join(nameDir.listFiles()));
+ // Should have fsimage_N for the three checkpoints
+ LOG.info("Examining storage dir " + nameDir + " with contents: "
+ + StringUtils.join(nameDir.listFiles(), ", "));
+ for (long checkpointTxId : txids) {
+ File image = new File(nameDir,
+ NNStorage.getRollbackImageFileName(checkpointTxId));
+ assertTrue("Expected non-empty " + image, image.length() > 0);
Review Comment:
An assert in the regular code? Should this be an exception?
> BootstrapStandby should download rollback image during RollingUpgrade
> ----------------------------------------------------------------------
>
> Key: HDFS-17396
> URL: https://issues.apache.org/jira/browse/HDFS-17396
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: namenode
> Affects Versions: 3.4.0
> Reporter: Danny Becker
> Assignee: Danny Becker
> Priority: Minor
> Labels: pull-requests-available
> Fix For: 3.4.0
>
>
> Due to [HDFS-17178|https://issues.apache.org/jira/browse/HDFS-17178],
> BootstrapStandby can now succeed during a RollingUpgrade. This can lead to an
> edge case where the bootstrapped NameNode does not have the
> fsimage_rollback_N file locally so it does not recognize that it is in a
> RollingUpgrade. This means the bootstrapped NameNode will overwrite the
> VERSION file during a checkpoint because it does not know that the
> NameService is in a RollingUpgrade.
>
> BootstrapStandby needs to download the fsimage_rollback_N file in addition to
> the normal fsimage_N file so that the bootstrapped node correctly recognizes
> the RollingUpgrade state of the NameService.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]