[ 
https://issues.apache.org/jira/browse/HDFS-17396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17820250#comment-17820250
 ] 

ASF GitHub Bot commented on HDFS-17396:
---------------------------------------

goiri commented on code in PR #6583:
URL: https://github.com/apache/hadoop/pull/6583#discussion_r1501301861


##########
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/FSImageTestUtil.java:
##########
@@ -519,6 +519,23 @@ public static void assertNNHasCheckpoints(MiniDFSCluster 
cluster,
     }
   }
 
+  public static void assertNNHasRollbackCheckpoints(MiniDFSCluster cluster,
+      int nnIdx, List<Integer> txids) {
+
+    for (File nameDir : getNameNodeCurrentDirs(cluster, nnIdx)) {
+      LOG.info("examining name dir with files: " +
+                   Joiner.on(",").join(nameDir.listFiles()));
+      // Should have fsimage_N for the three checkpoints
+      LOG.info("Examining storage dir " + nameDir + " with contents: "
+                   + StringUtils.join(nameDir.listFiles(), ", "));
+      for (long checkpointTxId : txids) {
+        File image = new File(nameDir,
+            NNStorage.getRollbackImageFileName(checkpointTxId));
+        assertTrue("Expected non-empty " + image, image.length() > 0);

Review Comment:
   OK, I see this pattern already exists.



##########
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/ha/TestBootstrapStandby.java:
##########
@@ -208,12 +210,21 @@ public void testRollingUpgradeBootstrapStandby() throws 
Exception {
 
     // BootstrapStandby should fail if the node has a future version
     // and the cluster isn't in rolling upgrade
-    bs.setConf(cluster.getConfiguration(1));
+    bs.setConf(cluster.getConfiguration(2));
     assertEquals("BootstrapStandby should return ERR_CODE_INVALID_VERSION",
         ERR_CODE_INVALID_VERSION, bs.run(new String[]{"-force"}));
 
     // Start rolling upgrade
     fs.rollingUpgrade(RollingUpgradeAction.PREPARE);
+    RollingUpgradeInfo info = fs.rollingUpgrade(RollingUpgradeAction.QUERY);
+    while (!info.createdRollbackImages()) {

Review Comment:
   This is unbounded. Can you use LambdaTestUtils#wait or similar?



##########
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/ha/TestBootstrapStandby.java:
##########
@@ -253,6 +269,13 @@ public void testRollingUpgradeBootstrapStandby() throws 
Exception {
     // We should now be able to start the standby successfully
     restartNameNodesFromIndex(1, "-rollingUpgrade", "started");
 
+    for (int i = 1; i < maxNNCount; i++) {
+      assertTrue("NameNodes should all have the rollback FSImage",
+          cluster.getNameNode(i).getFSImage().hasRollbackFSImage());

Review Comment:
   Easier to read if you extract the `nn = cluster.getNameNode(i);`



##########
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/FSImageTestUtil.java:
##########
@@ -519,6 +519,23 @@ public static void assertNNHasCheckpoints(MiniDFSCluster 
cluster,
     }
   }
 
+  public static void assertNNHasRollbackCheckpoints(MiniDFSCluster cluster,
+      int nnIdx, List<Integer> txids) {
+
+    for (File nameDir : getNameNodeCurrentDirs(cluster, nnIdx)) {
+      LOG.info("examining name dir with files: " +

Review Comment:
   Can you use the logger format?



##########
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/FSImageTestUtil.java:
##########
@@ -519,6 +519,23 @@ public static void assertNNHasCheckpoints(MiniDFSCluster 
cluster,
     }
   }
 
+  public static void assertNNHasRollbackCheckpoints(MiniDFSCluster cluster,
+      int nnIdx, List<Integer> txids) {
+
+    for (File nameDir : getNameNodeCurrentDirs(cluster, nnIdx)) {
+      LOG.info("examining name dir with files: " +
+                   Joiner.on(",").join(nameDir.listFiles()));
+      // Should have fsimage_N for the three checkpoints
+      LOG.info("Examining storage dir " + nameDir + " with contents: "
+                   + StringUtils.join(nameDir.listFiles(), ", "));
+      for (long checkpointTxId : txids) {
+        File image = new File(nameDir,
+            NNStorage.getRollbackImageFileName(checkpointTxId));
+        assertTrue("Expected non-empty " + image, image.length() > 0);

Review Comment:
   An assert in the regular code? Should this be an exception?





> BootstrapStandby should download rollback image during RollingUpgrade 
> ----------------------------------------------------------------------
>
>                 Key: HDFS-17396
>                 URL: https://issues.apache.org/jira/browse/HDFS-17396
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: namenode
>    Affects Versions: 3.4.0
>            Reporter: Danny Becker
>            Assignee: Danny Becker
>            Priority: Minor
>              Labels: pull-requests-available
>             Fix For: 3.4.0
>
>
> Due to [HDFS-17178|https://issues.apache.org/jira/browse/HDFS-17178], 
> BootstrapStandby can now succeed during a RollingUpgrade. This can lead to an 
> edge case where the bootstrapped NameNode does not have the 
> fsimage_rollback_N file locally so it does not recognize that it is in a 
> RollingUpgrade. This means the bootstrapped NameNode will overwrite the 
> VERSION file during a checkpoint because it does not know that the 
> NameService is in a RollingUpgrade.
>  
> BootstrapStandby needs to download the fsimage_rollback_N file in addition to 
> the normal fsimage_N file so that the bootstrapped node correctly recognizes 
> the RollingUpgrade state of the NameService.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to