Copilot commented on code in PR #9132:
URL: https://github.com/apache/ozone/pull/9132#discussion_r2566417622


##########
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/snapshot/OmSnapshotUtils.java:
##########
@@ -122,7 +124,17 @@ public static Path createHardLinkList(int truncateLength,
   public static void createHardLinks(Path dbPath, boolean deleteSourceFiles) 
throws IOException {
     File hardLinkFile =
         new File(dbPath.toString(), OmSnapshotManager.OM_HARDLINK_FILE);
-    List<Path> filesToDelete = new ArrayList<>();
+    Path dbPathParent = dbPath.getParent();
+    if (dbPathParent == null) {
+      throw new IOException("Invalid dbPath: parent is null: " + dbPath);
+    }
+    Path checkpointDataDirPath = Paths.get(dbPathParent.toString(), 
OM_CHECKPOINT_DATA_DIR);
+    if (!checkpointDataDirPath.toFile().exists()) {
+      boolean dirCreated = checkpointDataDirPath.toFile().mkdirs();
+      if (!dirCreated) {
+        LOG.error("Failed to create directory: {}", checkpointDataDirPath);

Review Comment:
   When `mkdirs()` returns `false`, the code logs an error but continues 
execution. This could lead to subsequent failures when trying to create hard 
links in a non-existent directory. Consider throwing an `IOException` if 
directory creation fails, similar to the pattern used at lines 157-160.
   ```suggestion
           throw new IOException("Failed to create directory: " + 
checkpointDataDirPath);
   ```



##########
hadoop-ozone/recon/src/main/java/org/apache/hadoop/ozone/recon/spi/impl/OzoneManagerServiceProviderImpl.java:
##########
@@ -471,15 +470,13 @@ public DBCheckpoint getOzoneManagerDBSnapshot() {
         try (InputStream inputStream = reconUtils.makeHttpCall(
             connectionFactory, getOzoneManagerSnapshotUrl(), 
isOmSpnegoEnabled()).getInputStream()) {
           tarExtractor.extractTar(inputStream, untarredDbDir);

Review Comment:
   The removal of `OmSnapshotUtils.createHardLinks()` call in Recon means hard 
links are no longer being created client-side after extracting the tarball. 
However, the PR changes the hard link file format to have all paths relative to 
the metadata directory. If the OM is sending a tarball with a hardlink file 
(which the V1 endpoint `/dbCheckpoint` does via `OMDBCheckpointServlet`), but 
Recon is not processing it, the extracted checkpoint may be incomplete or 
missing deduplicated files. Either:
   1. Recon should still call `createHardLinks()` to process the hardlink file, 
OR
   2. The endpoint being used should include all files directly without 
requiring client-side hard link creation (which would be a significant change 
in behavior not clearly documented in this PR)
   
   Please verify this is the intended behavior and that the checkpoint 
extracted by Recon will be complete.



##########
hadoop-ozone/ozone-manager/src/test/java/org/apache/hadoop/ozone/om/snapshot/TestOmSnapshotUtils.java:
##########
@@ -75,4 +76,44 @@ public void testLinkFiles(@TempDir File tempDir) throws 
Exception {
 
     assertEquals(tree1Files, tree2Files);
   }
+
+  /**
+   * Test createHardLinks().
+   */
+  @Test
+  public void testCreateHardLinksWithOmDbPrefix(@TempDir File tempDir) throws 
Exception {
+    // Create a test dir inside temp dir
+    File testDir  = new File(tempDir, "testDir");
+    assertTrue(testDir.mkdir(), "Failed to create test dir");
+    // Create source file
+    File sourceFile = new File(testDir, "source.sst");
+    File checkpointDataDir = new File(testDir.getParent(), 
OzoneConsts.OM_CHECKPOINT_DATA_DIR);
+    Files.write(sourceFile.toPath(), "test content".getBytes(UTF_8));
+
+    // Create hardlink file with "om.db/" prefixed paths
+    File hardlinkFile = new File(testDir, "hardLinkFile"); // 
OmSnapshotManager.OM_HARDLINK_FILE
+    String hardlinkContent =
+        "om.db/target1.sst\tsource.sst\n" +  // Should strip om.db/ prefix

Review Comment:
   The comment "Should strip om.db/ prefix" is misleading. According to the PR 
description, the change is to maintain uniformity by keeping all paths relative 
to the metadata directory, not stripping the prefix. The test expects the file 
to be created at `om.db/target1.sst` within the checkpoint_data directory (line 
105), which means the prefix is NOT stripped. The comment should say something 
like "Should preserve om.db/ prefix as relative path" to accurately reflect the 
behavior.
   ```suggestion
           "om.db/target1.sst\tsource.sst\n" +  // Should preserve om.db/ 
prefix as relative path
   ```



##########
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/OzoneManager.java:
##########
@@ -4072,10 +4085,27 @@ TermIndex installCheckpoint(String leaderId, 
DBCheckpoint omDBCheckpoint)
     return installCheckpoint(leaderId, checkpointLocation, checkpointTrxnInfo);

Review Comment:
   The method `installCheckpoint(String leaderId, DBCheckpoint omDBCheckpoint)` 
at line 4075 is now broken. It passes `checkpointLocation` (line 4085) to the 
3-argument `installCheckpoint` method, but that method now expects a 
`checkpointDataDir` as the second parameter (see line 4104 and its 
documentation at lines 4088-4092). The 3-argument method will try to construct 
a path `Paths.get(checkpointDataDir.toString(), OM_DB_NAME)` at line 4108, 
which would incorrectly look for `om.db` inside the checkpoint location. This 
method either needs to be updated to create a proper `checkpointDataDir` 
structure before calling the 3-argument version, or it should be 
deprecated/removed if it's no longer used.
   ```suggestion
       Path parent = checkpointLocation.getParent();
       if (parent == null) {
         throw new IOException("Cannot install checkpoint: checkpointLocation 
has no parent: " + checkpointLocation);
       }
       Path checkpointDataDir = Paths.get(parent.toString(), 
OM_CHECKPOINT_DATA_DIR);
       return installCheckpoint(leaderId, checkpointDataDir, 
checkpointTrxnInfo);
   ```



##########
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/OzoneManager.java:
##########
@@ -4313,13 +4341,37 @@ private void moveCheckpointFiles(File oldDB, Path 
checkpointPath, File dbDir,
     }
   }
 
-  // Move the new snapshot directory into place.
-  private void moveOmSnapshotData(Path dbPath, Path dbSnapshotsDir)
-      throws IOException {
-    Path incomingSnapshotsDir = Paths.get(dbPath.toString(),
-        OM_SNAPSHOT_DIR);
-    if (incomingSnapshotsDir.toFile().exists()) {
-      Files.move(incomingSnapshotsDir, dbSnapshotsDir);
+  /**
+   * Replaces the contents of the target directory with contents from the 
source directory.
+   * Deletes all existing files in dbDir and moves all files from newDbDir 
into dbDir.
+   *
+   * @param dbDir target directory whose contents will be replaced
+   * @param newDbDir source directory containing new contents to move
+   * @throws IOException if file operations fail
+   */
+  private void replaceDir(File dbDir, Path newDbDir) throws IOException {
+    // First, delete all existing content in the target directory
+    if (dbDir.exists()) {
+      // Delete all files and subdirectories in dbDir, but keep the directory 
itself
+      File[] files = dbDir.listFiles();
+      if (files != null) {
+        for (File file : files) {
+          FileUtil.fullyDelete(file);
+        }
+      }
+    } else {
+      // Create the directory if it doesn't exist
+      Files.createDirectories(dbDir.toPath());
+    }
+
+    // Move all content from newDbDir to dbDir
+    if (Files.exists(newDbDir) && Files.isDirectory(newDbDir)) {
+      try (Stream<Path> files = Files.list(newDbDir)) {
+        for (Path sourceFile : files.collect(Collectors.toList())) {
+          Path targetFile = dbDir.toPath().resolve(sourceFile.getFileName());
+          Files.move(sourceFile, targetFile, 
StandardCopyOption.REPLACE_EXISTING);
+        }
+      }
     }

Review Comment:
   The `replaceDir` method doesn't handle the case where `newDbDir` doesn't 
exist or is not a directory. If the source directory is missing, the method 
will silently do nothing after clearing the target directory, potentially 
leaving the system in an inconsistent state. Consider adding validation to 
ensure `newDbDir` exists and is a directory before clearing `dbDir`, or throw 
an informative exception if the source is invalid.



##########
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/OzoneManager.java:
##########
@@ -4269,34 +4295,36 @@ File replaceOMDBWithCheckpoint(long lastAppliedIndex, 
File oldDB,
           dbSnapshotsBackup.toPath());
     }
 
-    moveCheckpointFiles(oldDB, checkpointPath, dbDir, dbBackup, dbSnapshotsDir,
+    moveCheckpointFiles(oldDB, checkpointDataDir, dbDir, dbBackup, 
dbSnapshotsDir,
         dbSnapshotsBackup);
     return dbBackupDir;
   }
 
-  private void moveCheckpointFiles(File oldDB, Path checkpointPath, File dbDir,
+  private void moveCheckpointFiles(File oldDB, Path checkpointDataDir, File 
dbDir,
                                    File dbBackup, File dbSnapshotsDir,
                                    File dbSnapshotsBackup) throws IOException {
-    // Move the new DB checkpoint into the om metadata dir
     Path markerFile = new File(dbDir, DB_TRANSIENT_MARKER).toPath();
+    Path newDb = Paths.get(checkpointDataDir.toString(), OM_DB_NAME);
+    Path newDbSnapshotsDir = Paths.get(checkpointDataDir.toString(), 
OM_SNAPSHOT_DIR);
+    // Now that we have backed up the old DB data, we just need to move the
+    // directories in checkpoint_data to the main path
+    // 1. replace oldDb with newDb
+    // 2. replace dbSnapshotsDir with newDbSnapshotsDir
     try {
       // Create a Transient Marker file. This file will be deleted if the
       // checkpoint DB is successfully moved to the old DB location or if the
       // old DB backup is reset to its location. If not, then the OM DB is in
       // an inconsistent state and this marker file will fail OM from
       // starting up.
       Files.createFile(markerFile);
-      // Link each of the candidate DB files to real DB directory.  This
-      // preserves the links that already exist between files in the
-      // candidate db.
-      OmSnapshotUtils.linkFiles(checkpointPath.toFile(),
-          oldDB);
-      moveOmSnapshotData(oldDB.toPath(), dbSnapshotsDir.toPath());
+      // replace the dbDir and dbSnapshotDir
+      replaceDir(oldDB, newDb);
+      replaceDir(dbSnapshotsDir, newDbSnapshotsDir);
       Files.deleteIfExists(markerFile);
     } catch (IOException e) {
       LOG.error("Failed to move downloaded DB checkpoint {} to metadata " +
-              "directory {}. Exception: {}. Resetting to original DB.",
-          checkpointPath, oldDB.toPath(), e);
+              "directory . Exception: {}. Resetting to original DB.",

Review Comment:
   The error message has an extra space before the period: "directory . 
Exception:". Should be "directory. Exception:" for consistency with standard 
formatting.
   ```suggestion
                 "directory. Exception: {}. Resetting to original DB.",
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to