errose28 commented on code in PR #3741:
URL: https://github.com/apache/ozone/pull/3741#discussion_r1014416675


##########
hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/keyvalue/KeyValueHandler.java:
##########
@@ -416,6 +418,8 @@ ContainerCommandResponseProto handleDeleteContainer(
       deleteInternal(kvContainer, forceDelete);
     } catch (StorageContainerException ex) {
       return ContainerUtils.logAndReturnError(LOG, ex, request);
+    } catch (IOException ex) {

Review Comment:
   I don't think we need the new exception type. If the move operation fails, 
the container remains on disk, but SCM won't get an exception back saying it 
was not deleted. This means the operation will never be retried until the DN 
sends its next full container report. #3788 will add container and volume 
scans on errors like these to determine whether they need to be marked 
unhealthy, but for now, letting SCM know the delete failed is consistent with 
the current behavior.
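
   For example, instead of swallowing the `IOException`, the failure could be 
surfaced along these lines (a sketch only; using `IO_EXCEPTION` as the result 
code is my assumption, not necessarily what the PR should pick):

   ```java
   // Hedged sketch: report the failed delete back to SCM rather than
   // adding a new exception type. IO_EXCEPTION is an assumed result code.
   } catch (IOException ex) {
     return ContainerUtils.logAndReturnError(LOG,
         new StorageContainerException("Container delete failed: "
             + ex.getMessage(), ContainerProtos.Result.IO_EXCEPTION),
         request);
   }
   ```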



##########
hadoop-hdds/container-service/src/test/java/org/apache/hadoop/ozone/TestHddsDatanodeService.java:
##########
@@ -71,21 +124,76 @@ public void tearDown() {
     FileUtil.fullyDelete(testDir);
   }
 
-  @Test
-  public void testStartup() throws IOException {
-    service = HddsDatanodeService.createHddsDatanodeService(args);
+  @ParameterizedTest
+  @ValueSource(strings = {OzoneConsts.SCHEMA_V1,
+      OzoneConsts.SCHEMA_V2, OzoneConsts.SCHEMA_V3})

Review Comment:
   I think the temp dir cleanup testing should be moved into its own named 
test in this class, since the bulk of the code here is now about that rather 
than a general startup test.
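
   For example (a sketch only; the method name and body are suggestions, not 
the PR's API):

   ```java
   // Hypothetical dedicated test; testStartup would then stay a minimal
   // smoke test. Name and steps below are suggestions only.
   @ParameterizedTest
   @ValueSource(strings = {OzoneConsts.SCHEMA_V1,
       OzoneConsts.SCHEMA_V2, OzoneConsts.SCHEMA_V3})
   public void testDeleteServiceTmpDirCleanup(String schemaVersion)
       throws IOException {
     // Seed <HddsVolume>/tmp/container_delete_service with leftover files,
     // start the datanode service, then assert the directory comes up empty.
   }
   ```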



##########
hadoop-hdds/container-service/src/test/java/org/apache/hadoop/ozone/container/common/impl/TestContainerPersistence.java:
##########
@@ -286,19 +288,161 @@ public void testDeleteContainer() throws Exception {
     BlockData someKey1 = new BlockData(blockID1);
     someKey1.setChunks(new LinkedList<ContainerProtos.ChunkInfo>());
     blockManager.putBlock(container1, someKey1);
+  }
+
+  @Test
+  public void testDeleteNonEmptyContainer() throws Exception {
+    long testContainerID2 = getTestContainerID();
+    Container container2 = addContainer(containerSet, testContainerID2);
+    container2.close();
+
+    Assert.assertTrue(containerSet.getContainerMapCopy()
+        .containsKey(testContainerID2));
+
+    // With schema v3 there is no container-dedicated DB,
+    // so skip checking the behaviors related to it.
+    assumeFalse(schemaVersion.contains(OzoneConsts.SCHEMA_V3));
 
     // Deleting a non-empty container should fail.
     BlockID blockID2 = ContainerTestHelper.getTestBlockID(testContainerID2);
     BlockData someKey2 = new BlockData(blockID2);
     someKey2.setChunks(new LinkedList<ContainerProtos.ChunkInfo>());
     blockManager.putBlock(container2, someKey2);
 
+    // KeyValueHandler setup
+    String datanodeId = UUID.randomUUID().toString();
+
+    int[] interval = new int[1];
+    interval[0] = 2;
+    ContainerMetrics metrics = new ContainerMetrics(interval);
+
+    AtomicInteger icrReceived = new AtomicInteger(0);

Review Comment:
   Do we want to test the value of this somewhere?
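
   For example (the expected count of 1 is an assumption about this test's 
flow):

   ```java
   // Hedged sketch: after the delete request is handled, verify exactly one
   // incremental container report was sent. The expected value of 1 is an
   // assumption.
   Assert.assertEquals(1, icrReceived.get());
   ```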



##########
hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/common/states/endpoint/VersionEndpointTask.java:
##########
@@ -122,6 +124,19 @@ private void checkVolumeSet(MutableVolumeSet volumeSet,
         boolean result = StorageVolumeUtil.checkVolume(volume,
             scmId, clusterId, configuration, LOG,
             ozoneContainer.getDbVolumeSet());
+
+        // Clean <HddsVolume>/tmp/container_delete_service dir.
+        if (volume instanceof HddsVolume) {

Review Comment:
   We should not run this block if `result` from the line above comes back 
false. If it comes back true and we get an exception while cleaning tmp, we 
need to decide whether that should fail the volume. I would say not to fail 
the volume, since a clean tmp dir on startup is not required for 
functionality. If there are further issues while the DN is running, the 
volume scanner will handle them more thoroughly.
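
   A sketch of what I mean (assuming `cleanTmpDir` keeps its current 
signature):

   ```java
   // Hedged sketch: only clean tmp for volumes that passed checkVolume, and
   // treat a cleanup failure as non-fatal for the volume itself.
   if (result && volume instanceof HddsVolume) {
     try {
       KeyValueContainerUtil.ContainerDeleteDirectory
           .cleanTmpDir((HddsVolume) volume);
     } catch (IOException ex) {
       LOG.warn("Failed to clean the tmp delete directory on volume {}",
           volume.getStorageDir(), ex);
     }
   }
   ```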



##########
hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/keyvalue/helpers/KeyValueContainerUtil.java:
##########
@@ -395,4 +416,174 @@ public static Path getMetadataDirectory(
     return Paths.get(metadataPath);
 
   }
+
+  /**
+   * Utilities for the container_delete_service directory,
+   * located under <volume>/hdds/<cluster-id>/tmp/.
+   * Containers are moved here before deletion so that a failed
+   * delete does not leave artifacts behind at the container's
+   * default path on disk.
+   */
+  public static class ContainerDeleteDirectory {
+
+    /**
+     * Delete all files under
+     * <volume>/hdds/<cluster-id>/tmp/container_delete_service.
+     */
+    public static synchronized void cleanTmpDir(HddsVolume hddsVolume)
+        throws IOException {
+      if (hddsVolume.getStorageState() != StorageVolume.VolumeState.NORMAL) {
+        LOG.debug("Call to clean tmp dir container_delete_service directory "
+                + "for {} while VolumeState {}",
+            hddsVolume.getStorageDir(),
+            hddsVolume.getStorageState().toString());
+        return;
+      }
+
+      if (!hddsVolume.getClusterID().isEmpty()) {
+        // Initialize delete directory
+        hddsVolume.createDeleteServiceDir();
+      } else {
+        throw new IOException("Volume has no ClusterId");
+      }
+
+      ListIterator<File> leftoversListIt = getDeleteLeftovers(hddsVolume);
+
+      while (leftoversListIt.hasNext()) {
+        File file = leftoversListIt.next();
+
+        // If SchemaV3 is enabled and we have a RocksDB
+        if (VersionedDatanodeFeatures.isFinalized(
+            HDDSLayoutFeature.DATANODE_SCHEMA_V3)) {
+          // Get container file
+          File containerFile = getContainerFile(file);
+
+          // If file exists
+          if (containerFile != null) {
+            ContainerData containerData = ContainerDataYaml
+                .readContainerFile(containerFile);
+            KeyValueContainerData keyValueContainerData =
+                (KeyValueContainerData) containerData;
+
+            // Remove container from Rocks DB
+            String dbPath = hddsVolume.getDbParentDir().getAbsolutePath();
+            DatanodeStoreSchemaThreeImpl store =
+                new DatanodeStoreSchemaThreeImpl(hddsVolume.getConf(),
+                    dbPath, false);
+            store.removeKVContainerData(keyValueContainerData.getContainerID());

Review Comment:
   Looks like we can call `BlockUtils#removeContainerFromDB` here instead. We 
have a `KeyValueContainerData` and the two places calling this method have 
configuration objects they can pass down.
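
   Roughly (a sketch; assumes a configuration object `conf` is threaded down 
from the callers):

   ```java
   // Hedged sketch: reuse the existing helper instead of constructing a
   // DatanodeStoreSchemaThreeImpl here. 'conf' is an assumed parameter.
   BlockUtils.removeContainerFromDB(keyValueContainerData, conf);
   ```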



##########
hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/keyvalue/helpers/KeyValueContainerUtil.java:
##########
@@ -395,4 +416,174 @@ public static Path getMetadataDirectory(
     return Paths.get(metadataPath);
 
   }
+
+  /**
+   * Utilities for the container_delete_service directory,
+   * located under <volume>/hdds/<cluster-id>/tmp/.
+   * Containers are moved here before deletion so that a failed
+   * delete does not leave artifacts behind at the container's
+   * default path on disk.
+   */
+  public static class ContainerDeleteDirectory {
+
+    /**
+     * Delete all files under
+     * <volume>/hdds/<cluster-id>/tmp/container_delete_service.
+     */
+    public static synchronized void cleanTmpDir(HddsVolume hddsVolume)
+        throws IOException {
+      if (hddsVolume.getStorageState() != StorageVolume.VolumeState.NORMAL) {
+        LOG.debug("Call to clean tmp dir container_delete_service directory "
+                + "for {} while VolumeState {}",
+            hddsVolume.getStorageDir(),
+            hddsVolume.getStorageState().toString());
+        return;
+      }
+
+      if (!hddsVolume.getClusterID().isEmpty()) {
+        // Initialize delete directory
+        hddsVolume.createDeleteServiceDir();
+      } else {
+        throw new IOException("Volume has no ClusterId");
+      }
+
+      ListIterator<File> leftoversListIt = getDeleteLeftovers(hddsVolume);
+
+      while (leftoversListIt.hasNext()) {
+        File file = leftoversListIt.next();
+
+        // If SchemaV3 is enabled and we have a RocksDB
+        if (VersionedDatanodeFeatures.isFinalized(
+            HDDSLayoutFeature.DATANODE_SCHEMA_V3)) {
+          // Get container file
+          File containerFile = getContainerFile(file);
+
+          // If file exists
+          if (containerFile != null) {
+            ContainerData containerData = ContainerDataYaml
+                .readContainerFile(containerFile);
+            KeyValueContainerData keyValueContainerData =
+                (KeyValueContainerData) containerData;
+
+            // Remove container from Rocks DB
+            String dbPath = hddsVolume.getDbParentDir().getAbsolutePath();
+            DatanodeStoreSchemaThreeImpl store =
+                new DatanodeStoreSchemaThreeImpl(hddsVolume.getConf(),
+                    dbPath, false);
+            store.removeKVContainerData(keyValueContainerData.getContainerID());
+          }
+        }
+
+        try {
+          if (file.isDirectory()) {
+            FileUtils.deleteDirectory(file);
+          } else {
+            FileUtils.delete(file);
+          }
+        } catch (IOException ex) {
+          LOG.error("Failed to delete directory or file inside " +
+              "{}", hddsVolume.getDeleteServiceDirPath().toString(), ex);
+        }
+      }
+    }
+
+    /**
+     * Search recursively for the container file under a
+     * directory. Return null if the file is not found.
+     * @param file
+     * @return container file or null if it doesn't exist
+     * @throws IOException
+     */
+    public static File getContainerFile(File file) throws IOException {
+      try {
+        if (file.isDirectory()) {
+          for (File subFile : file.listFiles()) {
+            if (subFile.isDirectory()) {
+              File containerFile = getContainerFile(subFile);
+              if (containerFile != null) {
+                return containerFile;
+              }
+            } else {
+              if (FilenameUtils.getExtension(subFile.getName())
+                  .equals("container")) {
+                return subFile;
+              }
+            }
+          }
+        } else {
+          if (FilenameUtils.getExtension(file.getName())
+              .equals("container")) {
+            return file;
+          }
+        }
+      } catch (NullPointerException ex) {
+        LOG.error("File object is null.", ex);
+      }
+      return null;
+    }
+
+    /**
+     * Lists the files left under
+     * <volume>/hdds/<cluster-id>/tmp/container_delete_service.
+     * In the future this might also be used to gather metrics.
+     * @return ListIterator to all the files
+     */
+    public static ListIterator<File> getDeleteLeftovers(HddsVolume hddsVolume) {
+      List<File> leftovers = new ArrayList<>();
+
+      try {
+        File tmpDir = new File(hddsVolume.getDeleteServiceDirPath().toString());
+
+        if (tmpDir.exists()) {
+          for (File file : tmpDir.listFiles()) {
+            leftovers.add(file);
+          }
+        }
+      } catch (NullPointerException ex) {
+        LOG.error("tmp delete directory is null, path doesn't exist", ex);
+      }
+
+      ListIterator<File> leftoversListIt = leftovers.listIterator();
+      return leftoversListIt;

Review Comment:
   ```suggestion
          List<File> leftovers = new ArrayList<>();
         File tmpDir = hddsVolume.getDeleteServiceDirPath().toFile();
   
         if (tmpDir.exists() && tmpDir.isDirectory()) {
           File[] tmpFiles = tmpDir.listFiles();
           if (tmpFiles != null) {
             leftovers.addAll(Arrays.asList(tmpFiles));
           }
         }
   
         return leftovers.listIterator();
   ```
   
   Writing the method this way removes many IntelliJ warnings that are 
otherwise present, and does not require catching an unchecked exception, 
which is usually bad practice.



##########
hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/keyvalue/KeyValueHandler.java:
##########
@@ -1135,6 +1139,27 @@ private void deleteInternal(Container container, boolean force)
               DELETE_ON_NON_EMPTY_CONTAINER);
         }
       }
+      if (container.getContainerData() instanceof KeyValueContainerData) {
+        KeyValueContainerData keyValueContainerData =
+            (KeyValueContainerData) container.getContainerData();
+        HddsVolume hddsVolume = keyValueContainerData.getVolume();
+
+        // Rename container location
+        boolean success = KeyValueContainerUtil.ContainerDeleteDirectory
+            .moveToTmpDeleteDirectory(keyValueContainerData, hddsVolume);
+
+        if (success) {
+          String containerPath = keyValueContainerData
+              .getContainerPath().toString();
+          File containerDir = new File(containerPath);
+
+          LOG.info("Container {} has been successfuly moved under {}",

Review Comment:
   nit. This might be better as a debug log since it is just confirming that 
expected behavior happened.



##########
hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/keyvalue/helpers/KeyValueContainerUtil.java:
##########
@@ -395,4 +416,174 @@ public static Path getMetadataDirectory(
     return Paths.get(metadataPath);
 
   }
+
+  /**
+   * Utilities for the container_delete_service directory,
+   * located under <volume>/hdds/<cluster-id>/tmp/.
+   * Containers are moved here before deletion so that a failed
+   * delete does not leave artifacts behind at the container's
+   * default path on disk.
+   */
+  public static class ContainerDeleteDirectory {
+
+    /**
+     * Delete all files under
+     * <volume>/hdds/<cluster-id>/tmp/container_delete_service.
+     */
+    public static synchronized void cleanTmpDir(HddsVolume hddsVolume)
+        throws IOException {
+      if (hddsVolume.getStorageState() != StorageVolume.VolumeState.NORMAL) {
+        LOG.debug("Call to clean tmp dir container_delete_service directory "
+                + "for {} while VolumeState {}",
+            hddsVolume.getStorageDir(),
+            hddsVolume.getStorageState().toString());
+        return;
+      }
+
+      if (!hddsVolume.getClusterID().isEmpty()) {
+        // Initialize delete directory
+        hddsVolume.createDeleteServiceDir();
+      } else {
+        throw new IOException("Volume has no ClusterId");
+      }
+
+      ListIterator<File> leftoversListIt = getDeleteLeftovers(hddsVolume);
+
+      while (leftoversListIt.hasNext()) {
+        File file = leftoversListIt.next();
+
+        // If SchemaV3 is enabled and we have a RocksDB
+        if (VersionedDatanodeFeatures.isFinalized(

Review Comment:
   Might be good to put a comment block here succinctly summarizing 
[this comment](https://github.com/apache/ozone/pull/3741#issuecomment-1258673684). 
It's not very intuitive otherwise why we need to check and remove RocksDB 
entries here.
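
   Something along these lines (a hedged paraphrase of the linked 
discussion):

   ```java
   // Under schema v3 all containers on a volume share one RocksDB instead
   // of each container having its own DB. Moving a container directory to
   // tmp/container_delete_service does not remove its entries from that
   // shared DB, so if the datanode restarts before the delete completes,
   // those entries would be orphaned. That is why tmp dir cleanup also
   // removes the container's RocksDB data.
   ```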



##########
hadoop-hdds/container-service/src/test/java/org/apache/hadoop/ozone/container/keyvalue/TestTarContainerPacker.java:
##########
@@ -135,6 +136,11 @@ private KeyValueContainerData createContainer(Path dir) throws IOException {
     Files.createDirectories(dbDir);
     Files.createDirectories(dataDir);
 
+    if (schemaVersion.equals(SCHEMA_V3)) {
+      Path schemaV3DBDir = metaDir.resolve("db");
+      Files.createDirectories(schemaV3DBDir);
+    }
+

Review Comment:
   This test passed locally for me without these changes present. Are they 
needed?



##########
hadoop-hdds/server-scm/src/test/java/org/apache/hadoop/ozone/container/common/TestEndPoint.java:
##########
@@ -137,16 +142,47 @@ public void testGetVersion() throws Exception {
   /**
    * We make getVersion RPC call, but via the VersionEndpointTask which is
    * how the state machine would make the call.
+   * Writing data to tmp dir and checking if it has been cleaned.

Review Comment:
   This should probably be moved to its own test method for clarity.
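
   For instance (method name and steps are suggestions only):

   ```java
   // Hypothetical separate test: seed the volume's
   // tmp/container_delete_service dir with a file, run the
   // VersionEndpointTask, then assert the directory is empty afterwards.
   @Test
   public void testVersionTaskCleansDeleteServiceDir() throws Exception {
     // set up the endpoint and volume, write the file, call the task,
     // then assert the tmp dir was cleaned
   }
   ```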



##########
hadoop-hdds/container-service/src/test/java/org/apache/hadoop/ozone/TestHddsDatanodeService.java:
##########
@@ -33,24 +51,59 @@
 import static org.apache.hadoop.hdds.HddsConfigKeys.HDDS_CONTAINER_TOKEN_ENABLED;
 import static org.apache.hadoop.ozone.OzoneConfigKeys.OZONE_SECURITY_ENABLED_KEY;
 import static org.junit.jupiter.api.Assertions.assertFalse;
+import static org.junit.jupiter.api.Assertions.assertTrue;
 import static org.junit.jupiter.api.Assertions.assertNotNull;
 import static org.junit.jupiter.api.Assertions.assertNull;
 import org.junit.jupiter.api.BeforeEach;
-import org.junit.jupiter.api.Test;
+import org.junit.jupiter.api.io.TempDir;
+import org.junit.jupiter.params.ParameterizedTest;
+import org.junit.jupiter.params.provider.ValueSource;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
 
 /**
  * Test class for {@link HddsDatanodeService}.
  */
+
 public class TestHddsDatanodeService {
+
+  @TempDir
+  private File tempDir;
+
+  private static final Logger LOG =
+      LoggerFactory.getLogger(TestHddsDatanodeService.class);
+
+  private final String clusterId = UUID.randomUUID().toString();
+  private final OzoneConfiguration conf = new OzoneConfiguration();
+  private static final int SCM_SERVER_COUNT = 1;
+  private static final String FILE_SEPARATOR = File.separator;
+
   private File testDir;
-  private OzoneConfiguration conf;
-  private HddsDatanodeService service;
-  private String[] args = new String[] {};
+  private List<String> serverAddresses;
+  private List<RPC.Server> scmServers;
+  private List<ScmTestMock> mockServers;

Review Comment:
   Check some of the warnings that show up after these changes and see if 
the suggestions can be incorporated. For example, I'm seeing "contents 
updated but never queried" on these last two fields, and similarly on 
`hddsDirs` below. Basically, the variables look unused.



##########
hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/keyvalue/helpers/KeyValueContainerUtil.java:
##########
@@ -395,4 +416,174 @@ public static Path getMetadataDirectory(
     return Paths.get(metadataPath);
 
   }
+
+  /**
+   * Utilities for the container_delete_service directory,
+   * located under <volume>/hdds/<cluster-id>/tmp/.
+   * Containers are moved here before deletion so that a failed
+   * delete does not leave artifacts behind at the container's
+   * default path on disk.
+   */
+  public static class ContainerDeleteDirectory {
+
+    /**
+     * Delete all files under
+     * <volume>/hdds/<cluster-id>/tmp/container_delete_service.
+     */
+    public static synchronized void cleanTmpDir(HddsVolume hddsVolume)
+        throws IOException {
+      if (hddsVolume.getStorageState() != StorageVolume.VolumeState.NORMAL) {
+        LOG.debug("Call to clean tmp dir container_delete_service directory "
+                + "for {} while VolumeState {}",
+            hddsVolume.getStorageDir(),
+            hddsVolume.getStorageState().toString());
+        return;
+      }
+
+      if (!hddsVolume.getClusterID().isEmpty()) {
+        // Initialize delete directory
+        hddsVolume.createDeleteServiceDir();
+      } else {
+        throw new IOException("Volume has no ClusterId");
+      }
+
+      ListIterator<File> leftoversListIt = getDeleteLeftovers(hddsVolume);
+
+      while (leftoversListIt.hasNext()) {
+        File file = leftoversListIt.next();
+
+        // If SchemaV3 is enabled and we have a RocksDB
+        if (VersionedDatanodeFeatures.isFinalized(
+            HDDSLayoutFeature.DATANODE_SCHEMA_V3)) {
+          // Get container file
+          File containerFile = getContainerFile(file);
+
+          // If file exists
+          if (containerFile != null) {
+            ContainerData containerData = ContainerDataYaml
+                .readContainerFile(containerFile);
+            KeyValueContainerData keyValueContainerData =
+                (KeyValueContainerData) containerData;
+
+            // Remove container from Rocks DB
+            String dbPath = hddsVolume.getDbParentDir().getAbsolutePath();
+            DatanodeStoreSchemaThreeImpl store =
+                new DatanodeStoreSchemaThreeImpl(hddsVolume.getConf(),
+                    dbPath, false);
+            store.removeKVContainerData(keyValueContainerData.getContainerID());
+          }
+        }
+
+        try {
+          if (file.isDirectory()) {
+            FileUtils.deleteDirectory(file);
+          } else {
+            FileUtils.delete(file);
+          }
+        } catch (IOException ex) {
+          LOG.error("Failed to delete directory or file inside " +
+              "{}", hddsVolume.getDeleteServiceDirPath().toString(), ex);
+        }
+      }
+    }
+
+    /**
+     * Search recursively for the container file under a
+     * directory. Return null if the file is not found.
+     * @param file
+     * @return container file or null if it doesn't exist
+     * @throws IOException
+     */
+    public static File getContainerFile(File file) throws IOException {

Review Comment:
   Can we use `ContainerUtils#getContainerFile` instead of this new method?
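
   A sketch, assuming the moved directory keeps the standard container 
layout that `ContainerUtils#getContainerFile` expects:

   ```java
   // Hedged sketch: derive the .container file directly from the leftover
   // container directory ('file' in the loop above) instead of searching
   // the tree recursively.
   File containerFile = ContainerUtils.getContainerFile(file);
   ```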



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@ozone.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

