errose28 commented on code in PR #7065:
URL: https://github.com/apache/ozone/pull/7065#discussion_r1717271381


##########
hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/keyvalue/KeyValueHandler.java:
##########
@@ -516,6 +517,7 @@ ContainerCommandResponseProto handleCloseContainer(
     try {
       markContainerForClose(kvContainer);
       closeContainer(kvContainer);
+      createContainerMerkleTree(kvContainer);

Review Comment:
   We want to generate the tree after the state is moved to CLOSED, but before 
the ICR for the CLOSED container is sent to SCM. This way tree generation will 
be retried and every CLOSED container will have a tree. For unhealthy or 
quasi-closed containers, we should not generate the tree automatically from 
RocksDB and instead let the scanner determine the container's current state.



##########
hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/hdds/scm/TestCloseContainer.java:
##########
@@ -159,11 +183,72 @@ public void testCloseClosedContainer()
     StorageContainerManager scm = cluster.getStorageContainerManager();
     // Pick any container on the cluster and close it via client
     ContainerInfo container = scm.getContainerManager().getContainers().get(0);
+    // Checksum file doesn't exist before container close
+    List<HddsDatanodeService> hddsDatanodes = cluster.getHddsDatanodes();
+    for (HddsDatanodeService hddsDatanode: hddsDatanodes) {
+      assertFalse(containerChecksumFileExists(hddsDatanode, container));
+    }
+    // Close container
     OzoneTestUtils.closeContainer(scm, container);
+
+    // Checksum file exists after container close
+    for (HddsDatanodeService hddsDatanode: hddsDatanodes) {
+      GenericTestUtils.waitFor(() -> 
waitForContainerChecksumToExist(hddsDatanode, container), 100, 5000);
+      assertTrue(containerChecksumFileExists(hddsDatanode, container));
+    }
+
     assertThrows(IOException.class,
         () -> cluster.getStorageContainerLocationClient()
             .closeContainer(container.getContainerID()),
         "Container " + container.getContainerID() + " already closed");
   }
 
+  @Test
+  public void testContainerChecksumForClosedContainer() throws Exception {
+    // Create some keys to write data into the open containers
+    for (int i = 0; i < 10; i++) {
+      TestDataUtil.createKey(bucket, "key" + i, ReplicationFactor.THREE,
+          ReplicationType.RATIS, "this is the content");
+    }
+    StorageContainerManager scm = cluster.getStorageContainerManager();
+    ContainerInfo containerInfo = 
scm.getContainerManager().getContainers().get(0);
+    // Checksum file doesn't exist before container close
+    List<HddsDatanodeService> hddsDatanodes = cluster.getHddsDatanodes();
+    for (HddsDatanodeService hddsDatanode : hddsDatanodes) {
+      assertFalse(containerChecksumFileExists(hddsDatanode, containerInfo));
+    }
+
+    // Build expected merkle tree
+    HddsDatanodeService hddsDatanodeService = 
cluster.getHddsDatanodes().get(0);
+    OzoneContainer ozoneContainer = 
hddsDatanodeService.getDatanodeStateMachine().getContainer();
+    KeyValueContainer container = (KeyValueContainer) 
ozoneContainer.getController()
+        .getContainer(containerInfo.getContainerID());
+    ContainerProtos.ContainerMerkleTree expectedMerkleTree = 
buildContainerMerkleTree(
+        container.getContainerData(), cluster.getConf());
+    OzoneTestUtils.closeContainer(scm, containerInfo);
+
+    // Checksum file exists after container close and matches the expected 
container
+    // merkle tree
+    for (HddsDatanodeService hddsDatanode : hddsDatanodes) {
+      GenericTestUtils.waitFor(() -> 
waitForContainerChecksumToExist(hddsDatanode, containerInfo), 100, 5000);
+      assertTrue(containerChecksumFileExists(hddsDatanode, containerInfo));
+      ozoneContainer = hddsDatanode.getDatanodeStateMachine().getContainer();
+      container = (KeyValueContainer) ozoneContainer.getController()
+              .getContainer(containerInfo.getContainerID());
+      ContainerProtos.ContainerChecksumInfo containerChecksumInfo = 
ContainerMerkleTreeTestUtils.readChecksumFile(
+              container.getContainerData());
+      
ContainerMerkleTreeTestUtils.assertTreesSortedAndMatch(expectedMerkleTree,
+          containerChecksumInfo.getContainerMerkleTree());
+    }
+  }
+
+  private boolean waitForContainerChecksumToExist(HddsDatanodeService 
hddsDatanode,
+                                                          ContainerInfo 
containerInfo) {
+    try {
+      return 
ContainerMerkleTreeTestUtils.containerChecksumFileExists(hddsDatanode, 
containerInfo);
+    } catch (IOException e) {
+      LOG.error("Checksum file doesn't exist.");

Review Comment:
   This would happen if there is an error when determining if the file exists 
or not. The message should probably be updated to indicate the file's existence 
could not be determined and include the exception.



##########
hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/ozoneimpl/OzoneContainer.java:
##########
@@ -598,4 +598,7 @@ public ReplicationServer getReplicationServer() {
     return replicationServer;
   }
 
+  public ContainerChecksumTreeManager getChecksumManager() {
+    return this.checksumTreeManager;
+  }

Review Comment:
   I think this may have slipped in during our merges back and forth. It's no 
longer used.



##########
hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/hdds/scm/TestCloseContainer.java:
##########
@@ -159,11 +183,72 @@ public void testCloseClosedContainer()
     StorageContainerManager scm = cluster.getStorageContainerManager();
     // Pick any container on the cluster and close it via client
     ContainerInfo container = scm.getContainerManager().getContainers().get(0);
+    // Checksum file doesn't exist before container close
+    List<HddsDatanodeService> hddsDatanodes = cluster.getHddsDatanodes();
+    for (HddsDatanodeService hddsDatanode: hddsDatanodes) {
+      assertFalse(containerChecksumFileExists(hddsDatanode, container));
+    }
+    // Close container
     OzoneTestUtils.closeContainer(scm, container);
+
+    // Checksum file exists after container close
+    for (HddsDatanodeService hddsDatanode: hddsDatanodes) {
+      GenericTestUtils.waitFor(() -> 
waitForContainerChecksumToExist(hddsDatanode, container), 100, 5000);
+      assertTrue(containerChecksumFileExists(hddsDatanode, container));
+    }
+
     assertThrows(IOException.class,
         () -> cluster.getStorageContainerLocationClient()
             .closeContainer(container.getContainerID()),
         "Container " + container.getContainerID() + " already closed");
   }
 
+  @Test
+  public void testContainerChecksumForClosedContainer() throws Exception {
+    // Create some keys to write data into the open containers
+    for (int i = 0; i < 10; i++) {
+      TestDataUtil.createKey(bucket, "key" + i, ReplicationFactor.THREE,
+          ReplicationType.RATIS, "this is the content");
+    }
+    StorageContainerManager scm = cluster.getStorageContainerManager();
+    ContainerInfo containerInfo = 
scm.getContainerManager().getContainers().get(0);
+    // Checksum file doesn't exist before container close
+    List<HddsDatanodeService> hddsDatanodes = cluster.getHddsDatanodes();
+    for (HddsDatanodeService hddsDatanode : hddsDatanodes) {
+      assertFalse(containerChecksumFileExists(hddsDatanode, containerInfo));
+    }
+
+    // Build expected merkle tree
+    HddsDatanodeService hddsDatanodeService = 
cluster.getHddsDatanodes().get(0);
+    OzoneContainer ozoneContainer = 
hddsDatanodeService.getDatanodeStateMachine().getContainer();
+    KeyValueContainer container = (KeyValueContainer) 
ozoneContainer.getController()
+        .getContainer(containerInfo.getContainerID());
+    ContainerProtos.ContainerMerkleTree expectedMerkleTree = 
buildContainerMerkleTree(
+        container.getContainerData(), cluster.getConf());
+    OzoneTestUtils.closeContainer(scm, containerInfo);

Review Comment:
   This method already waits for the close to finish, so I don't think we need 
to wait again for the checksum file to appear.



##########
hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/keyvalue/KeyValueHandler.java:
##########
@@ -527,6 +529,25 @@ ContainerCommandResponseProto handleCloseContainer(
     return getSuccessResponse(request);
   }
 
+  private void createContainerMerkleTree(KeyValueContainer kvContainer) throws 
IOException {
+    KeyValueContainerData containerData = kvContainer.getContainerData();
+    ContainerMerkleTree merkleTree = new ContainerMerkleTree();
+    try (DBHandle dbHandle = BlockUtils.getDB(containerData, conf);
+         BlockIterator<BlockData> blockIterator = dbHandle.getStore().
+             getBlockIterator(containerData.getContainerID())) {
+      while (blockIterator.hasNext()) {
+        BlockData blockData = blockIterator.nextBlock();
+        List<ContainerProtos.ChunkInfo> chunks = blockData.getChunks();
+        List<ChunkInfo> chunkInfos = new ArrayList<>();
+        for (ContainerProtos.ChunkInfo chunk: chunks) {
+          chunkInfos.add(ChunkInfo.getFromProtoBuf(chunk));
+        }
+        merkleTree.addChunks(blockData.getLocalID(), chunkInfos);

Review Comment:
   We should probably make `ContainerMerkleTree` use 
`ContainerProtos.ChunkInfo` instead of just `ChunkInfo`. Otherwise there is 
unnecessary conversion when generating the tree. The original idea was that you 
could just append chunks as you read them without conversion but I think we 
need to make this change here for it to work in practice.



##########
hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/hdds/scm/TestCloseContainer.java:
##########
@@ -159,11 +183,72 @@ public void testCloseClosedContainer()
     StorageContainerManager scm = cluster.getStorageContainerManager();
     // Pick any container on the cluster and close it via client
     ContainerInfo container = scm.getContainerManager().getContainers().get(0);
+    // Checksum file doesn't exist before container close
+    List<HddsDatanodeService> hddsDatanodes = cluster.getHddsDatanodes();
+    for (HddsDatanodeService hddsDatanode: hddsDatanodes) {
+      assertFalse(containerChecksumFileExists(hddsDatanode, container));
+    }
+    // Close container
     OzoneTestUtils.closeContainer(scm, container);
+
+    // Checksum file exists after container close
+    for (HddsDatanodeService hddsDatanode: hddsDatanodes) {
+      GenericTestUtils.waitFor(() -> 
waitForContainerChecksumToExist(hddsDatanode, container), 100, 5000);
+      assertTrue(containerChecksumFileExists(hddsDatanode, container));
+    }
+
     assertThrows(IOException.class,
         () -> cluster.getStorageContainerLocationClient()
             .closeContainer(container.getContainerID()),
         "Container " + container.getContainerID() + " already closed");
   }
 
+  @Test
+  public void testContainerChecksumForClosedContainer() throws Exception {
+    // Create some keys to write data into the open containers
+    for (int i = 0; i < 10; i++) {
+      TestDataUtil.createKey(bucket, "key" + i, ReplicationFactor.THREE,
+          ReplicationType.RATIS, "this is the content");
+    }
+    StorageContainerManager scm = cluster.getStorageContainerManager();
+    ContainerInfo containerInfo = 
scm.getContainerManager().getContainers().get(0);
+    // Checksum file doesn't exist before container close
+    List<HddsDatanodeService> hddsDatanodes = cluster.getHddsDatanodes();
+    for (HddsDatanodeService hddsDatanode : hddsDatanodes) {
+      assertFalse(containerChecksumFileExists(hddsDatanode, containerInfo));
+    }
+
+    // Build expected merkle tree
+    HddsDatanodeService hddsDatanodeService = 
cluster.getHddsDatanodes().get(0);
+    OzoneContainer ozoneContainer = 
hddsDatanodeService.getDatanodeStateMachine().getContainer();
+    KeyValueContainer container = (KeyValueContainer) 
ozoneContainer.getController()
+        .getContainer(containerInfo.getContainerID());
+    ContainerProtos.ContainerMerkleTree expectedMerkleTree = 
buildContainerMerkleTree(
+        container.getContainerData(), cluster.getConf());

Review Comment:
   Since the test method is basically a copy/paste of the actual method, I'm 
not sure it's doing much for us here. `TestContainerMerkleTree` already has 
tests on the exact tree constructed. At the integration test level it would 
probably be better to test the trees relative to each other and the container 
itself:
   - Test trees of different replicas on the same container are equal.
   - Test trees of different containers are different.
   - Test block and chunk IDs and counts in trees match those in the containers.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to