errose28 commented on code in PR #8893:
URL: https://github.com/apache/ozone/pull/8893#discussion_r2277693421


##########
hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/keyvalue/KeyValueHandler.java:
##########
@@ -1564,6 +1564,19 @@ public void deleteContainer(Container container, boolean force)
   @Override
   public void reconcileContainer(DNContainerOperationClient dnClient, Container<?> container,
       Collection<DatanodeDetails> peers) throws IOException {
+    long containerID = container.getContainerData().getContainerID();
+    try {
+      reconcileContainerInternal(dnClient, container, peers);
+    } finally {
+      // Trigger on demand scanner, which will build the merkle tree based on the newly ingested data.
+      containerSet.scanContainerWithoutGap(containerID,
+          "Container reconciliation");
+      sendICR(container);

Review Comment:
   We can leave `sendICR` at the end of `reconcileContainerInternal`. There is no ordering requirement between the two operations since the scan is async. The initial ICR gives SCM the potentially updated container checksum after reconciliation, based only on the metadata that was ingested. If the on-demand scan afterwards finds that the data's actual checksum differs from the one computed from the metadata, it will send another ICR.
   
   The current version sends an ICR even if reconcile throws an exception before making any modification to the container. In that case we should let the scanner see what's wrong and decide whether an ICR is necessary.
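
   A minimal sketch of that shape, assuming `reconcileContainerInternal` already ends with `sendICR(container)` (illustrative only, not the actual implementation):

   ```java
     @Override
     public void reconcileContainer(DNContainerOperationClient dnClient, Container<?> container,
         Collection<DatanodeDetails> peers) throws IOException {
       long containerID = container.getContainerData().getContainerID();
       try {
         // Assumed to end with sendICR(container), so SCM only hears about the
         // updated checksum once the metadata has actually been ingested.
         reconcileContainerInternal(dnClient, container, peers);
       } finally {
         // Always hand the container to the on-demand scanner, even on failure;
         // it can see what went wrong and decide whether another ICR is needed.
         containerSet.scanContainerWithoutGap(containerID, "Container reconciliation");
       }
     }
   ```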



##########
hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/ec/reconstruction/ECReconstructionCoordinator.java:
##########
@@ -189,6 +189,8 @@ public void reconstructECContainerGroup(long containerID,
       }
       metrics.incReconstructionTotal();
       metrics.incBlockGroupReconstructionTotal(blockLocationInfoMap.size());
+      // Trigger a container scan after successful reconstruction
+      context.getParent().getContainer().getContainerSet().scanContainer(containerID, "EC reconstruction");

Review Comment:
   The coordinator does not always create replicas locally. It can also pull data from other replicas, create the missing chunks, and push them to other nodes. At the end of reconstruction it calls [closeContainer](https://github.com/apache/ozone/blob/9553660a9dadefe6accd85a12af84f9bca5fe3cf/hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/ec/reconstruction/ECReconstructionCoordinator.java#L188) on any peers that were under reconstruction, so the scan needs to happen on the receiving end of that call.
   
   The flow looks to be that the container goes `RECOVERING -> CLOSING -> CLOSED`, but both of those transitions happen together in [handleCloseContainer](https://github.com/apache/ozone/blob/9553660a9dadefe6accd85a12af84f9bca5fe3cf/hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/keyvalue/KeyValueHandler.java#L598). Since we can't scan `CLOSING` containers, we probably need something like this in `KeyValueHandler#handleCloseContainer`:
   - Save the container's starting state at the top of the method
   - Let the closing and close calls within the method complete
   - Then, if the saved state was `RECOVERING`, trigger an on-demand scan on the now-closed container (see the sketch below).
   
   This should be unit testable in `TestKeyValueHandler`.
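
   A sketch of that shape (the signature, the state enum accessors, and the hypothetical `closeContainerInternal` helper standing in for the existing close logic are all assumptions, not the current code):

   ```java
     ContainerCommandResponseProto handleCloseContainer(ContainerCommandRequestProto request,
         KeyValueContainer kvContainer) {
       // 1. Save the container's starting state before any transition happens.
       ContainerDataProto.State startState = kvContainer.getContainerData().getState();

       // 2. Run the existing closing/close handling unchanged (CLOSING -> CLOSED).
       ContainerCommandResponseProto response = closeContainerInternal(request, kvContainer);

       // 3. If the container started out RECOVERING, this close is the tail end of an
       //    EC reconstruction, so the now-CLOSED replica can finally be scanned.
       if (startState == ContainerDataProto.State.RECOVERING) {
         containerSet.scanContainer(kvContainer.getContainerData().getContainerID(),
             "EC reconstruction");
       }
       return response;
     }
   ```
   In `TestKeyValueHandler` this could be covered by closing a `RECOVERING` container through the handler and verifying a scan was requested on the container set, versus no scan request when the container started in any other state.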


