errose28 commented on code in PR #8893:
URL: https://github.com/apache/ozone/pull/8893#discussion_r2277693421
##########
hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/keyvalue/KeyValueHandler.java:
##########
@@ -1564,6 +1564,19 @@ public void deleteContainer(Container container, boolean force)
   @Override
   public void reconcileContainer(DNContainerOperationClient dnClient, Container<?> container,
       Collection<DatanodeDetails> peers) throws IOException {
+    long containerID = container.getContainerData().getContainerID();
+    try {
+      reconcileContainerInternal(dnClient, container, peers);
+    } finally {
+      // Trigger on demand scanner, which will build the merkle tree based on the newly ingested data.
+      containerSet.scanContainerWithoutGap(containerID, "Container reconciliation");
+      sendICR(container);
Review Comment:
We can leave `sendICR` at the end of `reconcileContainerInternal`. There is no ordering dependency between these two operations, since the scan is async. The initial ICR gives SCM the potentially updated container checksum after reconciliation, based only on the metadata that was ingested. The on-demand scan afterwards will send another ICR if it finds that the data on disk actually has a different checksum than the one computed from the metadata.
The current version also sends an ICR even if reconciliation throws an exception before making any modification to the container. In that case we should let the scanner see what's wrong and decide whether an ICR is necessary.
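Roughly, something like this (untested sketch, assuming `reconcileContainerInternal` keeps the `sendICR` call at its end):
```java
@Override
public void reconcileContainer(DNContainerOperationClient dnClient, Container<?> container,
    Collection<DatanodeDetails> peers) throws IOException {
  long containerID = container.getContainerData().getContainerID();
  try {
    // Ends with sendICR(container), reporting the checksum recomputed from the ingested metadata.
    reconcileContainerInternal(dnClient, container, peers);
  } finally {
    // Scan even if reconciliation failed, so the scanner can see what is wrong
    // and decide whether a follow-up ICR is needed.
    containerSet.scanContainerWithoutGap(containerID, "Container reconciliation");
  }
}
```
That way an exception from reconciliation propagates without an extra ICR, while the async scan still runs and can send its own ICR if needed.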
##########
hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/ec/reconstruction/ECReconstructionCoordinator.java:
##########
@@ -189,6 +189,8 @@ public void reconstructECContainerGroup(long containerID,
     }
     metrics.incReconstructionTotal();
     metrics.incBlockGroupReconstructionTotal(blockLocationInfoMap.size());
+    // Trigger a container scan after successful reconstruction
+    context.getParent().getContainer().getContainerSet().scanContainer(containerID, "EC reconstruction");
Review Comment:
The coordinator does not always create replicas locally. It can also pull data from other replicas, reconstruct the missing chunks, and push them to other nodes. At the end of reconstruction it calls [closeContainer](https://github.com/apache/ozone/blob/9553660a9dadefe6accd85a12af84f9bca5fe3cf/hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/ec/reconstruction/ECReconstructionCoordinator.java#L188) on any peers that were under reconstruction, so we need to put the scan on the receiving end of that call.
The flow looks to be that the container goes `RECOVERING -> CLOSING -> CLOSED`, but the `CLOSING -> CLOSED` transition happens at the same time, in [handleCloseContainer](https://github.com/apache/ozone/blob/9553660a9dadefe6accd85a12af84f9bca5fe3cf/hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/keyvalue/KeyValueHandler.java#L598). Since we can't scan `CLOSING` containers, we probably need something like this in `KeyValueHandler#handleCloseContainer`:
- Save the container's starting state at the top of the method.
- Let the closing and close calls within the method complete.
- Then, if the state we saved was `RECOVERING`, trigger an on-demand scan on the now closed container (see the sketch below).
This should be unit testable in `TestKeyValueHandler`.
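Something along these lines (rough, untested sketch; I'm assuming `Container#getContainerState()` and the `containerSet.scanContainer(containerID, reason)` overload from this PR are what we'd use):
```java
// In KeyValueHandler#handleCloseContainer, before any state changes are made:
final ContainerProtos.ContainerDataProto.State startState = kvContainer.getContainerState();

// ... existing closing / close calls complete, leaving the container CLOSED ...

if (startState == ContainerProtos.ContainerDataProto.State.RECOVERING) {
  // The container was just reconstructed on this datanode. We couldn't scan it while it
  // was CLOSING, so trigger the on-demand scan now that it is CLOSED.
  containerSet.scanContainer(kvContainer.getContainerData().getContainerID(),
      "EC reconstruction");
}
```
That keeps the scan on the datanode that actually received the reconstructed replica, and the `startState == RECOVERING` check should be easy to exercise from `TestKeyValueHandler`.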