smengcl commented on a change in pull request #2903:
URL: https://github.com/apache/ozone/pull/2903#discussion_r766455016
##########
File path:
hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/replication/SimpleContainerDownloader.java
##########
@@ -74,38 +74,28 @@ public SimpleContainerDownloader(
public CompletableFuture<Path> getContainerDataFromReplicas(
long containerId, List<DatanodeDetails> sourceDatanodes) {
- CompletableFuture<Path> result = null;
-
final List<DatanodeDetails> shuffledDatanodes =
shuffleDatanodes(sourceDatanodes);
for (DatanodeDetails datanode : shuffledDatanodes) {
try {
- if (result == null) {
- result = downloadContainer(containerId, datanode);
- } else {
-
- result = result.exceptionally(t -> {
- LOG.error("Error on replicating container: " + containerId, t);
- try {
- return downloadContainer(containerId, datanode).get();
- } catch (ExecutionException | IOException e) {
- LOG.error("Error on replicating container: " + containerId,
- e);
- } catch (InterruptedException e) {
- Thread.currentThread().interrupt();
- }
- return null;
- });
Review comment:
> On the first pass of the loop, result will be null so the download is
attempted via the future. However we don't check for completion, so the loop
must go into a second pass.
1. Yes I tried to follow the execution in
`testGetContainerDataFromReplicasHappyPath` with the original code. The test
case is supposed to just return the first datanode (because it's the happy
path), but it continued the loop two more times (for the rest of the two
datanodes) doing nothing.
> On the second pass (assuming there is more than 1 DN). Result will be
non-null, so it will call result.exceptionally(). Reading the java doc for it,
which is confusing to say the least, I think it will "complete" the future if
it was successful, or run the lambda if not. At this stage, the lambda is using
the second DN in the list rather than the first, so the retry will be on the
second DN, not a second try on the first.
2. For the UT, the lambda is only executed in
`testGetContainerDataFromReplicasAsyncFailure`, where `result` is not null, but
the future `Completed **exceptionally**` -- This should mean that the future
has thrown exception asynchronously, like here for this unit test:
https://github.com/apache/ozone/blob/656339e6ad286fd816f330d892ce29ff26ef1b75/hadoop-hdds/container-service/src/test/java/org/apache/hadoop/ozone/container/replication/TestSimpleContainerDownloader.java#L179
> If there is a 3rd DN in the list, I think it will still loop again, as
there is no break in the loop and no return (aside from the one in the lambda),
but this time result is non-null and completed if the second pass worked, so I
am guessing calling exceptionally on it does nothing, but I am not certain.
3. Yes it should do nothing if the future completed successfully. From my
observation in (2), the lambda won't be executed unless the future "Completed
**exceptionally**".
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]