smengcl commented on a change in pull request #2903:
URL: https://github.com/apache/ozone/pull/2903#discussion_r766455016



##########
File path: 
hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/replication/SimpleContainerDownloader.java
##########
@@ -74,38 +74,28 @@ public SimpleContainerDownloader(
   public CompletableFuture<Path> getContainerDataFromReplicas(
       long containerId, List<DatanodeDetails> sourceDatanodes) {
 
-    CompletableFuture<Path> result = null;
-
     final List<DatanodeDetails> shuffledDatanodes =
         shuffleDatanodes(sourceDatanodes);
 
     for (DatanodeDetails datanode : shuffledDatanodes) {
       try {
-        if (result == null) {
-          result = downloadContainer(containerId, datanode);
-        } else {
-
-          result = result.exceptionally(t -> {
-            LOG.error("Error on replicating container: " + containerId, t);
-            try {
-              return downloadContainer(containerId, datanode).get();
-            } catch (ExecutionException | IOException e) {
-              LOG.error("Error on replicating container: " + containerId,
-                  e);
-            } catch (InterruptedException e) {
-              Thread.currentThread().interrupt();
-            }
-            return null;
-          });

Review comment:
       > On the first pass of the loop, result will be null so the download is 
attempted via the future. However we don't check for completion, so the loop 
must go into a second pass.
   
   1. Yes I tried to follow the execution in 
`testGetContainerDataFromReplicasHappyPath` with the original code. The test 
case is supposed to just return the first datanode (because it's the happy 
path), but it continued the loop two more times (for the rest of the two 
datanodes) doing nothing.
   
   > On the second pass (assuming there is more than 1 DN). Result will be 
non-null, so it will call result.exceptionally(). Reading the java doc for it, 
which is confusing to say the least, I think it will "complete" the future if 
it was successful, or run the lambda if not. At this stage, the lambda is using 
the second DN in the list rather than the first, so the retry will be on the 
second DN, not a second try on the first.
   
   2. For the UT, the lambda if only executed in 
`testGetContainerDataFromReplicasAsyncFailure`, where `result` is not null, but 
the future "Completed **exceptionally**" -- This should mean that the future 
has thrown exception asynchronously, like here for the test case:
   
   
https://github.com/apache/ozone/blob/656339e6ad286fd816f330d892ce29ff26ef1b75/hadoop-hdds/container-service/src/test/java/org/apache/hadoop/ozone/container/replication/TestSimpleContainerDownloader.java#L179
   
   > If there is a 3rd DN in the list, I think it will still loop again, as 
there is no break in the loop and no return (aside from the one in the lambda), 
but this time result is non-null and completed if the second pass worked, so I 
am guessing calling exceptionally on it does nothing, but I am not certain.
   
   3. Yes it should do nothing if the future completed successfully. From my 
observation in (2), the lambda won't be executed unless the future "Completed 
**exceptionally**".




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to