sodonnel commented on code in PR #4694:
URL: https://github.com/apache/ozone/pull/4694#discussion_r1190244821


##########
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/replication/ECUnderReplicationHandler.java:
##########
@@ -124,19 +124,15 @@ public int processAndSendCommands(
       return 0;
     }
 
-    // don't place reconstructed replicas on exclude nodes, since they already
-    // have replicas
-    List<DatanodeDetails> excludedNodes = replicas.stream()
-        .map(ContainerReplica::getDatanodeDetails)
-        .collect(Collectors.toList());
-    // DNs that are already waiting to receive replicas cannot be targets
-    excludedNodes.addAll(
-        pendingOps.stream()
-        .filter(containerReplicaOp -> containerReplicaOp.getOpType() ==
-            ContainerReplicaOp.PendingOpType.ADD)
-        .map(ContainerReplicaOp::getTarget)
-        .collect(Collectors.toList()));
+    ReplicationManagerUtil.ExcludedAndUsedNodes excludedAndUsedNodes =
+        ReplicationManagerUtil.getExcludedAndUsedNodes(
+            new ArrayList<>(replicas), Collections.emptySet(), pendingOps,
+            replicationManager);
+    List<DatanodeDetails> excludedNodes
+        = excludedAndUsedNodes.getExcludedNodes();
     excludedNodes.addAll(replicationManager.getExcludedNodes());

Review Comment:
   The reason we only add `replicationManager.getExcludedNodes()` to the 
excludeList in the EC handler is because of the way the command targets work.
   
   For replication commands, we have the sources, and then select the least 
loaded one to act as the source for the push replication. If they are all 
overloaded we give an error, and we don't consider load on the target right 
now. Hence we don't try to exclude any overloaded nodes as targets, as all our 
load limiting is based on the number of queued commands. For replication we are 
push, so we don't have a limit on the target.
   
   For reconstruction, we have to select target nodes, and one of those targets 
must be the place the command is sent to. Therefore we use the exclude list to 
avoid selecting already overloaded command targets. However, if there is 
decommission or maintenance too, the same exclude list applies to all of the EC 
handler, so decommission and maintenance replicate commands would have their 
targets restricted by this exclude list too.
   
   It is a valid point - should we avoid using "overloaded with too many 
commands" as a target for replication? It would be simple to do it, as we could 
just move that one line of code into that new method. It would also make it 
consistent with how the EC decommission and maintenance behave. Need to think 
about it a bit more.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to