bipinprasad commented on a change in pull request #3215: Storm3585 - New 
compact Constraint config including maxNodeCoLocationCnt and 
incompatibleComponents
URL: https://github.com/apache/storm/pull/3215#discussion_r389956774
 
 

 ##########
 File path: 
storm-server/src/main/java/org/apache/storm/scheduler/resource/strategies/scheduling/ConstraintSolverStrategy.java
 ##########
 @@ -593,13 +710,26 @@ public ExecutorDetails currentExec() {
             return execs.get(execIndex);
         }
 
+        /**
+          * Assign executor to worker and node.
+          * TODO: tryToSchedule is a misnomer, since it always schedules.
+          * Assignment validity check is done before the call to 
tryToSchedule().
+          *
+          * @param execToComp Mapping from executor to component name.
+          * @param node RasNode on which to schedule.
+          * @param workerSlot WorkerSlot on which to schedule.
+          */
         public void tryToSchedule(Map<ExecutorDetails, String> execToComp, 
RasNode node, WorkerSlot workerSlot) {
             ExecutorDetails exec = currentExec();
             String comp = execToComp.get(exec);
             LOG.trace("Trying assignment of {} {} to {}", exec, comp, 
workerSlot);
-            //It is possible that this component is already scheduled on this 
node or worker.  If so when we backtrack we cannot remove it
-            okToRemoveFromWorker[execIndex] = 
workerCompAssignment.computeIfAbsent(workerSlot, (k) -> new 
HashSet<>()).add(comp);
-            okToRemoveFromNode[execIndex] = 
nodeCompAssignment.computeIfAbsent(node, (k) -> new HashSet<>()).add(comp);
+            // It is possible that this component is already scheduled on this 
node or worker.  If so when we backtrack we cannot remove it
+            Map<String, Integer> oneMap = 
workerCompAssignmentCnts.computeIfAbsent(workerSlot, (k) -> new HashMap<>());
+            oneMap.put(comp, oneMap.getOrDefault(comp, 0) + 1); // increment 
assignment count
+            okToRemoveFromWorker[execIndex] = true;
 
 Review comment:
   Current (and prior) code always checks before assignment. So let's consider 
that case only for simplicity - that the executor can be assigned to the 
selected worker and node.
   
   If the executor is scheduled in tryToSchedule() then the okToRemoveFromNode 
flag should be flipped to true, so that when backtracking, the component (or 
component count) is properly decremented).
   
   Earlier code would not not remove a component once added (because the flag 
would only be flipped to true once and any subsequent  assignment backtracking 
would not remove the component - leading to inconsistent understanding of the 
current component distribution (and constraint evaluation). So the error will 
manifest when backtracking.
   
   Let's say exec1 and exec2 are both assigning the same component to the same 
worker/node and backtracking happens for exec2 assignment. is exec2 component 
is not properly decremented out of node1.
   
   The assignment (nodeCompAssignment) is a shared variable across executors. 
So there is no indicator which executor assigned. This flag is only retained in 
the okToRemove. 
   
   Hope this makes sense - unless I am confused!

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services

Reply via email to