xyuanlu commented on code in PR #3043:
URL: https://github.com/apache/helix/pull/3043#discussion_r2162121835


##########
helix-core/src/main/java/org/apache/helix/controller/stages/MessageGenerationPhase.java:
##########
@@ -290,7 +325,82 @@ private void generateMessage(final Resource resource, 
final BaseControllerDataPr
     } // end of for-each-partition
   }
 
-  private boolean shouldCreateSTCancellation(Message pendingMessage, String 
desiredState,
+  /**
+   * Calculate the current active replica count based on state model type.
+   * The count includes replicas in top states, secondary top states (where 
applicable),
+   * and ERROR states since helix considers them active.
+   * State model handling:
+   * - Single-top state models: Differentiates between patterns with and 
without secondary top
+   * states
+   * - ONLINE-OFFLINE: Counts ONLINE + ERROR states only

Review Comment:
   This is a bit confusing.  ONLINE-OFFLINE and ONLINE-STANDBY-OFFLINE is 
normally used in multiple top state. 



##########
helix-core/src/main/java/org/apache/helix/controller/stages/MessageGenerationPhase.java:
##########
@@ -290,7 +325,82 @@ private void generateMessage(final Resource resource, 
final BaseControllerDataPr
     } // end of for-each-partition
   }
 
-  private boolean shouldCreateSTCancellation(Message pendingMessage, String 
desiredState,
+  /**
+   * Calculate the current active replica count based on state model type.
+   * The count includes replicas in top states, secondary top states (where 
applicable),
+   * and ERROR states since helix considers them active.
+   * State model handling:
+   * - Single-top state models: Differentiates between patterns with and 
without secondary top
+   * states
+   * - ONLINE-OFFLINE: Counts ONLINE + ERROR states only
+   * - MASTER-SLAVE-OFFLINE: Counts MASTER + SLAVE + ERROR states
+   * - ONLINE-STANDBY-OFFLINE: Counts ONLINE + STANDBY + ERROR states
+   * - Multi-top state models: Counts only top states + ERROR states

Review Comment:
   In multiple top state, with ONLINE-STANDBY-OFFLINE state model, dont we also 
consider standby as active? 



##########
helix-core/src/main/java/org/apache/helix/controller/stages/MessageGenerationPhase.java:
##########
@@ -163,6 +165,16 @@ private void generateMessage(final Resource resource, 
final BaseControllerDataPr
       // desired-state->list of generated-messages
       Map<String, List<Message>> messageMap = new HashMap<>();
 
+      /*
+       * Calculate the current active replica count based on state model type.
+       * This represents the number of replicas currently serving traffic for 
this partition
+       * Active replicas include: top states, secondary top states excluding 
OFFLINE (if

Review Comment:
   Secondary top state wont return "ERROR" state. I think we should just 
mention what is counted as "active state" - basically what is defined by state 
model in top state and secondary top state, ERROR state and exclude offline and 
dropped state.



##########
helix-core/src/main/java/org/apache/helix/controller/stages/MessageGenerationPhase.java:
##########
@@ -290,7 +325,82 @@ private void generateMessage(final Resource resource, 
final BaseControllerDataPr
     } // end of for-each-partition
   }
 
-  private boolean shouldCreateSTCancellation(Message pendingMessage, String 
desiredState,
+  /**
+   * Calculate the current active replica count based on state model type.
+   * The count includes replicas in top states, secondary top states (where 
applicable),
+   * and ERROR states since helix considers them active.
+   * State model handling:
+   * - Single-top state models: Differentiates between patterns with and 
without secondary top
+   * states
+   * - ONLINE-OFFLINE: Counts ONLINE + ERROR states only
+   * - MASTER-SLAVE-OFFLINE: Counts MASTER + SLAVE + ERROR states
+   * - ONLINE-STANDBY-OFFLINE: Counts ONLINE + STANDBY + ERROR states
+   * - Multi-top state models: Counts only top states + ERROR states
+   * @param currentStateMap Map of instance name to current state for this 
partition, representing
+   *          the actual state of each replica before any pending transitions
+   * @param stateModelDef State model definition containing the state 
hierarchy and transition rules
+   *          used to determine which states are considered active
+   * @return The number of replicas currently in active states, used to 
determine the
+   *         currentActiveReplicaNumber for the partition.
+   */
+  private int calculateCurrentActiveReplicaCount(Map<String, String> 
currentStateMap,
+      StateModelDefinition stateModelDef) {
+    List<String> activeSecondaryTopStates = 
getActiveSecondaryTopStates(stateModelDef);
+    return (int) currentStateMap.values().stream()
+        .filter(state -> stateModelDef.getTopState().contains(state) // Top 
states (MASTER, ONLINE,
+                                                                     // LEADER)
+            || activeSecondaryTopStates.contains(state) // Active secondary 
states (SLAVE, STANDBY,
+                                                      // BOOTSTRAP)
+            || HelixDefinedState.ERROR.name().equals(state) // ERROR states 
(still considered
+                                                            // active)
+        // DROPPED and OFFLINE are automatically excluded by 
getActiveSecondaryTopStates()
+        ).count();
+  }
+
+  /**
+   * Get active secondary top states - states that are not non-serving states 
like OFFLINE and DROPPED.
+   * Reasons for elimination:
+   * - getSecondTopStates() can include OFFLINE as a secondary top state in 
some state models.
+   * Example - OnlineOffline:
+   * getSecondTopStates() = ["OFFLINE"] as it transitions to ONLINE.
+   * After filtering: activeSecondaryTopStates=[] (removes "OFFLINE" as it's 
not a serving state).
+   * @param stateModelDef State model definition containing state hierarchy 
information
+   */
+  private List<String> getActiveSecondaryTopStates(StateModelDefinition 
stateModelDef) {
+    return stateModelDef.getSecondTopStates().stream()
+        // Remove non-serving states
+        .filter(state -> !OFFLINE.equals(state) && 
!HelixDefinedState.DROPPED.name().equals(state))
+        .collect(Collectors.toList());
+  }
+
+  /**
+   * Determines if the given state is considered active based on the state 
model type.
+   * Active states are defined as:
+   * - For single-top state models: top states, active secondary top states, 
and ERROR states
+   * - For multi-top state models: top states and ERROR states
+   * ERROR state replicas are always considered active in HELIX as they do not 
affect
+   * availability.
+   * @param state The state to check (can be current state or target state)
+   * @param stateModelDef State model definition containing state hierarchy 
information
+   * @return true if the state is considered active, false otherwise
+   */
+  private boolean isStateActive(String state, StateModelDefinition 
stateModelDef) {
+    // ERROR state is always considered active regardless of state model type
+    if (HelixDefinedState.ERROR.name().equals(state)) {
+      return true;
+    }
+
+    if (stateModelDef.isSingleTopStateModel()) {
+      // For single-top models, both primary top states and active secondary 
states are considered active
+      return stateModelDef.getTopState().contains(state)
+          || getActiveSecondaryTopStates(stateModelDef).contains(state);
+    } else {
+      // For multi-top models, only top states are considered active
+      return stateModelDef.getTopState().contains(state);

Review Comment:
   Why for multi-top models, only top states are considered active?
    with ONLINE-STANDBY-OFFLINE state model, dont we also consider standby as 
active?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to