xyuanlu commented on code in PR #3043:
URL: https://github.com/apache/helix/pull/3043#discussion_r2162121835
##########
helix-core/src/main/java/org/apache/helix/controller/stages/MessageGenerationPhase.java:
##########
@@ -290,7 +325,82 @@ private void generateMessage(final Resource resource,
final BaseControllerDataPr
} // end of for-each-partition
}
- private boolean shouldCreateSTCancellation(Message pendingMessage, String
desiredState,
+ /**
+ * Calculate the current active replica count based on state model type.
+ * The count includes replicas in top states, secondary top states (where
applicable),
+ * and ERROR states since helix considers them active.
+ * State model handling:
+ * - Single-top state models: Differentiates between patterns with and
without secondary top
+ * states
+ * - ONLINE-OFFLINE: Counts ONLINE + ERROR states only
Review Comment:
This is a bit confusing. ONLINE-OFFLINE and ONLINE-STANDBY-OFFLINE is
normally used in multiple top state.
##########
helix-core/src/main/java/org/apache/helix/controller/stages/MessageGenerationPhase.java:
##########
@@ -290,7 +325,82 @@ private void generateMessage(final Resource resource,
final BaseControllerDataPr
} // end of for-each-partition
}
- private boolean shouldCreateSTCancellation(Message pendingMessage, String
desiredState,
+ /**
+ * Calculate the current active replica count based on state model type.
+ * The count includes replicas in top states, secondary top states (where
applicable),
+ * and ERROR states since helix considers them active.
+ * State model handling:
+ * - Single-top state models: Differentiates between patterns with and
without secondary top
+ * states
+ * - ONLINE-OFFLINE: Counts ONLINE + ERROR states only
+ * - MASTER-SLAVE-OFFLINE: Counts MASTER + SLAVE + ERROR states
+ * - ONLINE-STANDBY-OFFLINE: Counts ONLINE + STANDBY + ERROR states
+ * - Multi-top state models: Counts only top states + ERROR states
Review Comment:
In multiple top state, with ONLINE-STANDBY-OFFLINE state model, dont we also
consider standby as active?
##########
helix-core/src/main/java/org/apache/helix/controller/stages/MessageGenerationPhase.java:
##########
@@ -163,6 +165,16 @@ private void generateMessage(final Resource resource,
final BaseControllerDataPr
// desired-state->list of generated-messages
Map<String, List<Message>> messageMap = new HashMap<>();
+ /*
+ * Calculate the current active replica count based on state model type.
+ * This represents the number of replicas currently serving traffic for
this partition
+ * Active replicas include: top states, secondary top states excluding
OFFLINE (if
Review Comment:
Secondary top state wont return "ERROR" state. I think we should just
mention what is counted as "active state" - basically what is defined by state
model in top state and secondary top state, ERROR state and exclude offline and
dropped state.
##########
helix-core/src/main/java/org/apache/helix/controller/stages/MessageGenerationPhase.java:
##########
@@ -290,7 +325,82 @@ private void generateMessage(final Resource resource,
final BaseControllerDataPr
} // end of for-each-partition
}
- private boolean shouldCreateSTCancellation(Message pendingMessage, String
desiredState,
+ /**
+ * Calculate the current active replica count based on state model type.
+ * The count includes replicas in top states, secondary top states (where
applicable),
+ * and ERROR states since helix considers them active.
+ * State model handling:
+ * - Single-top state models: Differentiates between patterns with and
without secondary top
+ * states
+ * - ONLINE-OFFLINE: Counts ONLINE + ERROR states only
+ * - MASTER-SLAVE-OFFLINE: Counts MASTER + SLAVE + ERROR states
+ * - ONLINE-STANDBY-OFFLINE: Counts ONLINE + STANDBY + ERROR states
+ * - Multi-top state models: Counts only top states + ERROR states
+ * @param currentStateMap Map of instance name to current state for this
partition, representing
+ * the actual state of each replica before any pending transitions
+ * @param stateModelDef State model definition containing the state
hierarchy and transition rules
+ * used to determine which states are considered active
+ * @return The number of replicas currently in active states, used to
determine the
+ * currentActiveReplicaNumber for the partition.
+ */
+ private int calculateCurrentActiveReplicaCount(Map<String, String>
currentStateMap,
+ StateModelDefinition stateModelDef) {
+ List<String> activeSecondaryTopStates =
getActiveSecondaryTopStates(stateModelDef);
+ return (int) currentStateMap.values().stream()
+ .filter(state -> stateModelDef.getTopState().contains(state) // Top
states (MASTER, ONLINE,
+ // LEADER)
+ || activeSecondaryTopStates.contains(state) // Active secondary
states (SLAVE, STANDBY,
+ // BOOTSTRAP)
+ || HelixDefinedState.ERROR.name().equals(state) // ERROR states
(still considered
+ // active)
+ // DROPPED and OFFLINE are automatically excluded by
getActiveSecondaryTopStates()
+ ).count();
+ }
+
+ /**
+ * Get active secondary top states - states that are not non-serving states
like OFFLINE and DROPPED.
+ * Reasons for elimination:
+ * - getSecondTopStates() can include OFFLINE as a secondary top state in
some state models.
+ * Example - OnlineOffline:
+ * getSecondTopStates() = ["OFFLINE"] as it transitions to ONLINE.
+ * After filtering: activeSecondaryTopStates=[] (removes "OFFLINE" as it's
not a serving state).
+ * @param stateModelDef State model definition containing state hierarchy
information
+ */
+ private List<String> getActiveSecondaryTopStates(StateModelDefinition
stateModelDef) {
+ return stateModelDef.getSecondTopStates().stream()
+ // Remove non-serving states
+ .filter(state -> !OFFLINE.equals(state) &&
!HelixDefinedState.DROPPED.name().equals(state))
+ .collect(Collectors.toList());
+ }
+
+ /**
+ * Determines if the given state is considered active based on the state
model type.
+ * Active states are defined as:
+ * - For single-top state models: top states, active secondary top states,
and ERROR states
+ * - For multi-top state models: top states and ERROR states
+ * ERROR state replicas are always considered active in HELIX as they do not
affect
+ * availability.
+ * @param state The state to check (can be current state or target state)
+ * @param stateModelDef State model definition containing state hierarchy
information
+ * @return true if the state is considered active, false otherwise
+ */
+ private boolean isStateActive(String state, StateModelDefinition
stateModelDef) {
+ // ERROR state is always considered active regardless of state model type
+ if (HelixDefinedState.ERROR.name().equals(state)) {
+ return true;
+ }
+
+ if (stateModelDef.isSingleTopStateModel()) {
+ // For single-top models, both primary top states and active secondary
states are considered active
+ return stateModelDef.getTopState().contains(state)
+ || getActiveSecondaryTopStates(stateModelDef).contains(state);
+ } else {
+ // For multi-top models, only top states are considered active
+ return stateModelDef.getTopState().contains(state);
Review Comment:
Why for multi-top models, only top states are considered active?
with ONLINE-STANDBY-OFFLINE state model, dont we also consider standby as
active?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]