MarkGaox commented on code in PR #2687:
URL: https://github.com/apache/helix/pull/2687#discussion_r1379449091
##########
helix-core/src/main/java/org/apache/helix/util/InstanceValidationUtil.java:
##########
@@ -451,9 +474,11 @@ public static boolean
siblingNodesActiveReplicaCheck(HelixDataAccessor dataAcces
if (stateByInstanceMap.containsKey(instanceName)) {
int numHealthySiblings = 0;
for (Map.Entry<String, String> entry :
stateByInstanceMap.entrySet()) {
- if (!entry.getKey().equals(instanceName) && (toBeStoppedInstances
== null
- || !toBeStoppedInstances.contains(entry.getKey())) &&
!unhealthyStates.contains(
- entry.getValue())) {
+ String siblingInstanceName = entry.getKey();
+ if (!siblingInstanceName.equals(instanceName) &&
(toBeStoppedInstances == null
+ || !toBeStoppedInstances.contains(siblingInstanceName))
+ && !unhealthyStates.contains(entry.getValue()) &&
!isOperationSetForInstance(
+ dataAccessor, siblingInstanceName)) {
Review Comment:
Sounds good. Will delegate the evacuating node filtering to
`StoppablInstancesSelector`, in which it would go through all instances in the
topology and add evacuating nodes to the `toBeStoppedInstances`
##########
helix-rest/src/main/java/org/apache/helix/rest/clusterMaintenanceService/MaintenanceManagementService.java:
##########
@@ -144,6 +148,26 @@ public MaintenanceManagementService(ZKHelixDataAccessor
dataAccessor,
_namespace = namespace;
}
+ @VisibleForTesting
+ MaintenanceManagementService(MaintenanceManagementServiceBuilder builder) {
Review Comment:
Fixed
##########
helix-rest/src/main/java/org/apache/helix/rest/server/resources/helix/InstancesAccessor.java:
##########
@@ -260,10 +271,44 @@ private Response batchGetStoppableInstances(String
clusterId, JsonNode node, boo
}
}
+ if (node.get(InstancesProperties.stoppable_check_list.name()) != null) {
+ List<String> list = OBJECT_MAPPER.readValue(
+
node.get(InstancesProperties.stoppable_check_list.name()).toString(),
+ OBJECT_MAPPER.getTypeFactory().constructCollectionType(List.class,
String.class));
+
+ if (ALL_STOPPABLE_CHECK_LIST.equals(list)) {
+ stoppableCheckList = HealthCheck.STOPPABLE_CHECK_LIST;
+ } else {
+ try {
+ stoppableCheckList =
+
list.stream().map(HealthCheck::valueOf).collect(Collectors.toList());
+ } catch (IllegalArgumentException e) {
+ String message =
+ "'stoppable_check_list' has invalid check names: " + list
+ + ". Supported checks: " +
HealthCheck.STOPPABLE_CHECK_LIST;
+ _logger.error(message, e);
+ return badRequest(message);
+ }
+ }
+ } else {
+ // By default, if stoppable_check_list is unset, all checks are
performed to maintain
+ // backward compatibility with existing clients.
+ stoppableCheckList = HealthCheck.STOPPABLE_CHECK_LIST;
+ }
Review Comment:
Logic is cleaned up as follow
- If the `STOPPABLE_CHECK_LIST` is not provided, Helix will perform **all
Helix own stoppable checks** by default. And this will help with the backward
compatibility since all existing users would expect Helix to perform all checks.
- If the `STOPPABLE_CHECK_LIST` is provided,
- If it's empty, Helix will skip all Helix own checks.
- If it's not empty, Helix will only perform the given checks and carry
out the checks in the order they are listed.
One more thing to note. In the current code, the HELIX:INSTANCE_NOT_EXIST
will always be performed because I think it doesn't make sense to mark
instances as stoppable if they don't even exist.
Thanks for bringing this issue.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]