MarkGaox commented on code in PR #2886: URL: https://github.com/apache/helix/pull/2886#discussion_r1727200186
########## helix-rest/src/main/java/org/apache/helix/rest/clusterMaintenanceService/HealthCheck.java: ########## @@ -54,21 +55,24 @@ public enum HealthCheck { /** * Check if instance has error partitions */ - HAS_ERROR_PARTITION, - /** - * Check if all resources hosted on the instance can still meet the min active replica - * constraint if this instance is shutdown - */ - MIN_ACTIVE_REPLICA_CHECK_FAILED; Review Comment: This would break backward compatibility if the user depends the HealthCheck Enum. How about this? let's keep it there. But exclude enum MIN_ACTIVE_REPLICA_CHECK_FAILED before calling `getInstanceHealthStatus()` inside the method `performHelixOwnInstanceCheck()` ``` List<HealthCheck> healthChecksToExecute = new ArrayList<>(HealthCheck.STOPPABLE_CHECK_LIST); healthChecksToExecute.removeAll(_skipStoppableHealthCheckList); healthChecksToExecute.remove(MIN_ACTIVE_REPLICA_CHECK_FAILED) ``` ########## helix-rest/src/main/java/org/apache/helix/rest/clusterMaintenanceService/MaintenanceManagementService.java: ########## @@ -472,9 +472,15 @@ private MaintenanceManagementInstanceInfo takeFreeSingleInstanceHelper(String cl private List<String> batchHelixInstanceStoppableCheck(String clusterId, Collection<String> instances, Map<String, StoppableCheck> finalStoppableChecks, Set<String> toBeStoppedInstances) { + + // Perform all but min_active replicas check in parallel Map<String, Future<StoppableCheck>> helixInstanceChecks = instances.stream().collect( Collectors.toMap(Function.identity(), instance -> POOL.submit( () -> performHelixOwnInstanceCheck(clusterId, instance, toBeStoppedInstances)))); + + // Perform min_active replicas check sequentially + addInstanceMinActiveReplicaCheck(helixInstanceChecks, toBeStoppedInstances); Review Comment: 1. An easier approach that eliminates the need to check futureStoppableCheckByInstance is to perform the MIN_ACTIVE_REPLICA_CHECK_FAILED first. This way, there is no need to pass helixInstanceChecks as a parameter. 2. On the other hand, be sure you check if the MIN_ACTIVE_REPLICA_CHECK_FAILED check is in the `_skipStoppableHealthCheckList`. If it is, the MIN_ACTIVE_REPLICA_CHECK_FAILED check should not be performed. ########## helix-rest/src/main/java/org/apache/helix/rest/clusterMaintenanceService/MaintenanceManagementService.java: ########## @@ -800,6 +802,35 @@ protected Map<String, Boolean> getInstanceHealthStatus(String clusterId, String return healthStatus; } + // Adds the result of the min_active replica check for each stoppable check passed in futureStoppableCheckByInstance + private void addInstanceMinActiveReplicaCheck(Map<String, Future<StoppableCheck>> futureStoppableCheckByInstance, + Set<String> toBeStoppedInstances) { + Set<String> possibleToStopInstances = new HashSet<>(toBeStoppedInstances); + + for (Map.Entry<String, Future<StoppableCheck>> entry : futureStoppableCheckByInstance.entrySet()) { + try { + String instanceName = entry.getKey(); + StoppableCheck stoppableCheck = entry.getValue().get(); Review Comment: Same as above, by placing the min active replica check before submitting the asynchronous checks, there is no longer a need to check the future and get the value. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr...@helix.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: reviews-unsubscr...@helix.apache.org For additional commands, e-mail: reviews-h...@helix.apache.org