jiajunwang commented on a change in pull request #639: Refine the WAGED
rebalancer to minimize the partial rebalance workload.
URL: https://github.com/apache/helix/pull/639#discussion_r352878366
##########
File path:
helix-core/src/main/java/org/apache/helix/controller/rebalancer/waged/model/ClusterModelProvider.java
##########
@@ -43,7 +45,83 @@
*/
public class ClusterModelProvider {
+ private enum RebalanceScopeType {
+ // Set the rebalance scope to cover the difference between the current assignment and the
+ // Baseline assignment only.
+ PARTIAL,
+ // Set the rebalance scope to cover all replicas that need relocation based on the cluster
+ // changes.
+ GLOBAL
+ }
+
+ /**
+ * Generate a new Cluster Model object according to the current cluster status for partial
+ * rebalance. The rebalance scope is configured for recovering the missing replicas only.
+ * @param dataProvider The controller's data cache.
+ * @param resourceMap The full list of the resources to be rebalanced. Note that any
+ * resources that are not in this list will be removed from the
+ * final assignment.
+ * @param activeInstances The active instances that will be used in the calculation.
+ * Note this list can be different from the real active node list
+ * according to the rebalancer logic.
+ * @param baselineAssignment The persisted Baseline assignment.
+ * @param bestPossibleAssignment The persisted Best Possible assignment that was generated in the
+ * previous rebalance.
+ * @return
+ */
+ public static ClusterModel generateClusterModelForPartialRebalance(
+ ResourceControllerDataProvider dataProvider, Map<String, Resource> resourceMap,
+ Set<String> activeInstances, Map<String, ResourceAssignment> baselineAssignment,
+ Map<String, ResourceAssignment> bestPossibleAssignment) {
+ return generateClusterModel(dataProvider, resourceMap, activeInstances, Collections.emptyMap(),
+ baselineAssignment, bestPossibleAssignment, RebalanceScopeType.PARTIAL);
+ }
+
+ /**
+ * Generate a new Cluster Model object according to the current cluster status for the Baseline
+ * calculation. The rebalance scope is determined according to the cluster changes.
+ * @param dataProvider The controller's data cache.
+ * @param resourceMap The full list of the resources to be rebalanced. Note that any
+ * resources that are not in this list will be removed from the
+ * final assignment.
+ * @param activeInstances The active instances that will be used in the calculation.
+ * Note this list can be different from the real active node list
+ * according to the rebalancer logic.
+ * @param clusterChanges All the cluster changes that happened after the previous rebalance.
+ * @param baselineAssignment The persisted Baseline assignment.
+ * @param bestPossibleAssignment The persisted Best Possible assignment that was generated in the
+ * previous rebalance.
+ * @return the new cluster model
+ */
+ public static ClusterModel generateClusterModelForBaseline(
+ ResourceControllerDataProvider dataProvider, Map<String, Resource> resourceMap,
+ Set<String> activeInstances, Map<HelixConstants.ChangeType, Set<String>> clusterChanges,
+ Map<String, ResourceAssignment> baselineAssignment,
+ Map<String, ResourceAssignment> bestPossibleAssignment) {
+ return generateClusterModel(dataProvider, resourceMap, activeInstances, clusterChanges,
+ baselineAssignment, bestPossibleAssignment, RebalanceScopeType.GLOBAL);
+ }
+
+ /**
+ * Generate a cluster model based on the current state output and data cache. The rebalance scope
+ * is configured for recovering the missing replicas only.
+ * @param dataProvider The controller's data cache.
+ * @param resourceMap The full list of the resources to be rebalanced. Note that any
+ * resources that are not in this list will be removed from the
+ * final assignment.
+ * @param existingAssignment The resource assignment built from current state output.
+ * @return the new cluster model
+ */
+ public static ClusterModel generateClusterModelFromExistingAssignment(
+ ResourceControllerDataProvider dataProvider, Map<String, Resource> resourceMap,
+ Map<String, ResourceAssignment> existingAssignment) {
+ return generateClusterModel(dataProvider, resourceMap, dataProvider.getEnabledLiveInstances(),
+ Collections.emptyMap(), Collections.emptyMap(), existingAssignment,
+ RebalanceScopeType.GLOBAL);
Review comment:
This is the tricky part: there are only two sets of computing logic here,
but three methods.
The key difference is whether unknown replicas are ignored. I will update
the generateClusterModelForPartialRebalance description.