somandal commented on code in PR #15110:
URL: https://github.com/apache/pinot/pull/15110#discussion_r1968349627


##########
pinot-controller/src/main/java/org/apache/pinot/controller/helix/core/assignment/instance/InstanceAssignmentDriver.java:
##########
@@ -54,41 +54,44 @@ public InstanceAssignmentDriver(TableConfig tableConfig) {
   }
 
   public InstancePartitions assignInstances(InstancePartitionsType 
instancePartitionsType,
-      List<InstanceConfig> instanceConfigs, @Nullable InstancePartitions 
existingInstancePartitions) {
+      List<InstanceConfig> instanceConfigs, @Nullable InstancePartitions 
existingInstancePartitions,

Review Comment:
   recommend adding separate method signatures with the added parameter. Have 
the existing one pass 'forceMinimizeDataMovement' as 'false'
   
   e.g.
   
   ```
   public InstancePartitions assignInstances(InstancePartitionsType 
instancePartitionsType,
         List<InstanceConfig> instanceConfigs, @Nullable InstancePartitions 
existingInstancePartitions) {
       return assignInstances(instancePartitionsType, instanceConfigs, 
existingInstancePartitions, false);
   }
   ```
   
   That way you only need to modify the Rebalance code to call the newly added 
method signatures. Remaining code that doesn't need to pass this flag can 
continue using the original signature.



##########
pinot-controller/src/main/java/org/apache/pinot/controller/helix/core/rebalance/TableRebalancer.java:
##########
@@ -185,17 +185,18 @@ private RebalanceResult doRebalance(TableConfig 
tableConfig, RebalanceConfig reb
     boolean bestEfforts = rebalanceConfig.isBestEfforts();
     long externalViewCheckIntervalInMs = 
rebalanceConfig.getExternalViewCheckIntervalInMs();
     long externalViewStabilizationTimeoutInMs = 
rebalanceConfig.getExternalViewStabilizationTimeoutInMs();
+    boolean forceMinimizeDataMovement = 
rebalanceConfig.isMinimizeDataMovement();
     boolean enableStrictReplicaGroup = tableConfig.getRoutingConfig() != null
         && 
RoutingConfig.STRICT_REPLICA_GROUP_INSTANCE_SELECTOR_TYPE.equalsIgnoreCase(
         tableConfig.getRoutingConfig().getInstanceSelectorType());
     LOGGER.info(
         "Start rebalancing table: {} with dryRun: {}, preChecks: {}, 
reassignInstances: {}, includeConsuming: {}, "
             + "bootstrap: {}, downtime: {}, minReplicasToKeepUpForNoDowntime: 
{}, enableStrictReplicaGroup: {}, "
             + "lowDiskMode: {}, bestEfforts: {}, 
externalViewCheckIntervalInMs: {}, "
-            + "externalViewStabilizationTimeoutInMs: {}",
+            + "externalViewStabilizationTimeoutInMs: {}, 
forceMinimizeDataMovement: {}",

Review Comment:
   nit: in the log just use `minimizeDataMovement` as that is the config name



##########
pinot-controller/src/main/java/org/apache/pinot/controller/helix/core/assignment/instance/InstanceAssignmentDriver.java:
##########
@@ -54,41 +54,44 @@ public InstanceAssignmentDriver(TableConfig tableConfig) {
   }
 
   public InstancePartitions assignInstances(InstancePartitionsType 
instancePartitionsType,
-      List<InstanceConfig> instanceConfigs, @Nullable InstancePartitions 
existingInstancePartitions) {
+      List<InstanceConfig> instanceConfigs, @Nullable InstancePartitions 
existingInstancePartitions,
+      boolean forceMinimizeDataMovement) {
     String tableNameWithType = _tableConfig.getTableName();
     InstanceAssignmentConfig assignmentConfig =
         
InstanceAssignmentConfigUtils.getInstanceAssignmentConfig(_tableConfig, 
instancePartitionsType);
     return getInstancePartitions(
         
instancePartitionsType.getInstancePartitionsName(TableNameBuilder.extractRawTableName(tableNameWithType)),
-        assignmentConfig, instanceConfigs, existingInstancePartitions, null);
+        assignmentConfig, instanceConfigs, existingInstancePartitions, null, 
forceMinimizeDataMovement);
   }
 
   public InstancePartitions assignInstances(InstancePartitionsType 
instancePartitionsType,
       List<InstanceConfig> instanceConfigs, @Nullable InstancePartitions 
existingInstancePartitions,
-      @Nullable InstancePartitions preConfiguredInstancePartitions) {
+      @Nullable InstancePartitions preConfiguredInstancePartitions, boolean 
forceMinimizeDataMovement) {
     String tableNameWithType = _tableConfig.getTableName();
     InstanceAssignmentConfig assignmentConfig =
         
InstanceAssignmentConfigUtils.getInstanceAssignmentConfig(_tableConfig, 
instancePartitionsType);
     return getInstancePartitions(
         
instancePartitionsType.getInstancePartitionsName(TableNameBuilder.extractRawTableName(tableNameWithType)),
-        assignmentConfig, instanceConfigs, existingInstancePartitions, 
preConfiguredInstancePartitions);
+        assignmentConfig, instanceConfigs, existingInstancePartitions, 
preConfiguredInstancePartitions,
+        forceMinimizeDataMovement);
   }
 
   public InstancePartitions assignInstances(String tierName, 
List<InstanceConfig> instanceConfigs,
-      @Nullable InstancePartitions existingInstancePartitions, 
InstanceAssignmentConfig instanceAssignmentConfig) {
+      @Nullable InstancePartitions existingInstancePartitions, 
InstanceAssignmentConfig instanceAssignmentConfig,
+      boolean forceMinimizeDataMovement) {
     return getInstancePartitions(
         
InstancePartitionsUtils.getInstancePartitionsNameForTier(_tableConfig.getTableName(),
 tierName),
-        instanceAssignmentConfig, instanceConfigs, existingInstancePartitions, 
null);
+        instanceAssignmentConfig, instanceConfigs, existingInstancePartitions, 
null, forceMinimizeDataMovement);
   }
 
   private InstancePartitions getInstancePartitions(String 
instancePartitionsName,
       InstanceAssignmentConfig instanceAssignmentConfig, List<InstanceConfig> 
instanceConfigs,
       @Nullable InstancePartitions existingInstancePartitions,
-      @Nullable InstancePartitions preConfiguredInstancePartitions) {
+      @Nullable InstancePartitions preConfiguredInstancePartitions, boolean 
forceMinimizeDataMovement) {
     String tableNameWithType = _tableConfig.getTableName();
     LOGGER.info("Starting {} instance assignment for table {}", 
instancePartitionsName, tableNameWithType);
 
-    boolean minimizeDataMovement = 
instanceAssignmentConfig.isMinimizeDataMovement();
+    boolean minimizeDataMovement = forceMinimizeDataMovement || 
instanceAssignmentConfig.isMinimizeDataMovement();

Review Comment:
   Also, there are some types of `TableConfigs` for which it is not allowed to 
set an instance config.
   
   There is a method `InstanceAssignmentConfigUtils::allowInstanceAssignment` 
which checks for this. I think it makes sense to only set 
`forceMinimizeDataMovement` to `true` if instance assignment config is even 
allowed. You can perhaps log a warning if `forceMinimizeDataMovement` is set to 
`true` but instance assignment isn't allowed, and just set 
`minimizeDataMovement` to `false` in that case
   
   nit: perhaps also add a comment that explains all of the above
   
   cc @Jackie-Jiang @klsince can you folks confirm if my understanding of the 
above is correct?



##########
pinot-controller/src/main/java/org/apache/pinot/controller/helix/core/rebalance/RebalanceConfig.java:
##########
@@ -83,6 +83,12 @@ public class RebalanceConfig {
   @ApiModelProperty(example = "false")
   private boolean _bestEfforts = false;
 
+  // Whether to enforce Minimal Data Movement Algorithm. If set to false, the 
minimizeDataMovement flag in the table
+  // config will be used to determine whether to run the Minimal Data Movement 
Algorithm.
+  @JsonProperty("minimizeDataMovement")
+  @ApiModelProperty(example = "true")
+  private boolean _minimizeDataMovement = true;

Review Comment:
   Just a thought, should we start with defaulting this to `false`, test it out 
in a cluster with a few different scenarios, and then add a new change to 
default this to `true` once we confirm it always does the right thing?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to