somandal commented on code in PR #15110:
URL: https://github.com/apache/pinot/pull/15110#discussion_r1968349627
##########
pinot-controller/src/main/java/org/apache/pinot/controller/helix/core/assignment/instance/InstanceAssignmentDriver.java:
##########
@@ -54,41 +54,44 @@ public InstanceAssignmentDriver(TableConfig tableConfig) {
}
public InstancePartitions assignInstances(InstancePartitionsType
instancePartitionsType,
- List<InstanceConfig> instanceConfigs, @Nullable InstancePartitions
existingInstancePartitions) {
+ List<InstanceConfig> instanceConfigs, @Nullable InstancePartitions
existingInstancePartitions,
Review Comment:
recommend adding separate method signatures with the added parameter. Have
the existing one pass 'forceMinimizeDataMovement' as 'false'
e.g.
```
public InstancePartitions assignInstances(InstancePartitionsType
instancePartitionsType,
List<InstanceConfig> instanceConfigs, @Nullable InstancePartitions
existingInstancePartitions) {
return assignInstances(instancePartitionsType, instanceConfigs,
existingInstancePartitions, false);
}
```
That way you only need to modify the Rebalance code to call the newly added
method signatures. Remaining code that doesn't need to pass this flag can
continue using the original signature.
##########
pinot-controller/src/main/java/org/apache/pinot/controller/helix/core/rebalance/TableRebalancer.java:
##########
@@ -185,17 +185,18 @@ private RebalanceResult doRebalance(TableConfig
tableConfig, RebalanceConfig reb
boolean bestEfforts = rebalanceConfig.isBestEfforts();
long externalViewCheckIntervalInMs =
rebalanceConfig.getExternalViewCheckIntervalInMs();
long externalViewStabilizationTimeoutInMs =
rebalanceConfig.getExternalViewStabilizationTimeoutInMs();
+ boolean forceMinimizeDataMovement =
rebalanceConfig.isMinimizeDataMovement();
boolean enableStrictReplicaGroup = tableConfig.getRoutingConfig() != null
&&
RoutingConfig.STRICT_REPLICA_GROUP_INSTANCE_SELECTOR_TYPE.equalsIgnoreCase(
tableConfig.getRoutingConfig().getInstanceSelectorType());
LOGGER.info(
"Start rebalancing table: {} with dryRun: {}, preChecks: {},
reassignInstances: {}, includeConsuming: {}, "
+ "bootstrap: {}, downtime: {}, minReplicasToKeepUpForNoDowntime:
{}, enableStrictReplicaGroup: {}, "
+ "lowDiskMode: {}, bestEfforts: {},
externalViewCheckIntervalInMs: {}, "
- + "externalViewStabilizationTimeoutInMs: {}",
+ + "externalViewStabilizationTimeoutInMs: {},
forceMinimizeDataMovement: {}",
Review Comment:
nit: in the log just use `minimizeDataMovement` as that is the config name
##########
pinot-controller/src/main/java/org/apache/pinot/controller/helix/core/assignment/instance/InstanceAssignmentDriver.java:
##########
@@ -54,41 +54,44 @@ public InstanceAssignmentDriver(TableConfig tableConfig) {
}
public InstancePartitions assignInstances(InstancePartitionsType
instancePartitionsType,
- List<InstanceConfig> instanceConfigs, @Nullable InstancePartitions
existingInstancePartitions) {
+ List<InstanceConfig> instanceConfigs, @Nullable InstancePartitions
existingInstancePartitions,
+ boolean forceMinimizeDataMovement) {
String tableNameWithType = _tableConfig.getTableName();
InstanceAssignmentConfig assignmentConfig =
InstanceAssignmentConfigUtils.getInstanceAssignmentConfig(_tableConfig,
instancePartitionsType);
return getInstancePartitions(
instancePartitionsType.getInstancePartitionsName(TableNameBuilder.extractRawTableName(tableNameWithType)),
- assignmentConfig, instanceConfigs, existingInstancePartitions, null);
+ assignmentConfig, instanceConfigs, existingInstancePartitions, null,
forceMinimizeDataMovement);
}
public InstancePartitions assignInstances(InstancePartitionsType
instancePartitionsType,
List<InstanceConfig> instanceConfigs, @Nullable InstancePartitions
existingInstancePartitions,
- @Nullable InstancePartitions preConfiguredInstancePartitions) {
+ @Nullable InstancePartitions preConfiguredInstancePartitions, boolean
forceMinimizeDataMovement) {
String tableNameWithType = _tableConfig.getTableName();
InstanceAssignmentConfig assignmentConfig =
InstanceAssignmentConfigUtils.getInstanceAssignmentConfig(_tableConfig,
instancePartitionsType);
return getInstancePartitions(
instancePartitionsType.getInstancePartitionsName(TableNameBuilder.extractRawTableName(tableNameWithType)),
- assignmentConfig, instanceConfigs, existingInstancePartitions,
preConfiguredInstancePartitions);
+ assignmentConfig, instanceConfigs, existingInstancePartitions,
preConfiguredInstancePartitions,
+ forceMinimizeDataMovement);
}
public InstancePartitions assignInstances(String tierName,
List<InstanceConfig> instanceConfigs,
- @Nullable InstancePartitions existingInstancePartitions,
InstanceAssignmentConfig instanceAssignmentConfig) {
+ @Nullable InstancePartitions existingInstancePartitions,
InstanceAssignmentConfig instanceAssignmentConfig,
+ boolean forceMinimizeDataMovement) {
return getInstancePartitions(
InstancePartitionsUtils.getInstancePartitionsNameForTier(_tableConfig.getTableName(),
tierName),
- instanceAssignmentConfig, instanceConfigs, existingInstancePartitions,
null);
+ instanceAssignmentConfig, instanceConfigs, existingInstancePartitions,
null, forceMinimizeDataMovement);
}
private InstancePartitions getInstancePartitions(String
instancePartitionsName,
InstanceAssignmentConfig instanceAssignmentConfig, List<InstanceConfig>
instanceConfigs,
@Nullable InstancePartitions existingInstancePartitions,
- @Nullable InstancePartitions preConfiguredInstancePartitions) {
+ @Nullable InstancePartitions preConfiguredInstancePartitions, boolean
forceMinimizeDataMovement) {
String tableNameWithType = _tableConfig.getTableName();
LOGGER.info("Starting {} instance assignment for table {}",
instancePartitionsName, tableNameWithType);
- boolean minimizeDataMovement =
instanceAssignmentConfig.isMinimizeDataMovement();
+ boolean minimizeDataMovement = forceMinimizeDataMovement ||
instanceAssignmentConfig.isMinimizeDataMovement();
Review Comment:
Also, there are some types of `TableConfigs` for which it is not allowed to
set an instance config.
There is a method `InstanceAssignmentConfigUtils::allowInstanceAssignment`
which checks for this. I think it makes sense to only set
`forceMinimizeDataMovement` to `true` if instance assignment config is even
allowed. You can perhaps log a warning if `forceMinimizeDataMovement` is set to
`true` but instance assignment isn't allowed, and just set
`minimizeDataMovement` to `false` in that case
nit: perhaps also add a comment that explains all of the above
cc @Jackie-Jiang @klsince can you folks confirm if my understanding of the
above is correct?
##########
pinot-controller/src/main/java/org/apache/pinot/controller/helix/core/rebalance/RebalanceConfig.java:
##########
@@ -83,6 +83,12 @@ public class RebalanceConfig {
@ApiModelProperty(example = "false")
private boolean _bestEfforts = false;
+ // Whether to enforce Minimal Data Movement Algorithm. If set to false, the
minimizeDataMovement flag in the table
+ // config will be used to determine whether to run the Minimal Data Movement
Algorithm.
+ @JsonProperty("minimizeDataMovement")
+ @ApiModelProperty(example = "true")
+ private boolean _minimizeDataMovement = true;
Review Comment:
Just a thought, should we start with defaulting this to `false`, test it out
in a cluster with a few different scenarios, and then add a new change to
default this to `true` once we confirm it always does the right thing?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]