kfaraz commented on code in PR #18976:
URL: https://github.com/apache/druid/pull/18976#discussion_r2764058765
##########
docs/ingestion/supervisor.md:
##########
@@ -208,7 +208,10 @@ The following table outlines the configuration properties
related to the `costBa
|`lagWeight`|The weight of extracted lag value in cost function.| No| 0.25|
|`idleWeight`|The weight of extracted poll idle value in cost function. | No |
0.75 |
|`defaultProcessingRate`|A planned processing rate per task, required for
first cost estimations. | No | 1000 |
-|`scaleDownBarrier`| A number of successful scale down attempts which should
be skipped to prevent the auto-scaler from scaling down tasks immediately. |
No | 5 |
+|`useTaskCountBoundaries`|Enables the bounded partitions-per-task window when
selecting task counts.|No|`false`|
Review Comment:
Instead of a boolean flag, should we just make this an integer for the value
of partitions-per-task window?
##########
docs/ingestion/supervisor.md:
##########
@@ -208,7 +208,10 @@ The following table outlines the configuration properties
related to the `costBa
|`lagWeight`|The weight of extracted lag value in cost function.| No| 0.25|
|`idleWeight`|The weight of extracted poll idle value in cost function. | No |
0.75 |
|`defaultProcessingRate`|A planned processing rate per task, required for
first cost estimations. | No | 1000 |
-|`scaleDownBarrier`| A number of successful scale down attempts which should
be skipped to prevent the auto-scaler from scaling down tasks immediately. |
No | 5 |
+|`useTaskCountBoundaries`|Enables the bounded partitions-per-task window when
selecting task counts.|No|`false`|
+|`useBurstScaleOnHeavyLag`|Enables burst scale-up when per-partition lag is
high.|No|`false`|
+|`highLagThreshold`|Per-partition lag threshold that triggers burst scale-up
when `useBurstScaleOnHeavyLag` is enabled.|No|50000|
Review Comment:
Feels like these two can be merged into a single config for simplicity.
Maybe use a config name like `lagThresholdForBurstScaleUp`.
By default, it can be -1 i.e. no burst scaling.
Any value >= 0 would cause burst scaling to get enabled at that lag
threshold.
##########
indexing-service/src/main/java/org/apache/druid/indexing/seekablestream/supervisor/autoscaler/CostBasedAutoScaler.java:
##########
@@ -88,9 +71,11 @@ public class CostBasedAutoScaler implements
SupervisorTaskAutoScaler
private final ServiceMetricEvent.Builder metricBuilder;
private final ScheduledExecutorService autoscalerExecutor;
private final WeightedCostFunction costFunction;
+ private OptimalTaskCountBoundariesPlugin boundariesPlugin = null;
Review Comment:
This is nice! I like the plugin approach.
Couple of suggestions:
- Have each plugin implement a single interface named something like
`CostFunction`.
- The `WeightedCostFunction` could also implement the `CostFunction`.
- The `CostFunction` interface would have a single method:
```java
CostResult computeCost(
CostMetrics metrics,
int proposedTaskCount,
CostBasedAutoScalerConfig config
);
```
- Then we could do something like `f(g(h(weightedCost())))`, where `f`, `g`
and `h` are various plugins on top of the cost function.
```java
CostFunction costFunction = new WeightedCostFunction();
if (burstEnabled) {
costFunction = new BurstFunction(costFunction);
}
if (taskLimitEnabled) {
costFunction = new TaskBoundariesFunction(costFunction);
}
CostResult costResult = costFunction.computeCost();
```
- This approach would allow `WeightedCostFunction` to remain agnostic of all
plugins
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]