mxm commented on code in PR #581:
URL:
https://github.com/apache/flink-kubernetes-operator/pull/581#discussion_r1184795632
##########
flink-kubernetes-operator-autoscaler/src/main/java/org/apache/flink/kubernetes/operator/autoscaler/config/AutoScalerOptions.java:
##########
@@ -98,6 +98,13 @@ private static ConfigOptions.OptionBuilder
autoScalerConfig(String key) {
.withDescription(
"Max scale down factor. 1 means no limit on scale
down, 0.6 means job can only be scaled down with 60% of the original
parallelism.");
+ public static final ConfigOption<Double> MAX_SCALE_UP_FACTOR =
+ autoScalerConfig("scale-up.max-factor")
+ .doubleType()
+ .defaultValue(2.0)
+ .withDescription(
+ "Max scale up factor. 2.0 means job can only be
scaled up with 200% of the original parallelism.");
Review Comment:
```suggestion
"Max scale up factor. 2.0 means job can only be
scaled up with 200% of the current parallelism.");
```
Just to make it clear that it can double every time.
##########
flink-kubernetes-operator-autoscaler/src/test/java/org/apache/flink/kubernetes/operator/autoscaler/BacklogBasedScalingTest.java:
##########
@@ -100,6 +100,7 @@ public void setup() {
defaultConf.set(AutoScalerOptions.CATCH_UP_DURATION,
Duration.ofSeconds(2));
defaultConf.set(AutoScalerOptions.SCALING_ENABLED, true);
defaultConf.set(AutoScalerOptions.MAX_SCALE_DOWN_FACTOR, 1.);
+ defaultConf.set(AutoScalerOptions.MAX_SCALE_UP_FACTOR,
Double.MAX_VALUE);
Review Comment:
In my tests, MAX_VALUE might be problematic because:
```
> Double.MAX_VALUE * 2
Infinity
```
##########
flink-kubernetes-operator-autoscaler/src/main/java/org/apache/flink/kubernetes/operator/autoscaler/config/AutoScalerOptions.java:
##########
@@ -98,6 +98,13 @@ private static ConfigOptions.OptionBuilder
autoScalerConfig(String key) {
.withDescription(
"Max scale down factor. 1 means no limit on scale
down, 0.6 means job can only be scaled down with 60% of the original
parallelism.");
+ public static final ConfigOption<Double> MAX_SCALE_UP_FACTOR =
+ autoScalerConfig("scale-up.max-factor")
+ .doubleType()
+ .defaultValue(2.0)
Review Comment:
I would suggest to set the default to the maximum parallelism, to be able to
scale up from 1 to the max configured parallelism.
##########
flink-kubernetes-operator-autoscaler/src/main/java/org/apache/flink/kubernetes/operator/autoscaler/config/AutoScalerOptions.java:
##########
@@ -98,6 +98,13 @@ private static ConfigOptions.OptionBuilder
autoScalerConfig(String key) {
.withDescription(
"Max scale down factor. 1 means no limit on scale
down, 0.6 means job can only be scaled down with 60% of the original
parallelism.");
+ public static final ConfigOption<Double> MAX_SCALE_UP_FACTOR =
+ autoScalerConfig("scale-up.max-factor")
+ .doubleType()
+ .defaultValue(2.0)
Review Comment:
In our production setup, we would definitely set this to a very high number,
effectively disabling it. The reason is that we have pipelines which sit most
idle (parallelism 1) but then receive huge traffic spikes which need quite high
parallelisms (up to 500) to be able to keep up. The only way we limit the scale
up is by setting a max parallelism.
Being able to scale up quickly is very important to keep up with the backlog
size / growth. If you cap the scale up, you limit the effectiveness of the
scaling. I get where it comes from but generally I think it is curing the
symptoms and not the cause of these spikes.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]