mxm commented on code in PR #581:
URL: https://github.com/apache/flink-kubernetes-operator/pull/581#discussion_r1187695299
##########
flink-kubernetes-operator-autoscaler/src/main/java/org/apache/flink/kubernetes/operator/autoscaler/config/AutoScalerOptions.java:
##########
@@ -98,6 +98,13 @@ private static ConfigOptions.OptionBuilder autoScalerConfig(String key) {
                     .withDescription(
                             "Max scale down factor. 1 means no limit on scale down, 0.6 means job can only be scaled down with 60% of the original parallelism.");
+    public static final ConfigOption<Double> MAX_SCALE_UP_FACTOR =
+            autoScalerConfig("scale-up.max-factor")
+                    .doubleType()
+                    .defaultValue(2.0)
Review Comment:
@X-czh Great comments! You are right that the autoscaler configuration is still quite important for handling load skew in big jobs. Concerning the constraints you listed, we definitely also fall under (1), although we are elastic to a large degree. Even in an elastic cluster, it is a good idea to configure the max parallelism.
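To make that concrete, here is a small sketch (illustration only, not part of this PR) of capping parallelism via configuration. pipeline.max-parallelism is Flink's standard option; the autoscaler vertex-cap key used below is an assumption and should be checked against AutoScalerOptions:

import org.apache.flink.configuration.Configuration;

public class MaxParallelismConfigSketch {
    public static void main(String[] args) {
        Configuration conf = new Configuration();
        // Flink's standard upper bound on parallelism (also the key-group count);
        // runtime parallelism cannot exceed this value.
        conf.setString("pipeline.max-parallelism", "120");
        // The autoscaler also exposes a per-vertex cap. The key below is an assumption;
        // check AutoScalerOptions in the operator for the authoritative name and prefix.
        conf.setString("kubernetes.operator.job.autoscaler.vertex.max-parallelism", "120");
        System.out.println(conf);
    }
}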
Concerning (4), I'd be interested in what you mean by massive failures. Jobs taking other jobs' resources? One of the biggest concerns with the autoscaler currently is the lack of resource pre-allocation. Fortunately, this issue will soon be fixed once we start using Flink's Rescale API in the autoscaler instead of triggering a full redeploy every time.
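For reference, a minimal sketch (not the operator's actual implementation) of how bound factors like the ones in the diff above could clamp a proposed parallelism change. The scale-down bound follows the description quoted in the diff; the symmetric reading of the new scale-up factor is an assumption, since its description is not shown here:

public class ScaleFactorBoundsSketch {

    static int boundedParallelism(
            int currentParallelism,
            int proposedParallelism,
            double maxScaleDownFactor,
            double maxScaleUpFactor) {
        // Per the scale-down description: 0.6 allows shedding at most 60% of the current
        // parallelism, while 1.0 places no limit on scale down.
        int lowerBound = (int) Math.ceil(currentParallelism * (1 - maxScaleDownFactor));
        // Assumed symmetric reading of the new scale-up factor: 2.0 allows adding at most 200%.
        int upperBound = (int) Math.floor(currentParallelism * (1 + maxScaleUpFactor));
        return Math.max(Math.min(proposedParallelism, upperBound), Math.max(lowerBound, 1));
    }

    public static void main(String[] args) {
        // Current parallelism 10, metrics suggest 50, bounds 0.6 down / 2.0 up: clamped to 30.
        System.out.println(boundedParallelism(10, 50, 0.6, 2.0));
    }
}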