Re: [PR] [FLINK-32002] Adjust autoscaler defaults for release [flink-kubernetes-operator]

via GitHub Tue, 30 Apr 2024 08:06:00 -0700


trystanj commented on code in PR #586:
URL: https://github.com/apache/flink-kubernetes-operator/pull/586#discussion_r1584990017



##########
flink-kubernetes-operator-autoscaler/src/main/java/org/apache/flink/kubernetes/operator/autoscaler/config/AutoScalerOptions.java:
##########
@@ -68,15 +68,16 @@ private static ConfigOptions.OptionBuilder autoScalerConfig(String key) {
     public static final ConfigOption<Double> TARGET_UTILIZATION_BOUNDARY =
             autoScalerConfig("target.utilization.boundary")
                     .doubleType()
-                    .defaultValue(0.1)
+                    .defaultValue(0.4)

Review Comment:
   Thanks, that makes a lot of sense! Is the catch-up data rate determined by 
comparing literal record timestamps against the catch-up duration? E.g. if a 
record was placed in Kafka 10m ago and our expected catch-up duration is 5m, 
are we 5m behind, or are we still 10m behind? Just trying to get a better sense 
of the "catch up" statistics!
   
   Perhaps our problem is that lag is `N/A` for every single job we track 
(operator 1.7, Flink 1.18.1, all using `KafkaSource`), at least according to 
the metrics exposed by the operator itself. If the operator can't see the lag, 
then maybe it can't make an informed decision? I'm wondering whether this is a 
problem in our configuration or whether I'm just way off base. I should expect 
to see values for `LAG_Current`, right?
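
To make the 10m/5m example concrete, here is a minimal sketch of the two readings the question distinguishes. It is illustration only: the class and method names are hypothetical, nothing here is taken from the operator's code, and it does not claim which reading (if either) the autoscaler actually uses.

```java
import java.time.Duration;

// Hypothetical illustration only; not code from the operator.
final class CatchUpReadings {

    // Reading 1: "behind" is simply the age of the oldest unprocessed record.
    static Duration behindByRecordAge(Duration recordAge) {
        return recordAge;
    }

    // Reading 2: "behind" is the record age in excess of the catch-up
    // duration, floored at zero.
    static Duration behindBeyondCatchUp(Duration recordAge, Duration catchUpDuration) {
        Duration excess = recordAge.minus(catchUpDuration);
        return excess.isNegative() ? Duration.ZERO : excess;
    }

    public static void main(String[] args) {
        Duration recordAge = Duration.ofMinutes(10); // record written to Kafka 10m ago
        Duration catchUp = Duration.ofMinutes(5);    // expected catch-up duration
        System.out.println("by record age: "
                + behindByRecordAge(recordAge).toMinutes() + "m");
        System.out.println("beyond catch-up: "
                + behindBeyondCatchUp(recordAge, catchUp).toMinutes() + "m");
    }
}
```

With the numbers from the example, the first reading gives 10m and the second gives 5m.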



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
