afedulov commented on code in PR #711:
URL: https://github.com/apache/flink-kubernetes-operator/pull/711#discussion_r1399055639
##########
flink-autoscaler/src/main/java/org/apache/flink/autoscaler/JobAutoScalerImpl.java:
##########
@@ -159,19 +161,24 @@ private void runScalingLogic(Context ctx, AutoscalerFlinkMetrics autoscalerMetrics)
             throws Exception {
         var collectedMetrics = metricsCollector.updateMetrics(ctx, stateStore);
+        var jobTopology = collectedMetrics.getJobTopology();
         if (collectedMetrics.getMetricHistory().isEmpty()) {
             return;
         }
         LOG.debug("Collected metrics: {}", collectedMetrics);
-        var evaluatedMetrics = evaluator.evaluate(ctx.getConfiguration(), collectedMetrics);
+        var now = clock.instant();
+        // Scaling tracking data contains previous restart times that are taken into account
+        var scalingTracking = getTrimmedScalingTracking(stateStore, ctx, now);
+        var evaluatedMetrics =
+                evaluator.evaluate(ctx.getConfiguration(), collectedMetrics, scalingTracking);
Review Comment:
I don't think this works with the current metrics scoping, since it would duplicate the restart time per vertex, and we are striving to minimize the size of the config map. Also, instead of checking just one tracking entry, should we iterate over all records of this metric across all vertices and take the maximum? Or should we trust a single observation? If we use just one observation, why would we need this data in every vertex? And if we know it is supposed to be the same for all vertices, why store it at the vertex level?
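
To make the suggestion concrete, here is a minimal sketch of the "take the maximum over all vertices" idea. The class and method names (`RestartTimeAggregation`, `maxRestartTime`) and the plain `Map<String, Duration>` representation of per-vertex observations are purely illustrative and not part of the actual flink-kubernetes-operator API:

```java
import java.time.Duration;
import java.util.Map;

public class RestartTimeAggregation {

    /**
     * Illustrative aggregation: instead of trusting a single per-vertex
     * tracking entry, iterate over the restart-time observations of all
     * vertices and take the maximum. Returns Duration.ZERO when there are
     * no observations.
     */
    static Duration maxRestartTime(Map<String, Duration> restartTimePerVertex) {
        return restartTimePerVertex.values().stream()
                .max(Duration::compareTo)
                .orElse(Duration.ZERO);
    }

    public static void main(String[] args) {
        // Hypothetical observations for three vertices of one job.
        var observations =
                Map.of(
                        "source-vertex", Duration.ofSeconds(45),
                        "window-vertex", Duration.ofSeconds(90),
                        "sink-vertex", Duration.ofSeconds(60));
        System.out.println(maxRestartTime(observations).toSeconds());
    }
}
```

Taking the maximum is conservative (it never underestimates how long a restart may take), which also suggests the value could be stored once at the job level rather than duplicated per vertex, as the comment argues.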
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]