Re: [PR] [FLINK-30593][autoscaler] Determine restart time on the fly for Autoscaler [flink-kubernetes-operator]

via GitHub Tue, 14 Nov 2023 08:02:19 -0800


mxm commented on PR #711:
URL: https://github.com/apache/flink-kubernetes-operator/pull/711#issuecomment-1810524690


   Commenting on some of the requests: 
   >     1. Remove the newly introduced configs, this should be automatic and 
always on
   
   I think it is fair to have an ON/OFF switch. It should be on by default, but 
we want to keep the ability to roll back to the old behavior.
   
   >     2. Track the start/end times for the restart in memory and only record 
the `observed_restart_time` in the autoscaler state store. This way we add 
minimal extra state that is easy to implement.
   
   We already have the start time of the last scaling in memory via the scaling 
history. We can then record the end time once we detect that the scaling is 
over. That leaves some error if the operator is down during a rescale, 
because the measured rescaling time will then be too long. I think that should 
be fine though, since we cap at the max configured rescale time. 
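
   A minimal sketch of that measurement, in Python for brevity rather than the
operator's Java. The function and parameter names are hypothetical and not
taken from the PR; it only shows how capping bounds the error introduced by
operator downtime.

```python
from datetime import datetime, timedelta


def observed_restart_duration(
    scaling_start: datetime,
    scaling_end: datetime,
    max_restart_time: timedelta,
) -> timedelta:
    """Duration of one rescale, capped at the configured maximum.

    All names here are illustrative assumptions, not the operator's API.
    """
    elapsed = scaling_end - scaling_start
    # Guard against clock skew producing a negative duration.
    elapsed = max(elapsed, timedelta(0))
    # Operator downtime inflates `elapsed`; the cap bounds that error.
    return min(elapsed, max_restart_time)
```

   With a 5 minute cap, a rescale that appears to take 2 hours because the
operator was down is recorded as 5 minutes.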
    
   >     3. Instead of computing restart time from a fixed number of samples, 
use a simple moving average: `observed_restart_time = (prev_observed + 
new_observed) / 2`
   
   I think we can do an exponentially weighted average.
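
   For illustration, a sketch of an exponentially weighted update, again in
Python with hypothetical names. The smoothing factor `alpha` is an assumption;
with `alpha = 0.5` the update reduces to the `(prev_observed + new_observed) /
2` formula quoted above.

```python
from typing import Optional


def update_observed_restart_time(
    prev_observed: Optional[float],
    new_observed: float,
    alpha: float = 0.5,
) -> float:
    """Exponentially weighted moving average of restart times, in seconds.

    `alpha` is the weight given to the newest sample (hypothetical default).
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    if prev_observed is None:
        # First sample: nothing to average against yet.
        return new_observed
    return alpha * new_observed + (1.0 - alpha) * prev_observed
```

   Only the previous average needs to be stored, so this keeps the extra state
to a single value.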
   
   >     4. During autoscaler logic use: `restart_time = min(conf_restart_time, 
observed_restart_time)`
   
   +1
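
   Combined with the ON/OFF switch from point 1, the selection could look like
the sketch below. It is Python rather than the operator's Java, and the
`use_observed` flag and function name are hypothetical, not actual config keys.

```python
from typing import Optional


def effective_restart_time(
    conf_restart_time: float,
    observed_restart_time: Optional[float],
    use_observed: bool = True,
) -> float:
    """Restart time (seconds) the autoscaler logic would use.

    `use_observed` stands in for the ON/OFF switch (hypothetical name).
    """
    if not use_observed or observed_restart_time is None:
        # Old behavior: rely on the configured value only.
        return conf_restart_time
    # The configured value acts as an upper bound on the observed one.
    return min(conf_restart_time, observed_restart_time)
```

   Turning the flag off, or having no observation yet, falls back to the
configured restart time, which is the old behavior.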
   
   


