[
https://issues.apache.org/jira/browse/FLINK-36535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Rui Fan resolved FLINK-36535.
-----------------------------
Fix Version/s: kubernetes-operator-1.11.0
Resolution: Fixed
Merged to main (1.11.0) via: d9e8cce85499f26ac0129a2f2d13a083d68b5c21
> Optimize the scale down logic based on historical parallelism
> -------------------------------------------------------------
>
> Key: FLINK-36535
> URL: https://issues.apache.org/jira/browse/FLINK-36535
> Project: Flink
> Issue Type: Improvement
> Components: Autoscaler
> Reporter: Rui Fan
> Assignee: Rui Fan
> Priority: Major
> Labels: pull-request-available
> Fix For: kubernetes-operator-1.11.0
>
>
> This is a follow-up to FLINK-36018, which introduced lazy scale down to
> avoid frequent rescaling.
> h1. Proposed Change
> Treat scale-down.interval as a window:
> * Record the scale down trigger time when the recommended parallelism <
> current parallelism.
> ** When the recommended parallelism >= current parallelism, cancel the
> triggered scale down.
> * Execute the scale down when currentTime - triggerTime >
> scale-down.interval.
> ** {color:#de350b}Change1{color}: Use the maximum parallelism within the
> window instead of the latest parallelism when scaling down.
> * {color:#de350b}Change2{color}: Never scale down when currentTime -
> triggerTime < scale-down.interval.
> ** In FLINK-36018, a scale down could still be executed even when
> currentTime - triggerTime < scale-down.interval.
> ** For example, taskA could be scaled down whenever taskB needed to scale up.
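>
> A minimal sketch of this window logic (the class and method names are
> hypothetical, not the operator's actual implementation):
> {code:java}
> import java.time.Duration;
> import java.time.Instant;
>
> public class DelayedScaleDownWindow {
>
>     private final Duration scaleDownInterval;
>
>     private Instant triggerTime;           // set when a scale down is first triggered
>     private int maxRecommendedParallelism; // Change1: peak recommendation within the window
>
>     public DelayedScaleDownWindow(Duration scaleDownInterval) {
>         this.scaleDownInterval = scaleDownInterval;
>     }
>
>     // Returns the parallelism to apply for this evaluation.
>     public int onEvaluation(Instant now, int currentParallelism, int recommendedParallelism) {
>         if (recommendedParallelism >= currentParallelism) {
>             // Cancel any triggered scale down; scale ups are not delayed.
>             triggerTime = null;
>             return recommendedParallelism;
>         }
>         if (triggerTime == null) {
>             // First recommendation below the current parallelism: open the window.
>             triggerTime = now;
>             maxRecommendedParallelism = recommendedParallelism;
>             return currentParallelism;
>         }
>         // Change1: track the maximum recommended parallelism within the window.
>         maxRecommendedParallelism = Math.max(maxRecommendedParallelism, recommendedParallelism);
>         if (Duration.between(triggerTime, now).compareTo(scaleDownInterval) < 0) {
>             // Change2: never scale down before the full interval has elapsed.
>             return currentParallelism;
>         }
>         // Window elapsed: scale down to the window's peak and reset.
>         triggerTime = null;
>         return maxRecommendedParallelism;
>     }
> }
> {code}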
> h1. Background
> Some critical Flink jobs need to scale up promptly, but should only scale
> down on a daily basis. In other words, Flink users do not want a job to be
> scaled down multiple times within 24 hours; the job should keep running at
> the parallelism of its daily peak hours.
> Note: users want a scale down to happen only when even the peak-hour
> parallelism wastes resources. This is a trade-off between downtime and
> resource waste for a critical job.
> h1. Current solution
> In general, this requirement can be met by setting
> {color:#de350b}job.autoscaler.scale-down.interval = 24 hours{color}. Suppose
> taskA runs with parallelism 100 and the recommended parallelism is 100 during
> the peak hours of each day. taskA should then never be rescaled, because the
> triggered scale down is canceled once the recommended parallelism >= current
> parallelism within the 24 hours (this is exactly what FLINK-36018 does).
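>
> As a hedged walk-through with the sketch above (the off-peak recommendation
> of 80 is invented for illustration; only the peak value 100 comes from this
> description):
> {code:java}
> import java.time.Duration;
> import java.time.Instant;
>
> public class TaskAExample {
>     public static void main(String[] args) {
>         // 24-hour window, matching job.autoscaler.scale-down.interval = 24 hours.
>         DelayedScaleDownWindow taskA = new DelayedScaleDownWindow(Duration.ofHours(24));
>
>         // Off-peak: recommended 80 < current 100, a scale down is triggered but not executed.
>         int offPeak = taskA.onEvaluation(Instant.parse("2024-01-01T03:00:00Z"), 100, 80);  // 100
>
>         // Peak hours: recommended 100 >= current 100, the pending scale down is canceled.
>         int peak = taskA.onEvaluation(Instant.parse("2024-01-01T12:00:00Z"), 100, 100);    // 100
>     }
> }
> {code}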
> h1. Unexpected Scenario & How to Solve It
> However, I found that the critical production job is still rescaled about 10
> times every day, even with scale-down.interval set to 24 hours.
> Root cause: a job may have many sources, and the traffic peaks of these
> sources may occur at different times. When taskA triggers a scale down, that
> scale down is not actively executed within 24 hours, but it may still be
> executed as a side effect when other tasks are scaled up.
> For example:
> * The scale down of sourceB and sourceC may be executed when sourceA scales
> up.
> * After a while, the scale down of sourceA and sourceC may be executed when
> sourceB scales up.
> * After a while, the scale down of sourceA and sourceB may be executed when
> sourceC scales up.
> * When there are many tasks, these three steps repeat over and over.
> That is why the job is rescaled about 10 times every day.
> {color:#de350b}Change2{color} of the proposed change solves this issue: never
> scale down when currentTime - triggerTime < scale-down.interval.
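>
> A sketch of how the Change2 guard could look at rescale time (all names are
> illustrative): when another vertex forces a rescale, a vertex with a pending
> scale down keeps its current parallelism unless its own window has fully
> elapsed.
> {code:java}
> import java.time.Duration;
> import java.time.Instant;
>
> public class ScaleDownGuard {
>
>     public static int parallelismAtRescale(
>             Instant now,
>             int currentParallelism,
>             Instant scaleDownTriggerTime,  // null if no scale down is pending
>             int maxRecommendedInWindow,
>             Duration scaleDownInterval) {
>         boolean windowElapsed =
>                 scaleDownTriggerTime != null
>                         && Duration.between(scaleDownTriggerTime, now)
>                                 .compareTo(scaleDownInterval) >= 0;
>         // Before Change2, a pending scale down could piggyback on any rescale,
>         // which is what caused the ~10 rescales per day described above.
>         return windowElapsed ? maxRecommendedInWindow : currentParallelism;
>     }
> }
> {code}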
>
> {color:#de350b}Change1{color}: Use the maximum parallelism within the
> window instead of the latest parallelism when scaling down.
> * This ensures that the parallelism after scaling down matches yesterday's
> peak parallelism.
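>
> Continuing the hypothetical taskA walk-through from above (all numbers are
> invented), Change1 applies the window's peak rather than the latest value:
> {code:java}
> // Within the window the recommendation fluctuates: 90, then 70.
> taskA.onEvaluation(Instant.parse("2024-01-01T15:00:00Z"), 100, 90); // pending, returns 100
> // Once the 24-hour window elapses, the peak of the window (90) is applied,
> // not the latest recommendation (70), i.e. yesterday's peak parallelism.
> taskA.onEvaluation(Instant.parse("2024-01-02T16:00:00Z"), 100, 70); // returns 90
> {code}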