[
https://issues.apache.org/jira/browse/FLINK-38724?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
yuanfenghu updated FLINK-38724:
-------------------------------
Description:
FLINK-36527 introduced the
job.autoscaler.scaling.key-group.partitions.adjust.mode=EVENLY_SPREAD
configuration to keep data evenly distributed across subtasks for KeyBy and
Kafka sources.
The current default value of job.autoscaler.scale-down.max-factor is 0.6, which
means that a vertex can only be scaled down to 60% of its original parallelism
in a single scale-down.
Specific scenario:
- Number of Kafka partitions: 4
- Current parallelism: 4
- Ideal parallelism during trough: 2
- Reduction calculation: 4 × 0.6 = 2.4
- Due to balanced consumption constraints, 2.4 is adjusted upward to 4 (the
parallelism must be an integer that divides the 4 partitions evenly, and 3
does not)
- Result: The vertex cannot be scaled down and always remains at a parallelism
of 4
This violates the Autoscaler's goal of reducing resource consumption during
low-traffic periods.
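The arithmetic in this scenario can be sketched as follows. This is only an illustration of the calculation as described in this issue (candidate = current parallelism × max-factor, then adjusted upward to the next divisor of the partition count); the function name is hypothetical and the Autoscaler's actual implementation may differ.

```python
import math


def evenly_spread_parallelism(current: int, max_factor: float, partitions: int) -> int:
    """Hypothetical helper: smallest parallelism that is at least
    current * max_factor and divides the partition count evenly."""
    # Small epsilon guards against floating-point noise before rounding up.
    lower_bound = max(1, math.ceil(current * max_factor - 1e-9))
    for candidate in range(lower_bound, partitions + 1):
        if partitions % candidate == 0:
            return candidate
    # No valid divisor found at or above the bound: fall back to the partition count.
    return partitions


if __name__ == "__main__":
    for factor in (0.6, 0.4, 0.33):
        print(factor, evenly_spread_parallelism(4, factor, 4))
```

With 4 partitions and a current parallelism of 4, a factor of 0.6 gives a lower bound of 3, which does not divide 4, so the result stays at 4. A factor of 0.4 or 0.33 gives a lower bound of 2, which does.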
Proposed solutions (either of the two):
Option 1 (Interim Option):
Adjust the global default job.autoscaler.scale-down.max-factor to a value of
0.33 or less to support greater scale-down (for example, from 4 to 2).
Option 2 (Recommended Option):
Add a per-vertex configuration that allows a separate max-factor value to be
specified for a specific vertex, for example:
job.autoscaler.vertex.<vertex-id>.scale-down.max-factor=0.4
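One way the per-vertex option could resolve its value is to look up the vertex-specific key first and fall back to the global setting. The sketch below assumes the configuration is a plain string-to-string mapping; the helper name and the example vertex id are hypothetical, and the per-vertex key is the one proposed in this issue, not an existing option.

```python
GLOBAL_KEY = "job.autoscaler.scale-down.max-factor"
DEFAULT_MAX_FACTOR = 0.6  # default value stated in this issue


def scale_down_max_factor(conf: dict, vertex_id: str) -> float:
    """Hypothetical lookup: per-vertex override first, then the global value."""
    per_vertex_key = f"job.autoscaler.vertex.{vertex_id}.scale-down.max-factor"
    if per_vertex_key in conf:
        return float(conf[per_vertex_key])
    return float(conf.get(GLOBAL_KEY, DEFAULT_MAX_FACTOR))


if __name__ == "__main__":
    conf = {
        # "source-vertex-1" is a made-up vertex id for illustration.
        "job.autoscaler.vertex.source-vertex-1.scale-down.max-factor": "0.4",
    }
    print(scale_down_max_factor(conf, "source-vertex-1"))
    print(scale_down_max_factor(conf, "other-vertex"))
```

Vertices without an override keep the global behaviour, so only the Kafka source vertex would need the extra key.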
was:
FLINK-36527 introduced the
job.autoscaler.scaling.key-group.partitions.adjust.mode=EVENLY_SPREAD
configuration to keep data evenly distributed across subtasks for KeyBy and
Kafka sources.
The current default value of job.autoscaler.scale-down.max-factor is 0.6, which
means that a vertex can only be scaled down to 60% of its original parallelism
in a single scale-down.
Specific scenario:
- Number of Kafka partitions: 4
- Current parallelism: 4
- Ideal parallelism during trough: 2
- Reduction calculation: 4 × 0.6 = 2.4
- Due to balanced consumption constraints, 2.4 is adjusted upward to 4 (the
parallelism must be an integer that divides the 4 partitions evenly, and 3
does not)
- Result: The vertex cannot be scaled down and always remains at a parallelism
of 4
This violates the Autoscaler's goal of reducing resource consumption during
low-traffic periods.
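Provide one of the following two solutions:
Option 1 (Interim Option):
Adjust the global default job.autoscaler.scale-down.max-factor to a value of
0.33 or less so that it supports a larger scale-down (for example, from 4 to
2).
Option 2 (Recommended Option):
Add a per-vertex configuration that allows a separate max-factor value to be
specified for a specific vertex, for example:
job.autoscaler.vertex.<vertex-id>.scale-down.max-factor=0.4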
> Allow per-vertex configuration of scale-down.max-factor to support balanced
> partition consumption
> -------------------------------------------------------------------------------------------------
>
> Key: FLINK-38724
> URL: https://issues.apache.org/jira/browse/FLINK-38724
> Project: Flink
> Issue Type: Improvement
> Components: Autoscaler
> Affects Versions: kubernetes
> Reporter: yuanfenghu
> Priority: Major
>
> FLINK-36527 introduced the
> job.autoscaler.scaling.key-group.partitions.adjust.mode=EVENLY_SPREAD
> configuration to keep data evenly distributed across subtasks for KeyBy and
> Kafka sources.
>
> The current default value of job.autoscaler.scale-down.max-factor is 0.6,
> which means that a vertex can only be scaled down to 60% of its original
> parallelism in a single scale-down.
>
> Specific scenario:
> - Number of Kafka partitions: 4
> - Current parallelism: 4
> - Ideal parallelism during trough: 2
> - Reduction calculation: 4 × 0.6 = 2.4
> - Due to balanced consumption constraints, 2.4 is adjusted upward to 4 (the
> parallelism must be an integer that divides the 4 partitions evenly, and 3
> does not)
> - Result: The vertex cannot be scaled down and always remains at a
> parallelism of 4
>
> This violates the Autoscaler's goal of reducing resource consumption during
> low-traffic periods.
>
> Proposed solutions (either of the two):
>
> Option 1 (Interim Option):
> Adjust the global default job.autoscaler.scale-down.max-factor to a value of
> 0.33 or less to support greater scale-down (for example, from 4 to 2).
>
> Option 2 (Recommended Option):
> Add a per-vertex configuration that allows a separate max-factor value to be
> specified for a specific vertex, for example:
> job.autoscaler.vertex.<vertex-id>.scale-down.max-factor=0.4