[
https://issues.apache.org/jira/browse/FLINK-38724?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
yuanfenghu updated FLINK-38724:
-------------------------------
Description:
FLINK-36527 introduced the
job.autoscaler.scaling.key-group.partitions.adjust.mode=EVENLY_SPREAD
configuration to solve the problem of uneven data distribution with KeyBy and
Kafka partitions.
The current default value of job.autoscaler.scale-down.max-factor is 0.6, which
means that a vertex can only be scaled down to 60% of its original parallelism
in a single scale-down.
Specific scenario:
- Number of Kafka partitions: 4
- Current parallelism: 4
- Ideal parallelism during trough: 2
- Reduction calculation: 4 × 0.6 = 2.4
- Due to the balanced consumption constraint, 2.4 is adjusted upward to 4
(the parallelism must be an integer that evenly divides the 4 partitions)
- Result: The vertex cannot be scaled down and always remains at a parallelism
of 4
This violates the Autoscaler's goal of reducing resource consumption during
low-traffic periods.
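The arithmetic in the scenario above can be sketched as follows. This is a
minimal illustration that follows the calculation as described in this issue,
not the Autoscaler's actual implementation; the function name and its
parameters are hypothetical.

```python
import math


def evenly_spread_scale_down(current, ideal, num_partitions, max_factor):
    """Hypothetical helper mirroring the scenario's arithmetic.

    Assumption (taken from this issue's description): a single scale-down
    may not go below current * max_factor, and the result must divide the
    partition count evenly.
    """
    # Lowest integer parallelism allowed by the max-factor limit.
    limit = math.ceil(current * max_factor)
    candidate = max(ideal, limit)
    # Adjust upward to the next parallelism that spreads partitions evenly.
    for parallelism in range(candidate, num_partitions + 1):
        if num_partitions % parallelism == 0:
            return parallelism
    # No even divisor at or below the partition count; keep the candidate.
    return candidate


# Default factor 0.6: 4 * 0.6 = 2.4, rounded up to 3, adjusted upward to 4.
stuck = evenly_spread_scale_down(4, 2, 4, 0.6)
# A smaller factor such as 0.4 lets the vertex reach the ideal parallelism.
scaled = evenly_spread_scale_down(4, 2, 4, 0.4)
print(stuck, scaled)
```

With the default factor the vertex stays at 4, while a factor of 0.4 allows it
to reach 2 in this sketch.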
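Provide one of the following two solutions:
Solution 1 (temporary):
Change the global default value of job.autoscaler.scale-down.max-factor to 0.33
or smaller, so that it supports larger scale-downs (for example, from 4 to 2).
Solution 2 (recommended):
Add a per-vertex configuration that allows an independent max-factor value to
be specified for a particular vertex, for example:
job.autoscaler.vertex.<vertex-id>.scale-down.max-factor=0.4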
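One way the proposed per-vertex option could be resolved is sketched below.
This is only an illustration under stated assumptions: the per-vertex key is
the one proposed in this issue and does not exist today, and the function name
and the plain-dict configuration are hypothetical.

```python
GLOBAL_KEY = "job.autoscaler.scale-down.max-factor"
# Key pattern proposed in this issue; not an existing option.
PER_VERTEX_KEY = "job.autoscaler.vertex.{vertex_id}.scale-down.max-factor"


def resolve_max_factor(conf, vertex_id, default=0.6):
    """Return the per-vertex factor if set, else the global one, else the default."""
    per_vertex_key = PER_VERTEX_KEY.format(vertex_id=vertex_id)
    if per_vertex_key in conf:
        return float(conf[per_vertex_key])
    return float(conf.get(GLOBAL_KEY, default))


# "source-1" and "sink-1" are made-up vertex ids for illustration.
conf = {"job.autoscaler.vertex.source-1.scale-down.max-factor": "0.4"}
print(resolve_max_factor(conf, "source-1"))  # per-vertex override applies
print(resolve_max_factor(conf, "sink-1"))    # falls back to the default
```

Falling back to the global value keeps existing jobs unchanged unless a vertex
is configured explicitly.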
was:
FLINK-36527 introduced the
job.autoscaler.scaling.key-group.partitions.adjust.mode=EVENLY_SPREAD
configuration to solve the problem of uneven data distribution with KeyBy and
Kafka partitions.
The current default value of job.autoscaler.scale-down.max-factor is 0.6, which
means that a vertex can only be scaled down to 60% of its original parallelism
in a single scale-down.
Specific scenario:
- Number of Kafka partitions: 4
- Current parallelism: 4
- Ideal parallelism during trough: 2
- Reduction calculation: 4 × 0.6 = 2.4
- Due to the balanced consumption constraint, 2.4 is adjusted upward to 4
(the parallelism must be an integer that evenly divides the 4 partitions)
- Result: The vertex cannot be scaled down and always remains at a parallelism
of 4
This violates the Autoscaler's goal of reducing resource consumption during
low-traffic periods.
> Allow per-vertex configuration of scale-down.max-factor to support balanced
> partition consumption
> -------------------------------------------------------------------------------------------------
>
> Key: FLINK-38724
> URL: https://issues.apache.org/jira/browse/FLINK-38724
> Project: Flink
> Issue Type: Improvement
> Components: Autoscaler
> Affects Versions: kubernetes
> Reporter: yuanfenghu
> Priority: Major
>
> FLINK-36527 introduced the
> job.autoscaler.scaling.key-group.partitions.adjust.mode=EVENLY_SPREAD
> configuration to solve the problem of uneven data distribution with KeyBy and
> Kafka partitions.
>
> The current default value of job.autoscaler.scale-down.max-factor is 0.6,
> which means that a vertex can only be scaled down to 60% of its original
> parallelism in a single scale-down.
>
> Specific scenario:
> - Number of Kafka partitions: 4
> - Current parallelism: 4
> - Ideal parallelism during trough: 2
> - Reduction calculation: 4 × 0.6 = 2.4
> - Due to the balanced consumption constraint, 2.4 is adjusted upward to 4
> (the parallelism must be an integer that evenly divides the 4 partitions)
> - Result: The vertex cannot be scaled down and always remains at a
> parallelism of 4
>
> This violates the Autoscaler's goal of reducing resource consumption during
> low-traffic periods.
>
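> Provide one of the following two solutions:
> Solution 1 (temporary):
> Change the global default value of job.autoscaler.scale-down.max-factor to
> 0.33 or smaller, so that it supports larger scale-downs (for example, from 4
> to 2).
> Solution 2 (recommended):
> Add a per-vertex configuration that allows an independent max-factor value to
> be specified for a particular vertex, for example:
> job.autoscaler.vertex.<vertex-id>.scale-down.max-factor=0.4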
--
This message was sent by Atlassian Jira
(v8.20.10#820010)