Hi Pavel,

> For now when there is no data coming into the pipeline (no payload
> going to source), and the pipeline has around 0 CPU load, the Adaptive
> Scaling drops the parallelism of all operators to 1. However, my
> intention would be to have the min-parallelism at 8, so the latency
> will be adequate when the payload appears again.

You could try setting job.autoscaler.vertex.min-parallelism: '8' [1]. It is
the minimum parallelism the autoscaler can use: when the recommended
parallelism is less than 8, the autoscaler will use 8 as the final
parallelism for the Flink job.
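
As a rough sketch, assuming the rest of your flinkConfiguration stays as
it is in your FlinkDeployment, the addition might look like this (only
the min-parallelism line is new, and I have not tested it against your
setup):

```yaml
spec:
  flinkConfiguration:
    job.autoscaler.enabled: 'true'
    job.autoscaler.scaling.enabled: 'true'
    # Lower bound for the parallelism the autoscaler may pick per vertex.
    # Assumption: 8 matches the parallelism.default you already use.
    job.autoscaler.vertex.min-parallelism: '8'
```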

I hope this solves your problem.

[1]
https://nightlies.apache.org/flink/flink-kubernetes-operator-docs-release-1.10/docs/operations/configuration/#autoscaler-configuration

Best,
Rui


On Tue, Jan 7, 2025 at 4:55 PM Pavel Dmitriev <pavel.dmitr...@sinc.de>
wrote:

> Hi,
> because there was no reply on the user mailing list, I am duplicating my
> message to the dev channel, hoping to get some insights into my problem :).
>
> I have a question about Elastic Scaling in Apache Flink.
>
>
> Is there any possibility to set min-parallelism for a pipeline when the
> pipeline keeps silent (no payload) for a long time to minimize the
> latency when the payload appears again?
>
> For now when there is no data coming into the pipeline (no payload
> going to source), and the pipeline has around 0 CPU load, the Adaptive
> Scaling drops the parallelism of all operators to 1. However, my
> intention would be to have the min-parallelism at 8, so the latency
> will be adequate when the payload appears again.
>
> I am trying Elastic Scaling in Adaptive Mode for our Flink pipelines
> with Apache Flink 1.18 and flink-kubernetes-operator 1.9. My scaling
> settings of the FlinkDeployment are:
>
> ```yaml
>
> spec:
>   flinkConfiguration:
>     cluster.evenly-spread-out-slots: 'true'
>     job.autoscaler.catch-up.duration: 1m
>     job.autoscaler.enabled: 'true'
>     job.autoscaler.metrics.window: 1m
>     job.autoscaler.restart.time: 2m
>     job.autoscaler.scaling.enabled: 'true'
>     job.autoscaler.stabilization.interval: 1m
>     job.autoscaler.target.utilization: '0.6'
>     job.autoscaler.target.utilization.boundary: '0.2'
>     jobmanager.scheduler: adaptive
>     parallelism.default: '8'
>     taskmanager.numberOfTaskSlots: '8'
> ```
>
>
> P.S. Our use case is that our pipelines have no payload half of the
> day, but then a lot of data comes in, and our operators inside the
> pipelines are very CPU intensive, which makes them very slow with
> parallelism 1.
>
> Thank you in advance,
> Pavel.
>
