[
https://issues.apache.org/jira/browse/FLINK-31898?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Kyungmin Kim updated FLINK-31898:
---------------------------------
Description:
Hi, I'm using the Flink k8s autoscaler to automatically deploy jobs with the
proper parallelism.
I was using version 1.4, but I found that it does not scale down properly
because TRUE_PROCESSING_RATE becomes NaN when the tasks are idle.
In the main branch, I saw that the code was fixed to set TRUE_PROCESSING_RATE to
positive infinity, which makes scaleFactor a very low value, so I'm now
experimentally running my job with a docker image built from the main branch of
the Flink-k8s-operator repository.
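To illustrate the NaN problem described above, here is a minimal sketch in plain Java. It is not the operator's actual code: the class name, both helper methods, and the two formulas (rate as records per second divided by busy fraction, scale factor as target rate divided by true processing rate) are assumptions made only for illustration. It shows why a 0/0 division on an idle task yields NaN, and why substituting positive infinity yields a scale factor of zero instead.

```java
public class IdleRateSketch {

    // Hypothetical formula: records processed per second of busy time.
    // An idle task has 0 records and 0 busy time, so this is 0.0 / 0.0.
    static double trueProcessingRate(double recordsPerSec, double busyFraction) {
        return recordsPerSec / busyFraction;
    }

    // Hypothetical formula: how much capacity is needed relative to what exists.
    static double scaleFactor(double targetRate, double trueRate) {
        return targetRate / trueRate;
    }

    public static void main(String[] args) {
        // Idle task: floating-point 0.0 / 0.0 is NaN, not an exception.
        double idleRate = trueProcessingRate(0.0, 0.0);
        System.out.println(Double.isNaN(idleRate));

        // NaN propagates, so no meaningful scaling decision can be derived.
        System.out.println(Double.isNaN(scaleFactor(40.0, idleRate)));

        // With positive infinity instead, the scale factor collapses to 0.0,
        // which pushes the decision toward scaling down.
        System.out.println(scaleFactor(40.0, Double.POSITIVE_INFINITY));
    }
}
```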
It now scales down, but the problem is that it does not converge to the
optimal parallelism. After scaling down, it jumps back up to a high
parallelism.
Below are the experimental setup and a figure of the resulting parallelism changes.
* about 40 RPS
* each task can process 10 TPS (intentionally throttled)
!image-2023-04-24-10-54-58-083.png|width=999,height=266!
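For reference, the parallelism this setup would presumably settle at can be estimated with simple arithmetic. The sketch below is an assumption-laden illustration, not the autoscaler's algorithm: the class and method names are hypothetical, and it ignores any utilization target or headroom the autoscaler may apply.

```java
public class ExpectedParallelismSketch {

    // Hypothetical estimate: enough tasks to cover the input rate,
    // given a fixed per-task processing rate.
    static int requiredParallelism(double inputRatePerSec, double perTaskRatePerSec) {
        return (int) Math.ceil(inputRatePerSec / perTaskRatePerSec);
    }

    public static void main(String[] args) {
        // About 40 RPS of input, each task throttled to 10 TPS.
        System.out.println(requiredParallelism(40.0, 10.0)); // prints 4
    }
}
```

Under these assumptions the job needs roughly 4 parallel tasks, so repeated jumps back to a much higher parallelism are what looks wrong here.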
Even using the default configuration leads to the same result. What else can I do?
Thank you.
> Flink k8s autoscaler does not work as expected
> ----------------------------------------------
>
> Key: FLINK-31898
> URL: https://issues.apache.org/jira/browse/FLINK-31898
> Project: Flink
> Issue Type: Improvement
> Reporter: Kyungmin Kim
> Priority: Major
> Attachments: image-2023-04-24-10-54-58-083.png
>
>
> Hi, I'm using the Flink k8s autoscaler to automatically deploy jobs with the
> proper parallelism.
> I was using version 1.4, but I found that it does not scale down properly
> because TRUE_PROCESSING_RATE becomes NaN when the tasks are idle.
> In the main branch, I saw that the code was fixed to set TRUE_PROCESSING_RATE
> to positive infinity, which makes scaleFactor a very low value, so I'm now
> experimentally running my job with a docker image built from the main branch
> of the Flink-k8s-operator repository.
> It now scales down, but the problem is that it does not converge to the
> optimal parallelism. After scaling down, it jumps back up to a high
> parallelism.
>
> Below are the experimental setup and a figure of the resulting parallelism changes.
> * about 40 RPS
> * each task can process 10 TPS (intentionally throttled)
> !image-2023-04-24-10-54-58-083.png|width=999,height=266!
> Even using the default configuration leads to the same result. What else can I
> do? Thank you.
>