[jira] [Updated] (FLINK-31898) Flink k8s autoscaler does not work as expected

Kyungmin Kim (Jira) Sun, 23 Apr 2023 19:00:45 -0700


     [ https://issues.apache.org/jira/browse/FLINK-31898?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]


Kyungmin Kim updated FLINK-31898:
---------------------------------
    Description: 
Hi, I'm using the Flink k8s autoscaler to automatically deploy jobs with an
appropriate parallelism.

I was using version 1.4, but I found that it does not scale down properly
because TRUE_PROCESSING_RATE becomes NaN when the tasks are idle.
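
To illustrate how a NaN rate can block scale-down, here is a minimal Java
sketch. It is not the operator's code: the class IdleRateSketch, both method
names, and the formula (records processed divided by busy seconds) are
assumptions made purely for illustration. It relies only on two facts about
Java doubles: 0.0 / 0.0 evaluates to NaN, and every ordered comparison with NaN
is false.

```java
public class IdleRateSketch {

    // Hypothetical rate formula (an assumption, not the operator's code):
    // records processed per second of busy time.
    static double trueProcessingRate(double recordsProcessed, double busySeconds) {
        return recordsProcessed / busySeconds;
    }

    // Hypothetical scale-down check: true only when the measured rate
    // exceeds the rate that is actually required.
    static boolean canScaleDown(double rate, double requiredRate) {
        return rate > requiredRate;
    }

    public static void main(String[] args) {
        // An idle task: no records processed, no busy time.
        double idleRate = trueProcessingRate(0.0, 0.0);
        System.out.println(Double.isNaN(idleRate));            // true
        // NaN > 40.0 is false, so the check never allows a scale-down.
        System.out.println(canScaleDown(idleRate, 40.0));      // false
        // With positive infinity the same check passes.
        System.out.println(canScaleDown(Double.POSITIVE_INFINITY, 40.0)); // true
    }
}
```

In this sketch, replacing the NaN with positive infinity is what lets an idle
task look "fast enough" to be scaled down.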

On the main branch, I saw that the code was fixed to set TRUE_PROCESSING_RATE
to positive infinity and make scaleFactor a very low value, so I am now
experimentally running my job with a Docker image built from the main branch of
the Flink-k8s-operator repository.

It now scales down properly, but the problem is that it does not converge to
the optimal parallelism: after scaling down, it jumps back up to a high
parallelism.

 

Below are the experimental setup and a figure showing how the parallelism
changed.
 * about 40 RPS
 * each task can process 10 TPS (intentionally throttled)
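
As a rough sanity check on these numbers, the parallelism the job should settle
at can be estimated by dividing the input rate by the per-task capacity. The
sketch below is only back-of-the-envelope arithmetic: the class and method names
are hypothetical, and the target utilization of 0.7 is an assumed value, not one
stated in this report.

```java
public class ExpectedParallelismSketch {

    // Smallest parallelism whose combined capacity, scaled by the target
    // utilization, covers the input rate.
    static int expectedParallelism(double inputRate, double perTaskRate, double targetUtilization) {
        return (int) Math.ceil(inputRate / (perTaskRate * targetUtilization));
    }

    public static void main(String[] args) {
        // 40 RPS in, 10 TPS per task, tasks fully utilized.
        System.out.println(expectedParallelism(40.0, 10.0, 1.0)); // 4
        // Same load with an assumed target utilization of 0.7.
        System.out.println(expectedParallelism(40.0, 10.0, 0.7)); // 6
    }
}
```

Under these assumptions, a converging autoscaler would be expected to hover
around a parallelism of 4 to 6 for this job.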

!image-2023-04-24-10-54-58-083.png|width=999,height=266!

Even the default configuration leads to the same result. What else can I try?
Thank you.

 


> Flink k8s autoscaler does not work as expected
> ----------------------------------------------
>
>                 Key: FLINK-31898
>                 URL: https://issues.apache.org/jira/browse/FLINK-31898
>             Project: Flink
>          Issue Type: Improvement
>            Reporter: Kyungmin Kim
>            Priority: Major
>         Attachments: image-2023-04-24-10-54-58-083.png
>
>



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
