[ https://issues.apache.org/jira/browse/FLINK-31898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17715632#comment-17715632 ]
Kyungmin Kim edited comment on FLINK-31898 at 4/25/23 12:50 AM:
----------------------------------------------------------------
Hi, I found that after upgrading Flink from 1.16.1 to 1.17.0, the
numRecordsIn/OutPerSecond metrics of the source operator doubled.
!image-2023-04-24-16-47-35-697.png|width=554,height=191!
I upgraded on 04/05.
The same issue is reported here:
https://issues.apache.org/jira/browse/FLINK-31752?jql=project%20%3D%20FLINK%20AND%20text%20~%20numrecordsout
I found that in the 'computeTargetDataRate' method on the main branch, the
'inputTargetRate' is multiplied by the 'outputRateMultiplier' value. I think
this might affect the 'TARGET_DATA_RATE' value, which is used to calculate
the scaleFactor, making it bigger than expected.
Again, I'm using the latest main branch, not the 1.4.0 version.
I hope it helps. Thank you!
+ Even with the code that fixes the bug above, the scale factor is halved,
but the autoscaler still scales the parallelism up too high and then scales
down again, repeatedly.
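
For illustration, a minimal sketch of how an extra multiplication would inflate
the scale factor. The relation used here (scale factor = target data rate /
true processing rate) is a simplifying assumption, and every name and the
multiplier value are hypothetical; this is not the operator's actual code. The
input rate and per-task rate come from the setup in the issue description.

```python
def scale_factor(target_data_rate, true_processing_rate):
    """Assumed relation: ratio of required rate to available processing rate."""
    return target_data_rate / true_processing_rate


input_target_rate = 40.0       # records/s reaching the source (from the issue)
output_rate_multiplier = 2.0   # hypothetical value, mirroring the doubled metric
current_parallelism = 4        # hypothetical current parallelism
per_task_rate = 10.0           # records/s per task (from the issue)
true_processing_rate = current_parallelism * per_task_rate

# Target rate with and without the extra multiplication.
inflated = scale_factor(input_target_rate * output_rate_multiplier,
                        true_processing_rate)
expected = scale_factor(input_target_rate, true_processing_rate)

print(f"with multiplier: {inflated}, without: {expected}")
```

Under these assumptions, dropping the multiplier halves the scale factor, which
matches the halving described above but would not by itself explain the
remaining oscillation.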
> Flink k8s autoscaler does not work as expected
> ----------------------------------------------
>
> Key: FLINK-31898
> URL: https://issues.apache.org/jira/browse/FLINK-31898
> Project: Flink
> Issue Type: Bug
> Components: Autoscaler, Kubernetes Operator
> Affects Versions: kubernetes-operator-1.4.0
> Reporter: Kyungmin Kim
> Priority: Major
> Attachments: image-2023-04-24-10-54-58-083.png,
> image-2023-04-24-13-27-17-478.png, image-2023-04-24-13-28-15-462.png,
> image-2023-04-24-13-31-06-420.png, image-2023-04-24-13-41-43-040.png,
> image-2023-04-24-13-42-40-124.png, image-2023-04-24-13-43-49-431.png,
> image-2023-04-24-13-44-17-479.png, image-2023-04-24-14-18-12-450.png,
> image-2023-04-24-16-47-35-697.png
>
>
> Hi, I'm using the Flink k8s autoscaler to automatically deploy jobs with the
> proper parallelism.
> I was using version 1.4, but I found that it does not scale down properly
> because TRUE_PROCESSING_RATE becomes NaN when the tasks are idle.
> On the main branch, I saw that the code was fixed to set TRUE_PROCESSING_RATE
> to positive infinity, which makes the scaleFactor very low, so I'm now
> experimentally running my job with a Docker image built from the main branch
> of the Flink-k8s-operator repository.
> It now scales down properly, but the problem is that it does not converge to
> the optimal parallelism. It scales down well, but then jumps back up to a
> high parallelism.
>
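
A small sketch of why NaN and positive infinity behave so differently in this
kind of calculation. The formula and the scale-down threshold are assumptions
made for illustration (hypothetical names, not the operator's code); only the
floating-point behaviour itself is certain.

```python
import math


def scale_factor(target_data_rate, true_processing_rate):
    """Assumed relation: required rate divided by available processing rate."""
    return target_data_rate / true_processing_rate


def should_scale_down(factor, threshold=1.0):
    """Hypothetical rule: scale down when the factor is below the threshold."""
    return factor < threshold


nan_factor = scale_factor(40.0, math.nan)   # idle task reported as NaN
inf_factor = scale_factor(40.0, math.inf)   # idle task reported as +infinity

# Any comparison with NaN is False, so a NaN factor never triggers scale-down.
print(math.isnan(nan_factor), should_scale_down(nan_factor))
# Dividing by +infinity gives 0.0, so the factor falls below any positive threshold.
print(inf_factor, should_scale_down(inf_factor))
```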
> Below are the experimental setup and a figure showing how the parallelism
> changed.
> * input rate of about 40 RPS
> * each task can process 10 TPS (intentional throttling)
> !image-2023-04-24-10-54-58-083.png|width=999,height=266!
> Even the default configuration leads to the same result. What else can I
> try? Thank you.
>
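
As a rough check of the setup above, the parallelism the job would be expected
to converge to can be estimated from the two stated rates. The helper and its
`target_utilization` parameter are hypothetical and only illustrate the
arithmetic; the real autoscaler takes more inputs into account.

```python
import math


def required_parallelism(input_rate, per_task_rate, target_utilization=1.0):
    """Smallest number of tasks whose combined usable rate covers the input."""
    return math.ceil(input_rate / (per_task_rate * target_utilization))


# 40 records/s in, 10 records/s per task, tasks allowed to run fully loaded.
print(required_parallelism(40.0, 10.0))

# Same load with headroom: each task is only meant to be 70% busy
# (0.7 is an illustrative value, not taken from the issue).
print(required_parallelism(40.0, 10.0, target_utilization=0.7))
```

With the stated rates and no headroom this estimate is 4, so repeated jumps to
a much higher parallelism would point at an inflated input to the calculation
rather than at the workload.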
--
This message was sent by Atlassian Jira
(v8.20.10#820010)