[
https://issues.apache.org/jira/browse/FLINK-34726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17828210#comment-17828210
]
Gyula Fora commented on FLINK-34726:
------------------------------------
Thanks for the detailed analysis [~Fei Feng]. You are completely right that we
don't optimise the rest client usage and that this may add significant overhead.
We have done similar optimisation in the past for config access/generation by
using the FlinkResourceContext class.
We could probably move the rest client generation logic there instead of hiding
it completely under the FlinkService. This will, however, be a bigger change as
it will affect the methods of the FlinkService interface as well.
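To illustrate the idea, here is a minimal sketch of a per-reconciliation context that creates a client lazily and reuses it for every call in the same reconcile pass. All names in it (CostlyClient, ReconcileContext) are hypothetical stand-ins, not the actual FlinkResourceContext or RestClusterClient APIs, and it assumes a context is only used from a single reconcile thread.

```java
import java.util.function.Supplier;

/** Hypothetical stand-in for a client that is expensive to create (not Flink's RestClusterClient). */
interface CostlyClient extends AutoCloseable {
    String get(String path);

    @Override
    void close();
}

/**
 * Hypothetical per-reconciliation context. It creates the client on first use
 * and hands out the same instance until the context is closed.
 * Assumption: one context is used by one reconcile thread, so no locking is done.
 */
final class ReconcileContext implements AutoCloseable {
    private final Supplier<CostlyClient> factory;
    private CostlyClient client; // null until the first call to client()

    ReconcileContext(Supplier<CostlyClient> factory) {
        this.factory = factory;
    }

    /** Returns the shared client, creating it only on the first call. */
    CostlyClient client() {
        if (client == null) {
            client = factory.get();
        }
        return client;
    }

    /** Closes the client if one was created; safe to call more than once. */
    @Override
    public void close() {
        if (client != null) {
            client.close();
            client = null;
        }
    }
}
```

With something like this, the observe step could wrap one reconcile pass in try-with-resources and issue all its requests through ctx.client(), so only one client is created and closed per pass.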
It sounds a bit strange that getSecondaryResource is so expensive, as that
should be served from a cache. We should look into why it's expensive in the
first place, because passing the FlinkDeployment objects around will make the
code a bit more complicated, but I guess that could also be hidden under the
FlinkSessionJobContext.
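As a rough sketch of what hiding the lookup behind the context could look like: the snippet below memoizes a lookup so that it runs at most once per reconcile pass, however many callers ask for the result. MemoizedLookup is a hypothetical helper written for illustration, the supplier stands in for the real getSecondaryResource call, and single-threaded use within one reconcile pass is assumed.

```java
import java.util.Optional;
import java.util.function.Supplier;

/**
 * Hypothetical helper: runs the wrapped lookup on the first get() and
 * returns the remembered result afterwards, including an empty result.
 * Assumption: used from a single thread within one reconcile pass.
 */
final class MemoizedLookup<T> {
    private final Supplier<Optional<T>> lookup;
    private Optional<T> cached; // null means "not looked up yet"

    MemoizedLookup(Supplier<Optional<T>> lookup) {
        this.lookup = lookup;
    }

    /** Performs the lookup once; later calls reuse the first result. */
    Optional<T> get() {
        if (cached == null) {
            cached = lookup.get();
        }
        return cached;
    }
}
```

A new MemoizedLookup would be created for each reconcile pass, so the result is never carried over to a later pass where the resource may have changed.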
> Flink Kubernetes Operator has some room for optimizing performance.
> -------------------------------------------------------------------
>
> Key: FLINK-34726
> URL: https://issues.apache.org/jira/browse/FLINK-34726
> Project: Flink
> Issue Type: Improvement
> Components: Kubernetes Operator
> Affects Versions: kubernetes-operator-1.5.0, kubernetes-operator-1.6.0,
> kubernetes-operator-1.7.0
> Reporter: Fei Feng
> Priority: Major
> Attachments: operator_no_submit_no_kill.flamegraph.html
>
>
> When there is a huge number of FlinkDeployment and FlinkSessionJob resources
> in a Kubernetes cluster, there is a significant delay between an event being
> submitted to the reconcile thread pool and the event being processed.
> This is our test: we gave the operator ample resources (CPU: 10 cores,
> memory: 20 GB, reconcile thread pool size: 200) and first deployed 10000 jobs
> (one FlinkDeployment and one FlinkSessionJob per job), then ran job
> submit/delete tests. We found that:
> 1. It took about 2 min from creating a new FlinkDeployment and FlinkSessionJob
> CR in k8s until the Flink job was submitted to the JobManager.
> 2. It took about 1 min from deleting a FlinkDeployment and FlinkSessionJob CR
> until the Flink job and session cluster were cleaned up.
>
> I used async-profiler to capture a flame graph while there was a huge number
> of FlinkDeployment and FlinkSessionJob resources. I found two obvious areas
> for optimization:
> 1. For FlinkDeployment: in the observe step, we call
> AbstractFlinkService.getClusterInfo/listJobs/getTaskManagerInfo. Every time
> we call these methods we create a RestClusterClient, send requests, and close
> it. I think we should reuse the RestClusterClient as much as possible to avoid
> frequently creating objects and to reduce GC pressure.
> 2. For FlinkSessionJob (this issue is more obvious): in the whole reconcile
> loop, we call getSecondaryResource 5 times to get the FlinkDeployment resource
> info. Based on my current understanding of the Flink Operator, I think we do
> not need to call it 5 times in a single reconcile loop; calling it once is
> enough. If so, we could save 30% CPU usage (every getSecondaryResource call
> costs 6% CPU usage).
> [^operator_no_submit_no_kill.flamegraph.html]
> I hope we can discuss solutions to address this problem together. I'm very
> willing to optimize and resolve this issue.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)