[
https://issues.apache.org/jira/browse/FLINK-34566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Fei Feng updated FLINK-34566:
-----------------------------
Attachment: image-2024-03-04-11-30-53-118.png
> Flink Kubernetes Operator reconciliation parallelism setting not work
> ---------------------------------------------------------------------
>
> Key: FLINK-34566
> URL: https://issues.apache.org/jira/browse/FLINK-34566
> Project: Flink
> Issue Type: Bug
> Components: Kubernetes Operator
> Affects Versions: kubernetes-operator-1.7.0
> Reporter: Fei Feng
> Priority: Blocker
> Attachments: image-2024-03-04-10-58-37-679.png,
> image-2024-03-04-11-17-22-877.png, image-2024-03-04-11-30-53-118.png
>
>
> After upgrading JOSDK from version 4.3.0 to version 4.4.2 in FLINK-33005, we
> can no longer increase the reconciliation parallelism: the effective maximum
> is 10. In an environment with a large-scale Flink session cluster and many
> Flink jobs, this delays FlinkDeployment and SessionJob reconciliation by
> about 10-20 seconds.
>
> After investigating and validating, I found that the cause is a significant
> change in how JOSDK creates the reconciliation thread pool between these two
> versions.
> v4.3.0:
> The reconciliation thread pool was created as a FixedThreadPool
> (maximumPoolSize equal to corePoolSize), so passing the configured
> reconciliation thread count produced a thread pool of the size we expected.
> !image-2024-03-04-10-58-37-679.png|width=628,height=115!
> [https://github.com/operator-framework/java-operator-sdk/blob/v4.3.0/operator-framework-core/src/main/java/io/javaoperatorsdk/operator/api/config/ConfigurationServiceOverrider.java#L198]
>
> But in v4.4.2:
> The reconciliation thread pool is created as a custom executor that takes
> both a corePoolSize and a maximumPoolSize. The problem is that we only set
> the maximumPoolSize, while the corePoolSize defaults to 10. As a result the
> pool only ever runs 10 threads, and the majority of events wait in the
> workQueue for a while.
> !image-2024-03-04-11-17-22-877.png|width=594,height=117!
> https://github.com/operator-framework/java-operator-sdk/blob/v4.4.2/operator-framework-core/src/main/java/io/javaoperatorsdk/operator/api/config/ExecutorServiceManager.java#L37
>
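The behaviour described above can be reproduced with a plain java.util.concurrent.ThreadPoolExecutor, which only starts threads beyond corePoolSize once its work queue rejects a task. The sketch below is not JOSDK or operator code: the class PoolSizeDemo, the helper threadsStarted, the sizes 50 and 200, and the unbounded LinkedBlockingQueue are all assumptions chosen for illustration. Under those assumptions, a pool with corePoolSize 10 and maximumPoolSize 50 stays at 10 threads, while setting both values to 50 yields 50.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

final class PoolSizeDemo {

    // Submits `tasks` blocking tasks and returns how many worker threads
    // the pool actually started while all of them were still pending.
    public static int threadsStarted(int corePoolSize, int maximumPoolSize, int tasks)
            throws InterruptedException {
        // Assumption: an unbounded queue, so offers never fail and the pool
        // never has a reason to grow past corePoolSize.
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                corePoolSize, maximumPoolSize, 1, TimeUnit.MINUTES,
                new LinkedBlockingQueue<>());
        CountDownLatch release = new CountDownLatch(1);
        for (int i = 0; i < tasks; i++) {
            pool.execute(() -> {
                try {
                    release.await(); // keep the worker busy until released
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }
        int started = pool.getPoolSize();
        release.countDown();
        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.SECONDS);
        return started;
    }

    public static void main(String[] args) throws InterruptedException {
        // Only maximumPoolSize raised: the pool stays at the core size.
        System.out.println(threadsStarted(10, 50, 200)); // prints 10
        // Core and maximum raised together: the pool reaches the requested size.
        System.out.println(threadsStarted(50, 50, 200)); // prints 50
    }
}
```

If the operator's executor is built the same way, the configured parallelism would need to reach corePoolSize as well as maximumPoolSize to take effect.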
--
This message was sent by Atlassian Jira
(v8.20.10#820010)