[ 
https://issues.apache.org/jira/browse/FLINK-34566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fei Feng updated FLINK-34566:
-----------------------------
    Attachment: image-2024-03-04-11-30-53-118.png

> Flink Kubernetes Operator reconciliation parallelism setting not work
> ---------------------------------------------------------------------
>
>                 Key: FLINK-34566
>                 URL: https://issues.apache.org/jira/browse/FLINK-34566
>             Project: Flink
>          Issue Type: Bug
>          Components: Kubernetes Operator
>    Affects Versions: kubernetes-operator-1.7.0
>            Reporter: Fei Feng
>            Priority: Blocker
>         Attachments: image-2024-03-04-10-58-37-679.png, 
> image-2024-03-04-11-17-22-877.png, image-2024-03-04-11-30-53-118.png
>
>
> After upgrading JOSDK from version 4.3.0 to version 4.4.2 in FLINK-33005, we 
> can no longer increase the reconciliation parallelism: the maximum 
> reconciliation parallelism is 10. As a result, reconciliation of 
> FlinkDeployment and SessionJob resources is delayed by about 10-20 seconds 
> when we run a large-scale Flink session cluster with many Flink jobs.
>  
> After investigating and validating, I found that the cause is that the logic 
> for creating the reconciliation thread pool in JOSDK changed significantly 
> between these two versions. 
> In v4.3.0: 
> the reconciliation thread pool was created as a FixedThreadPool 
> (maximumPoolSize was the same as corePoolSize), so we passed the 
> reconciliation thread count and got a thread pool that matched our 
> expectations.
> !image-2024-03-04-10-58-37-679.png|width=628,height=115!
> [https://github.com/operator-framework/java-operator-sdk/blob/v4.3.0/operator-framework-core/src/main/java/io/javaoperatorsdk/operator/api/config/ConfigurationServiceOverrider.java#L198]
>  
> But in v4.4.2:
> the reconciliation thread pool is created as a custom executor that takes 
> both a corePoolSize and a maximumPoolSize. The problem is that we only set 
> the maximumPoolSize of the thread pool, while the corePoolSize defaults to 
> 10. As a result, the thread pool only ever has 10 threads, and the majority 
> of events sit in the workQueue for a while.  
> !image-2024-03-04-11-17-22-877.png|width=594,height=117!
> https://github.com/operator-framework/java-operator-sdk/blob/v4.4.2/operator-framework-core/src/main/java/io/javaoperatorsdk/operator/api/config/ExecutorServiceManager.java#L37
>  
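The behaviour described above can be reproduced with a plain java.util.concurrent.ThreadPoolExecutor. The following is a minimal sketch, not the JOSDK or operator code: the class name PoolSizeDemo and the helper threadsStarted are hypothetical, and it assumes the pool is backed by an unbounded queue, which is consistent with events piling up in the workQueue. With such a queue, a ThreadPoolExecutor only starts threads beyond corePoolSize when the queue rejects a task, which never happens, so raising maximumPoolSize alone has no effect.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class; it does not use any JOSDK or Flink operator API.
public class PoolSizeDemo {

    // Submits `tasks` blocking tasks and returns how many worker threads
    // the pool had started once all tasks were submitted.
    static int threadsStarted(int corePoolSize, int maximumPoolSize, int tasks)
            throws InterruptedException {
        // Assumption: unbounded work queue, so tasks are never rejected by the queue.
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                corePoolSize, maximumPoolSize, 1, TimeUnit.MINUTES,
                new LinkedBlockingQueue<>());
        CountDownLatch release = new CountDownLatch(1);
        for (int i = 0; i < tasks; i++) {
            pool.execute(() -> {
                try {
                    release.await(); // keep every worker busy until released
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }
        int started = pool.getPoolSize();
        release.countDown(); // let all tasks finish
        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.SECONDS);
        return started;
    }

    public static void main(String[] args) throws InterruptedException {
        // Only maximumPoolSize raised: the pool stays at the core size.
        System.out.println(threadsStarted(10, 50, 200)); // prints 10
        // Core and maximum raised together (fixed-size pool, as in v4.3.0).
        System.out.println(threadsStarted(50, 50, 200)); // prints 50
    }
}
```

In this sketch the second call, where corePoolSize equals maximumPoolSize, behaves like the fixed pool from v4.3.0. This suggests that a fix would need to set the core size together with the maximum, although how that maps onto the operator's configuration is not shown here.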



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
