[ 
https://issues.apache.org/jira/browse/FLINK-32756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xiangyu feng updated FLINK-32756:
---------------------------------
    Summary: Reuse ClientHighAvailabilityServices in RestClusterClient when 
submitting OLAP jobs  (was: Reues ZK connections when submitting OLAP jobs to 
Flink session cluster)

> Reuse ClientHighAvailabilityServices in RestClusterClient when submitting 
> OLAP jobs
> -----------------------------------------------------------------------------------
>
>                 Key: FLINK-32756
>                 URL: https://issues.apache.org/jira/browse/FLINK-32756
>             Project: Flink
>          Issue Type: Sub-task
>          Components: Client / Job Submission
>            Reporter: xiangyu feng
>            Priority: Major
>
> In the OLAP scenario, we submit queries to a Flink session cluster through 
> the flink-sql-gateway service. When it receives queries, the gateway service 
> creates sessions to handle them. Each session creates a new 
> RestClusterClient to submit queries and a new ClientHAServices to discover 
> the latest address of the JobManager.
> In our production usage, we have enabled JobManager HA and use 
> ZKClientHAServices for service discovery. Each ZKClientHAServices instance 
> establishes a network connection to ZK and creates four ZK-related threads. 
> When QPS reaches 200, more than 1000 sessions are created in a single 
> flink-sql-gateway instance, which means more than 1000 ZK connections and 
> more than 4000 ZK-related threads exist simultaneously. This poses 
> a significant stability risk in production.
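
The reuse idea named in the summary can be illustrated with a small, self-contained Java sketch. It is only an illustration under stated assumptions, not the change proposed in this issue: FakeClientHaServices and SharedHaServices are hypothetical names, no Flink or ZooKeeper API is called, and the "connection" is just a counter. It shows one reference-counted instance being shared by many sessions, so that 1000 sessions hold one connection instead of 1000.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Illustrative sketch only; none of these types are Flink classes. */
final class SharedClientHaSketch {

    /** Hypothetical stand-in for a client HA service that owns one ZK connection. */
    static final class FakeClientHaServices {
        static final AtomicInteger OPEN_CONNECTIONS = new AtomicInteger();
        private boolean closed;

        FakeClientHaServices() {
            OPEN_CONNECTIONS.incrementAndGet(); // models opening one ZK connection
        }

        synchronized void close() {
            if (!closed) {
                closed = true;
                OPEN_CONNECTIONS.decrementAndGet(); // models closing that connection
            }
        }
    }

    /** Hypothetical holder: creates the service on first acquire, closes it on last release. */
    static final class SharedHaServices {
        private FakeClientHaServices instance;
        private int refCount;

        synchronized FakeClientHaServices acquire() {
            if (refCount == 0) {
                instance = new FakeClientHaServices();
            }
            refCount++;
            return instance;
        }

        synchronized void release() {
            if (refCount == 0) {
                throw new IllegalStateException("release() without matching acquire()");
            }
            refCount--;
            if (refCount == 0) {
                instance.close();
                instance = null;
            }
        }
    }

    public static void main(String[] args) {
        SharedHaServices shared = new SharedHaServices();
        int sessions = 1000; // session count taken from the description above

        for (int i = 0; i < sessions; i++) {
            shared.acquire(); // each session borrows the same instance
        }
        System.out.println("open connections with " + sessions + " sessions: "
                + FakeClientHaServices.OPEN_CONNECTIONS.get());

        for (int i = 0; i < sessions; i++) {
            shared.release(); // the last release closes the shared instance
        }
        System.out.println("open connections after all sessions closed: "
                + FakeClientHaServices.OPEN_CONNECTIONS.get());
    }
}
```

Reference counting is one possible design; a gateway could equally keep a single instance for its whole lifetime. Which approach fits RestClusterClient is a question for the actual patch.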



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
