[
https://issues.apache.org/jira/browse/FLINK-29317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Fabian Paul updated FLINK-29317:
--------------------------------
Affects Version/s: 1.15.3
(was: 1.15.2)
> Add WebSocket in Dispatcher to support olap query submission and push results
> in session cluster
> ------------------------------------------------------------------------------------------------
>
> Key: FLINK-29317
> URL: https://issues.apache.org/jira/browse/FLINK-29317
> Project: Flink
> Issue Type: Sub-task
> Components: Runtime / Coordination, Runtime / REST
> Affects Versions: 1.14.5, 1.15.3
> Reporter: Shammon
> Priority: Major
>
> Currently, the client submits OLAP queries to a Flink session cluster via the
> HTTP REST API and pulls the results by polling at a fixed interval. The sink
> task in the TaskManager creates a socket server for each query; when the
> JobManager receives a pull request from the client, it requests the query
> results from that socket server. The process is as follows:
> Job submission path:
> client -> http rest -> JobManager -> Sink Socket Server
> Result acquisition path:
> client <- http rest <- JobManager <- Sink Socket Server
> This leads to three problems:
> 1. Submitting jobs through the HTTP REST API incurs some performance loss;
> for example, temporary files are created for each job.
> 2. The client pulls the result data at a fixed time interval, which is a
> fixed cost. A larger interval increases query latency, while a smaller
> interval increases the load on the Dispatcher.
> 3. Each sink task initializes its own socket server, which increases query
> latency and wastes resources.
> For the Flink OLAP scenario, we propose adding the WebSocket protocol to the
> session cluster to support submitting jobs and returning results. The client
> creates and manages a connection to the WebSocket server and submits OLAP
> queries to the session cluster over it. The TaskManagers also create and
> manage connections to the WebSocket server, and the sink task streams results
> to the server. When the JobManager receives results from the sink task, it
> pushes the result data to the client through the connection between them.
> We implemented this feature in ByteDance's internal version of Flink. On
> average, the latency of each query is reduced by about 100 ms, which is a
> significant optimization for OLAP queries.
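
The polling trade-off and the proposed push path described above can be illustrated with a small sketch. This is plain Python, not Flink code: the names `polling_cost`, `sink_task`, `job_manager`, and `run_query` are hypothetical, the `asyncio.Queue` objects only stand in for the WebSocket connections, and the polling model assumes the client polls at exact multiples of the interval after submission.

```python
import asyncio
import math
from typing import List, Tuple


def polling_cost(interval_ms: int, query_runtime_ms: int) -> Tuple[int, int]:
    """Return (extra delay in ms, number of poll requests) for one query.

    Assumption: the client polls at t = interval, 2 * interval, ... after
    submission, and the result becomes available at t = query_runtime_ms.
    """
    polls = max(1, math.ceil(query_runtime_ms / interval_ms))
    delay = polls * interval_ms - query_runtime_ms
    return delay, polls


async def sink_task(relay: asyncio.Queue, rows: List[int]) -> None:
    # Stand-in for a sink task streaming results to the server.
    for row in rows:
        await relay.put(row)
    await relay.put(None)  # hypothetical end-of-stream marker


async def job_manager(relay: asyncio.Queue, client: asyncio.Queue) -> None:
    # Stand-in for the JobManager: forward each result as soon as it
    # arrives instead of waiting for the client to ask for it.
    while True:
        row = await relay.get()
        await client.put(row)
        if row is None:
            return


async def run_query(rows: List[int]) -> List[int]:
    relay: asyncio.Queue = asyncio.Queue()   # sink -> JobManager
    client: asyncio.Queue = asyncio.Queue()  # JobManager -> client
    await asyncio.gather(sink_task(relay, rows), job_manager(relay, client))
    received = []
    while True:
        row = client.get_nowait()
        if row is None:
            break
        received.append(row)
    return received
```

Under these assumptions, `polling_cost(100, 250)` gives 50 ms of extra delay with 3 requests, while `polling_cost(10, 250)` removes the delay at the cost of 25 requests. In the push sketch, the client receives every row with no poll requests at all.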
--
This message was sent by Atlassian Jira
(v8.20.10#820010)