Thanks for driving this, Xiangyu!

+1 for the overall proposal. I believe this will enhance the
availability and competitiveness of Flink in OLAP scenarios.

Best,
Yangze Guo

On Tue, Dec 26, 2023 at 4:51 PM xiangyu feng <xiangyu...@gmail.com> wrote:
>
> Hi devs,
>
> I'm opening this thread to discuss FLIP-407: Improve Flink Client
> performance in interactive scenarios. The POC test results and design doc
> can be found at: FLIP-407
> <https://cwiki.apache.org/confluence/display/FLINK/FLIP-407%3A+Improve+Flink+Client+performance+when+interacting+with+dedicated+Flink+Session+Clusters>
> .
>
> Currently, Flink Client is mainly designed for one-time interaction with
> the Flink Cluster. All the resources (HTTP connections, threads, HA
> services) and instances (ClusterDescriptor, ClusterClient, RestClient) are
> created and recycled for each interaction. This works well when users do
> not need to interact frequently with the Flink Cluster, and it also saves
> resources, since they are released immediately after each use.
>
> However, in OLAP or StreamingWarehouse scenarios, users might submit
> interactive jobs to a dedicated Flink Session Cluster very often. In this
> case, we find that short queries that can finish in less than 1s in the
> Flink Cluster still have an E2E latency greater than 2s. Hence, we
> propose this FLIP to improve the Flink Client performance in this scenario.
> This could also improve the user experience when using session debug mode.
>
> The major change in this FLIP is a newly introduced option,
> *'execution.interactive-client'*. When this option is enabled, Flink
> Client will reuse all the necessary resources to improve interactive
> performance, including HA services, HTTP connections, threads and all
> kinds of instances related to a long-running Flink Cluster. The default
> value of this option will be false, in which case Flink Client will
> behave as before.
>
> Also, this FLIP proposes a configurable RetryStrategy for the client
> side when fetching results from the Flink Cluster. In interactive
> scenarios, this can save more than 15% of TM CPU usage without
> performance degradation.
>
> Looking forward to your feedback, thanks.
>
> Best regards,
> Xiangyu
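
To make the retry part of the proposal concrete, here is a minimal sketch of what a configurable retry strategy for result fetching might compute. It is plain Java with no Flink dependency, and the class and method names (FetchRetrySketch, fixedDelays, exponentialDelays) are hypothetical, not taken from FLIP-407 or the Flink code base. It assumes the configurable part is the delay between fetch attempts, which the thread does not spell out.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: these names are illustrative, not part of FLIP-407 or Flink.
public class FetchRetrySketch {

    // Delays (in ms) before each of the first `attempts` retries at a fixed interval.
    public static List<Long> fixedDelays(long delayMs, int attempts) {
        List<Long> delays = new ArrayList<>();
        for (int i = 0; i < attempts; i++) {
            delays.add(delayMs);
        }
        return delays;
    }

    // Delays (in ms) that start at initialMs and double each time, capped at maxMs.
    public static List<Long> exponentialDelays(long initialMs, long maxMs, int attempts) {
        List<Long> delays = new ArrayList<>();
        long current = initialMs;
        for (int i = 0; i < attempts; i++) {
            delays.add(Math.min(current, maxMs));
            current = Math.min(current * 2, maxMs);
        }
        return delays;
    }

    public static void main(String[] args) {
        // Compare the two schedules for five consecutive fetch retries.
        System.out.println(fixedDelays(100, 5));
        System.out.println(exponentialDelays(100, 1000, 5));
    }
}
```

With a capped exponential schedule the client polls quickly at first and backs off for results that take longer to appear. Whether that is the mechanism behind the CPU savings reported above is an assumption of this sketch.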
