Hi all,

Thanks for the comments.

If there are no further comments, we will open the voting thread next week.

Regards,
Xiangyu

Zhanghao Chen <zhanghao.c...@outlook.com> wrote on Wed, Jan 3, 2024 at 16:46:

> Thanks for driving this effort on improving the interactive use experience
> of Flink. The proposal overall looks good to me.
>
> Best,
> Zhanghao Chen
> ________________________________
> From: xiangyu feng <xiangyu...@gmail.com>
> Sent: Tuesday, December 26, 2023 16:51
> To: dev@flink.apache.org <dev@flink.apache.org>
> Subject: [Discuss] FLIP-407: Improve Flink Client performance in
> interactive scenarios
>
> Hi devs,
>
> I'm opening this thread to discuss FLIP-407: Improve Flink Client
> performance in interactive scenarios. The POC test results and design doc
> can be found at: FLIP-407
> <https://cwiki.apache.org/confluence/display/FLINK/FLIP-407%3A+Improve+Flink+Client+performance+when+interacting+with+dedicated+Flink+Session+Clusters>.
>
> Currently, Flink Client is mainly designed for one-time interaction with
> the Flink Cluster. All the resources (HTTP connections, threads, HA
> services) and instances (ClusterDescriptor, ClusterClient, RestClient) are
> created and recycled for each interaction. This works well when users do
> not need to interact frequently with the Flink Cluster, and it also saves
> resources, since they are recycled immediately after each use.
>
> However, in OLAP or StreamingWarehouse scenarios, users might submit
> interactive jobs to a dedicated Flink Session Cluster very often. In this
> case, we find that short queries that finish in less than 1s in the Flink
> Cluster still have an E2E latency greater than 2s. Hence, we propose this
> FLIP to improve the Flink Client performance in this scenario. This could
> also improve the user experience when using session debug mode.
>
> The major change in this FLIP is a newly introduced option,
> *'execution.interactive-client'*. When this option is enabled, Flink
> Client will reuse all the necessary resources to improve interactive
> performance, including: HA services, HTTP connections, threads and all
> kinds of instances related to a long-running Flink Cluster. The default
> value of this option will be false, in which case Flink Client will behave
> as before.
>
> Also, this FLIP proposes a configurable RetryStrategy for fetching results
> from the Flink Cluster on the client side. In interactive scenarios, this
> can save more than 15% of TM CPU usage without performance degradation.
>
> Looking forward to your feedback, thanks.
>
> Best regards,
> Xiangyu
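
For readers following the thread, the configurable retry idea mentioned above can be pictured roughly as follows. This is only a minimal sketch and is not taken from the FLIP or from Flink's code base: the names `ResultPollingSketch`, `BackoffPolicy` and `pollUntilReady`, as well as the doubling-with-cap delay scheme, are assumptions made for illustration, and the RetryStrategy proposed in FLIP-407 may look quite different. The point is just that waiting longer between empty fetches reduces the number of requests the cluster has to serve.

```java
import java.time.Duration;
import java.util.Optional;
import java.util.function.Supplier;

// Illustrative only: hypothetical names, not Flink's actual client API.
final class ResultPollingSketch {

    /** Hypothetical policy: the delay doubles after each empty fetch, up to a cap. */
    static final class BackoffPolicy {
        final Duration initialDelay;
        final Duration maxDelay;
        final int maxAttempts;

        BackoffPolicy(Duration initialDelay, Duration maxDelay, int maxAttempts) {
            this.initialDelay = initialDelay;
            this.maxDelay = maxDelay;
            this.maxAttempts = maxAttempts;
        }

        /** Delay to wait after the given 0-based attempt came back empty. */
        Duration delayForAttempt(int attempt) {
            // Cap the shift so the multiplication cannot overflow for large attempts.
            long factor = 1L << Math.min(attempt, 20);
            long millis = Math.min(initialDelay.toMillis() * factor, maxDelay.toMillis());
            return Duration.ofMillis(millis);
        }
    }

    /**
     * Calls fetch until it returns a value or the attempts are exhausted,
     * sleeping between attempts according to the policy.
     */
    static <T> Optional<T> pollUntilReady(Supplier<Optional<T>> fetch, BackoffPolicy policy) {
        for (int attempt = 0; attempt < policy.maxAttempts; attempt++) {
            Optional<T> result = fetch.get();
            if (result.isPresent()) {
                return result;
            }
            if (attempt + 1 < policy.maxAttempts) {
                try {
                    Thread.sleep(policy.delayForAttempt(attempt).toMillis());
                } catch (InterruptedException e) {
                    // Preserve the interrupt flag and give up polling.
                    Thread.currentThread().interrupt();
                    return Optional.empty();
                }
            }
        }
        return Optional.empty();
    }
}
```

With such a policy, a caller would pass its own fetch function (for example, a lambda wrapping whatever call retrieves the next batch of results) together with delays tuned to the workload; a short initial delay keeps latency low for fast queries, while the cap bounds the wait for slow ones.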