Providing only one strategy is fine with me. When the multiplier is set to 1, exponential-delay becomes fixed-delay, so a separate fixed-delay strategy may not be needed.
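
To make the multiplier point concrete, here is a minimal sketch. The function names and parameters (`initial_ms`, `multiplier`, `max_ms`, `attempts`) are hypothetical and chosen only for illustration; this is not Flink's RetryStrategy API or the implementation proposed in the FLIP.

```python
from typing import List


def exponential_delays(initial_ms: int, multiplier: float,
                       max_ms: int, attempts: int) -> List[int]:
    """Return the delay before each retry.

    Each delay is the previous one times `multiplier`, capped at `max_ms`.
    All parameter names here are assumptions made for this sketch.
    """
    delays = []
    delay = float(initial_ms)
    for _ in range(attempts):
        delays.append(int(min(delay, max_ms)))
        delay *= multiplier
    return delays


def fixed_delays(delay_ms: int, attempts: int) -> List[int]:
    """Return the same delay for every retry."""
    return [delay_ms] * attempts


if __name__ == "__main__":
    # With multiplier 1 the exponential schedule never grows.
    print(exponential_delays(100, 1, 1000, 5))
    print(fixed_delays(100, 5))
    # With multiplier 2 the delay doubles until it hits the cap.
    print(exponential_delays(100, 2, 1000, 5))
```

With a multiplier of 1 the two schedules are identical, which is why a single exponential-delay strategy with a configurable multiplier could cover the fixed-delay case.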

Best,
Rui

On Mon, Jan 8, 2024 at 2:17 PM Yong Fang <zjur...@gmail.com> wrote:

> I agree with @Rui that the current configuration for Flink Client is a
> little complex. Can we just provide one strategy with fewer configuration
> items for all scenarios?
>
> Best,
> Fang Yong
>
> On Mon, Jan 8, 2024 at 11:19 AM Rui Fan <1996fan...@gmail.com> wrote:
>
> > Thanks xiangyu for driving this proposal! And sorry for the
> > late reply.
> >
> > Overall looks good to me, I only have some minor questions:
> >
> > 1. Do we need to introduce 3 collect strategies in the first version?
> >
> > Large and comprehensive configuration items will bring
> > additional learning costs and usage costs to users. I tend to
> > provide users with out-of-the-box parameters, and 2 collect
> > strategies may be enough for users.
> >
> > IIUC, there is no big difference between exponential-delay and
> > incremental-delay, especially with the default parameters provided.
> > I wonder whether we could provide a multiplier for the exponential-delay
> > strategy and remove the incremental-delay strategy?
> >
> > Of course, if you think the multiplier option is not needed based on
> > your production experience, it's totally fine for me. Simple is better.
> >
> > 2. Which strategy do you think is best in mass production?
> >
> > I'm working on FLIP-364[1], it's related to the Flink failover restart
> > strategy. IIUC, when one cluster only has a few Flink jobs,
> > fixed-delay is fine. It guarantees minimal latency without too
> > much stress. But if one cluster has too many jobs, fixed-delay
> > may not be stable.
> >
> > Do you think exponential-delay is better than fixed-delay in this
> > scenario? And which strategy is used in your production for now?
> > Would you mind sharing it?
> >
> > Looking forward to your opinion~
> >
> > Best,
> > Rui
> >
> > On Sat, Jan 6, 2024 at 5:54 PM xiangyu feng <xiangyu...@gmail.com> wrote:
> >
> > > Hi all,
> > >
> > > Thanks for the comments.
> > >
> > > If there is no further comment, we will open the voting thread next week.
> > >
> > > Regards,
> > > Xiangyu
> > >
> > > Zhanghao Chen <zhanghao.c...@outlook.com> wrote on Wed, Jan 3, 2024 at 16:46:
> > >
> > > > Thanks for driving this effort on improving the interactive use
> > > > experience of Flink. The proposal overall looks good to me.
> > > >
> > > > Best,
> > > > Zhanghao Chen
> > > > ________________________________
> > > > From: xiangyu feng <xiangyu...@gmail.com>
> > > > Sent: Tuesday, December 26, 2023 16:51
> > > > To: dev@flink.apache.org <dev@flink.apache.org>
> > > > Subject: [Discuss] FLIP-407: Improve Flink Client performance in
> > > > interactive scenarios
> > > >
> > > > Hi devs,
> > > >
> > > > I'm opening this thread to discuss FLIP-407: Improve Flink Client
> > > > performance in interactive scenarios. The POC test results and design
> > > > doc can be found at: FLIP-407
> > > > <https://cwiki.apache.org/confluence/display/FLINK/FLIP-407%3A+Improve+Flink+Client+performance+when+interacting+with+dedicated+Flink+Session+Clusters>.
> > > >
> > > > Currently, Flink Client is mainly designed for one-time interaction
> > > > with the Flink Cluster. All the resources (HTTP connections, threads,
> > > > HA services) and instances (ClusterDescriptor, ClusterClient,
> > > > RestClient) are created and recycled for each interaction. This works
> > > > well when users do not need to interact frequently with the Flink
> > > > Cluster, and it also saves resource usage since resources are recycled
> > > > immediately after each usage.
> > > >
> > > > However, in OLAP or StreamingWarehouse scenarios, users might submit
> > > > interactive jobs to a dedicated Flink Session Cluster very often. In
> > > > this case, we find that short queries that can finish in less than 1s
> > > > in the Flink Cluster still have E2E latency greater than 2s.
> > > > Hence, we propose this FLIP to improve the Flink Client performance in
> > > > this scenario. This could also improve the user experience when using
> > > > session debug mode.
> > > >
> > > > The major change in this FLIP is a newly introduced option,
> > > > *'execution.interactive-client'*. When this option is enabled, Flink
> > > > Client will reuse all the necessary resources to improve interactive
> > > > performance, including: HA Services, HTTP connections, threads and all
> > > > kinds of instances related to a long-running Flink Cluster. The default
> > > > value of this option will be false, in which case Flink Client will
> > > > behave as before.
> > > >
> > > > Also, this FLIP proposes a configurable RetryStrategy for when the
> > > > client fetches results from the Flink Cluster. In interactive
> > > > scenarios, this can save more than 15% of TM CPU usage without
> > > > performance degradation.
> > > >
> > > > Looking forward to your feedback, thanks.
> > > >
> > > > Best regards,
> > > > Xiangyu
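
The reuse behaviour described for 'execution.interactive-client' can be sketched roughly as follows. This is only an illustration of the create-per-interaction versus reuse idea: `FakeClusterClient` and `ClientProvider` are hypothetical names invented for this sketch and do not correspond to Flink classes or to the design in the FLIP.

```python
from typing import Dict


class FakeClusterClient:
    """Hypothetical stand-in for a client that is expensive to create."""

    created = 0  # counts how many clients have been constructed

    def __init__(self, cluster_id: str) -> None:
        FakeClusterClient.created += 1
        self.cluster_id = cluster_id


class ClientProvider:
    """Hands out clients, optionally reusing one per cluster.

    `interactive` mirrors the idea of the proposed option: False means a
    fresh client for every interaction, True means clients are cached.
    """

    def __init__(self, interactive: bool = False) -> None:
        self.interactive = interactive
        self._cache: Dict[str, FakeClusterClient] = {}

    def get(self, cluster_id: str) -> FakeClusterClient:
        if not self.interactive:
            # Default behaviour: create a new client each time.
            return FakeClusterClient(cluster_id)
        client = self._cache.get(cluster_id)
        if client is None:
            # First interaction with this cluster: create and remember it.
            client = FakeClusterClient(cluster_id)
            self._cache[cluster_id] = client
        return client


if __name__ == "__main__":
    provider = ClientProvider(interactive=True)
    first = provider.get("session-1")
    second = provider.get("session-1")
    print(first is second, FakeClusterClient.created)
```

In the non-interactive mode every call pays the construction cost again, which matches the one-time-interaction behaviour described in the proposal; in the interactive mode repeated submissions to the same session cluster share one client.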