Thanks all of you involved in the discussion. I will initiate a vote as soon as possible.
Best regards, Yuepeng Pan Rui Fan <[email protected]> 于2026年1月16日周五 14:52写道: > Thanks everyone for the valuable discussion! > > Sounds make sense to only support rebalance and rescale in this FLIP. > > Best, > Rui > > > On Fri, Jan 16, 2026 at 7:49 AM Rui Fan <[email protected]> wrote: > > > Thanks everyone for the valuable discussion! > > > > > > > > On Fri, Jan 16, 2026 at 7:08 AM yuanfeng hu <[email protected]> > > wrote: > > > >> Thanks for the quick update and the ongoing discussion. > >> > >> I would like to +1 for focusing only on rebalance and rescale and > >> excluding shuffle from the adaptive selection logic for now. > >> > >> My reasoning is that when a user explicitly chooses the shuffle() > >> partitioner, it often implies a requirement for statistical randomness. > >> Introducing adaptive logic based on downstream load would inherently > break > >> this randomness. In certain scenarios, such as Machine Learning (ML) > data > >> preparation or specific statistical sampling tasks, the uniform > randomness > >> provided by shuffle is a functional requirement rather than just a > >> distribution strategy. > >> > >> By only supporting rebalance and rescale, we respect the user's explicit > >> intent for those specific operators while avoiding potential side > effects > >> in workloads that rely on the stochastic nature of shuffling. > >> > >> > >> > >> Best regards, > >> > >> Yuanfeng > >> > >> > >> > 2026年1月16日 11:30,Yuepeng Pan <[email protected]> 写道: > >> > > >> > Thanks Rui Ran and Zhanghao Chen for the comments. > >> > > >> >> Agree with Rui that we should exclude customPartition as it may bound > >> to > >> > the specific downstream subtask depending on the implementation. > >> >> IIUC, the customPartition is bound to the specific downstream > subtask, > >> > right? > >> > > >> > Yes, You're right. Sorry for the dirt from the previous design. I > >> deleted > >> > the related description from the main design. > >> > > >> >> [From Hao]: I'm also hesitated on whether we should cover shuffle as > >> > well. Shuffle partitioner just > >> >> randomly chooses a downstream task, and the idea of adaptivity breaks > >> > the randomness. > >> >> I'm not sure if any one would ever rely on the randomness, looking > >> > forward to community ideas on it. > >> > > >> >> [From Rui]: could you please explain in FLIP how to find the suitable > >> > downstream > >> >> subtask for different partitioners? > >> > > >> > In theory, the adaptive modes of RescalePartitioner and > >> > RebalancePartitioner can share the same logic when searching for the > >> > optimal partition, as shown in [1]. > >> > > >> > Regarding whether to support the adaptive shuffle() partitioning > mode, I > >> > take a neutral stance—either supporting it or not is acceptable. One > of > >> the > >> > main reasons is that we should try to avoid introducing yet another > >> > parameter to control whether adaptivity is enabled for shuffle(), > since > >> we > >> > already have quite a few parameters. Introducing a new parameter only > to > >> > support a small number of scenarios without sufficient user feedback > >> does > >> > not seem necessary. > >> > > >> > If we do decide to support it, the strategy for finding the optimal > >> > partition in adaptive shuffle() would be as follows: perform random > >> > selection for the number of times specified by > >> > taskmanager.network.adaptive-partitioner.max-traverse-size, record the > >> list > >> > of partition candidates, and then choose, as the final downstream task > >> to > >> > write data to, the task whose downstream data partition currently has > >> the > >> > least buffered data. > >> > > >> > Looking forward to your comments. > >> > > >> > [1] > >> > > >> > https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=263425181#FLIP339:SupportAdaptivePartitionSelectionforStreamPartitioner-IntroducetheAdaptiveLoadBasedRecordWriter > >> > > >> > > >> > Best regards, > >> > Yuepeng Pan > >> > > >> > > >> > > >> > > >> > Zhanghao Chen <[email protected]> 于2026年1月16日周五 10:49写道: > >> > > >> >> Thanks Yuepeng and Rui for the discussion. Agree with Rui that we > >> should > >> >> exclude customPartition as it may bound to the specific downstream > >> subtask > >> >> depending on the implementation. I'm also hesitated on whether we > >> should > >> >> cover shuffle as well. Shuffle partitioner just randomly chooses a > >> >> downstream task, and the idea of adaptivity breaks the randomness. > I'm > >> not > >> >> sure if any one would ever rely on the randomness, looking forward to > >> >> community ideas on it. > >> >> > >> >> Best, > >> >> Zhanghao Chen > >> >> ________________________________ > >> >> From: Rui Fan <[email protected]> > >> >> Sent: Thursday, January 15, 2026 16:31 > >> >> To: [email protected] <[email protected]> > >> >> Subject: Re: [DISCUSS] FLIP-339: Support Adaptive Partition Selection > >> for > >> >> StreamPartitioner > >> >> > >> >> Thanks Yuepeng for the update. > >> >> > >> >>> For [(Potential)Data not bound to subtask] scenarios (4 partition > >> >> strategies: *Rebalance, Rescale, Shuffle, customPartition*), > >> >>> adjusting dynamically the subtask of the received data according to > >> the > >> >> processing load of the downstream operator would be > >> >>> good to achieve the effect of peak-shaving and valley-filling, and > >> try to > >> >> ensure the throughput of flink jobs. > >> >> > >> >> IIUC, the customPartition is bound to the specific downstream > subtask, > >> >> right? > >> >> > >> >> Also, could you please explain in FLIP how to find the suitable > >> downstream > >> >> subtask for different partitioners? > >> >> Specifically > >> >> > >> >> Best, > >> >> Rui > >> >> > >> >> On Thu, Jan 15, 2026 at 7:37 AM Yuepeng Pan <[email protected]> > >> >> wrote: > >> >> > >> >>> Hi, devs. > >> >>> > >> >>> FYI about the review comments from Mang Zhang(Thanks): > >> >>> > >> >>>> Hi, Yuepeng, > >> >>>> > >> >>>> Thanks! > >> >>>> > >> >>>> Regarding FLIP-339, the current design is exactly what I originally > >> >>> envisioned. > >> >>>> Whether using Flink JAR or SQL, the cost to users is low, and it is > >> more > >> >>> versatile. > >> >>>> LGTM > >> >>>> > >> >>> > >> >>>> -- > >> >>> > >> >>>> Best regards, > >> >>> > >> >>>> Mang Zhang > >> >>> > >> >>> > >> >>> Best regards, > >> >>> > >> >>> Yuepeng Pan > >> >>> > >> >>> > >> >>> > >> >>> Yuepeng Pan <[email protected]> 于2025年12月16日周二 12:02写道: > >> >>> > >> >>>> Thanks Mang for the review. > >> >>>> > >> >>>> Since the names of the subtitles are somewhat ambiguous and may > lead > >> to > >> >>>> misunderstandings about not preserving the original interface, > >> >>>> I have made some corresponding change[1]. > >> >>>> > >> >>>> Hope this helps. Please take a look if you had the free time. > >> >>>> Thank you. > >> >>>> > >> >>>> [1] > >> >>>> > >> >>> > >> >> > >> > https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=263425181#FLIP339:SupportAdaptivePartitionSelectionforStreamPartitioner-IntroducethenewmethodsatAPIlevel > >> >>>> : > >> >>>> > >> >>>> Best regards, > >> >>>> Yuepeng Pan > >> >>>> > >> >>>> Mang Zhang <[email protected]> 于2025年12月16日周二 10:53写道: > >> >>>> > >> >>>>> hi Yuepeng > >> >>>>> Thank you for continuing to drive this FLIP forward. > >> >>>>> Regarding the changes to the DataStream API in FLIP, I haven't > >> >> observed > >> >>>>> any forward compatibility with the original API. Could you explain > >> the > >> >>>>> rationale behind this design choice? > >> >>>>> Forward-compatible APIs reduce the cost for users to upgrade Flink > >> >>>>> versions. > >> >>>>> > >> >>>>> > >> >>>>> > >> >>>>> > >> >>>>> > >> >>>>> -- > >> >>>>> > >> >>>>> Best regards, > >> >>>>> Mang Zhang > >> >>>>> > >> >>>>> > >> >>>>> > >> >>>>> At 2025-12-15 12:40:00, "Pan Yuepeng" <[email protected]> > >> wrote: > >> >>>>>> Hi devs, > >> >>>>>> > >> >>>>>> I re-sorted out and supplemented the 'FLIP-339[1] Support > Adaptive > >> >>>>>> Partition Selection for StreamPartitioner' based on Flink > JIRA[2]. > >> >>>>>> > >> >>>>>> Flink offers multiple partition strategies, some of which bind > data > >> >> to > >> >>>>>> downstream subtasks, while others do not (e.g., shuffle, rescale, > >> >>>>>> rebalance). > >> >>>>>> For [Data not bound to subtasks] scenarios, overloaded > >> sub-task-nodes > >> >>> may > >> >>>>>> slow down the processing of Flink jobs, leading to backpressure > and > >> >>> data > >> >>>>>> lag. Dynamically adjusting the partition of data to subtasks > based > >> on > >> >>> the > >> >>>>>> processing load of downstream operators helps achieve a > >> peak-shaving > >> >>> and > >> >>>>>> valley-filling effect, thereby striving to maintain the > throughput > >> of > >> >>>>> Flink > >> >>>>>> jobs. > >> >>>>>> > >> >>>>>> The raw discussions could be found in the Flink JIRA[2]. > >> >>>>>> I really appreciate developers involved in the discussion for the > >> >>>>> valuable > >> >>>>>> help and suggestions in advance. > >> >>>>>> > >> >>>>>> Please refer to the FLIP[1] wiki for more details about the > >> proposed > >> >>>>> design > >> >>>>>> and implementation. > >> >>>>>> > >> >>>>>> Welcome any feedback and opinions on this proposal. > >> >>>>>> > >> >>>>>> [1] https://cwiki.apache.org/confluence/x/nYyzDw > >> >>>>>> [2] https://issues.apache.org/jira/browse/FLINK-31655 > >> >>>>>> > >> >>>>>> Best regards, > >> >>>>>> Yuepeng Pan > >> >>>>> > >> >>>> > >> >>> > >> >> > >> > >> >
