Thanks everyone for the valuable discussion!
On Fri, Jan 16, 2026 at 7:08 AM yuanfeng hu <[email protected]> wrote: > Thanks for the quick update and the ongoing discussion. > > I would like to +1 for focusing only on rebalance and rescale and > excluding shuffle from the adaptive selection logic for now. > > My reasoning is that when a user explicitly chooses the shuffle() > partitioner, it often implies a requirement for statistical randomness. > Introducing adaptive logic based on downstream load would inherently break > this randomness. In certain scenarios, such as Machine Learning (ML) data > preparation or specific statistical sampling tasks, the uniform randomness > provided by shuffle is a functional requirement rather than just a > distribution strategy. > > By only supporting rebalance and rescale, we respect the user's explicit > intent for those specific operators while avoiding potential side effects > in workloads that rely on the stochastic nature of shuffling. > > > > Best regards, > > Yuanfeng > > > > 2026年1月16日 11:30,Yuepeng Pan <[email protected]> 写道: > > > > Thanks Rui Ran and Zhanghao Chen for the comments. > > > >> Agree with Rui that we should exclude customPartition as it may bound to > > the specific downstream subtask depending on the implementation. > >> IIUC, the customPartition is bound to the specific downstream subtask, > > right? > > > > Yes, You're right. Sorry for the dirt from the previous design. I deleted > > the related description from the main design. > > > >> [From Hao]: I'm also hesitated on whether we should cover shuffle as > > well. Shuffle partitioner just > >> randomly chooses a downstream task, and the idea of adaptivity breaks > > the randomness. > >> I'm not sure if any one would ever rely on the randomness, looking > > forward to community ideas on it. > > > >> [From Rui]: could you please explain in FLIP how to find the suitable > > downstream > >> subtask for different partitioners? > > > > In theory, the adaptive modes of RescalePartitioner and > > RebalancePartitioner can share the same logic when searching for the > > optimal partition, as shown in [1]. > > > > Regarding whether to support the adaptive shuffle() partitioning mode, I > > take a neutral stance—either supporting it or not is acceptable. One of > the > > main reasons is that we should try to avoid introducing yet another > > parameter to control whether adaptivity is enabled for shuffle(), since > we > > already have quite a few parameters. Introducing a new parameter only to > > support a small number of scenarios without sufficient user feedback does > > not seem necessary. > > > > If we do decide to support it, the strategy for finding the optimal > > partition in adaptive shuffle() would be as follows: perform random > > selection for the number of times specified by > > taskmanager.network.adaptive-partitioner.max-traverse-size, record the > list > > of partition candidates, and then choose, as the final downstream task to > > write data to, the task whose downstream data partition currently has the > > least buffered data. > > > > Looking forward to your comments. > > > > [1] > > > https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=263425181#FLIP339:SupportAdaptivePartitionSelectionforStreamPartitioner-IntroducetheAdaptiveLoadBasedRecordWriter > > > > > > Best regards, > > Yuepeng Pan > > > > > > > > > > Zhanghao Chen <[email protected]> 于2026年1月16日周五 10:49写道: > > > >> Thanks Yuepeng and Rui for the discussion. Agree with Rui that we should > >> exclude customPartition as it may bound to the specific downstream > subtask > >> depending on the implementation. I'm also hesitated on whether we should > >> cover shuffle as well. Shuffle partitioner just randomly chooses a > >> downstream task, and the idea of adaptivity breaks the randomness. I'm > not > >> sure if any one would ever rely on the randomness, looking forward to > >> community ideas on it. > >> > >> Best, > >> Zhanghao Chen > >> ________________________________ > >> From: Rui Fan <[email protected]> > >> Sent: Thursday, January 15, 2026 16:31 > >> To: [email protected] <[email protected]> > >> Subject: Re: [DISCUSS] FLIP-339: Support Adaptive Partition Selection > for > >> StreamPartitioner > >> > >> Thanks Yuepeng for the update. > >> > >>> For [(Potential)Data not bound to subtask] scenarios (4 partition > >> strategies: *Rebalance, Rescale, Shuffle, customPartition*), > >>> adjusting dynamically the subtask of the received data according to the > >> processing load of the downstream operator would be > >>> good to achieve the effect of peak-shaving and valley-filling, and try > to > >> ensure the throughput of flink jobs. > >> > >> IIUC, the customPartition is bound to the specific downstream subtask, > >> right? > >> > >> Also, could you please explain in FLIP how to find the suitable > downstream > >> subtask for different partitioners? > >> Specifically > >> > >> Best, > >> Rui > >> > >> On Thu, Jan 15, 2026 at 7:37 AM Yuepeng Pan <[email protected]> > >> wrote: > >> > >>> Hi, devs. > >>> > >>> FYI about the review comments from Mang Zhang(Thanks): > >>> > >>>> Hi, Yuepeng, > >>>> > >>>> Thanks! > >>>> > >>>> Regarding FLIP-339, the current design is exactly what I originally > >>> envisioned. > >>>> Whether using Flink JAR or SQL, the cost to users is low, and it is > more > >>> versatile. > >>>> LGTM > >>>> > >>> > >>>> -- > >>> > >>>> Best regards, > >>> > >>>> Mang Zhang > >>> > >>> > >>> Best regards, > >>> > >>> Yuepeng Pan > >>> > >>> > >>> > >>> Yuepeng Pan <[email protected]> 于2025年12月16日周二 12:02写道: > >>> > >>>> Thanks Mang for the review. > >>>> > >>>> Since the names of the subtitles are somewhat ambiguous and may lead > to > >>>> misunderstandings about not preserving the original interface, > >>>> I have made some corresponding change[1]. > >>>> > >>>> Hope this helps. Please take a look if you had the free time. > >>>> Thank you. > >>>> > >>>> [1] > >>>> > >>> > >> > https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=263425181#FLIP339:SupportAdaptivePartitionSelectionforStreamPartitioner-IntroducethenewmethodsatAPIlevel > >>>> : > >>>> > >>>> Best regards, > >>>> Yuepeng Pan > >>>> > >>>> Mang Zhang <[email protected]> 于2025年12月16日周二 10:53写道: > >>>> > >>>>> hi Yuepeng > >>>>> Thank you for continuing to drive this FLIP forward. > >>>>> Regarding the changes to the DataStream API in FLIP, I haven't > >> observed > >>>>> any forward compatibility with the original API. Could you explain > the > >>>>> rationale behind this design choice? > >>>>> Forward-compatible APIs reduce the cost for users to upgrade Flink > >>>>> versions. > >>>>> > >>>>> > >>>>> > >>>>> > >>>>> > >>>>> -- > >>>>> > >>>>> Best regards, > >>>>> Mang Zhang > >>>>> > >>>>> > >>>>> > >>>>> At 2025-12-15 12:40:00, "Pan Yuepeng" <[email protected]> > wrote: > >>>>>> Hi devs, > >>>>>> > >>>>>> I re-sorted out and supplemented the 'FLIP-339[1] Support Adaptive > >>>>>> Partition Selection for StreamPartitioner' based on Flink JIRA[2]. > >>>>>> > >>>>>> Flink offers multiple partition strategies, some of which bind data > >> to > >>>>>> downstream subtasks, while others do not (e.g., shuffle, rescale, > >>>>>> rebalance). > >>>>>> For [Data not bound to subtasks] scenarios, overloaded > sub-task-nodes > >>> may > >>>>>> slow down the processing of Flink jobs, leading to backpressure and > >>> data > >>>>>> lag. Dynamically adjusting the partition of data to subtasks based > on > >>> the > >>>>>> processing load of downstream operators helps achieve a peak-shaving > >>> and > >>>>>> valley-filling effect, thereby striving to maintain the throughput > of > >>>>> Flink > >>>>>> jobs. > >>>>>> > >>>>>> The raw discussions could be found in the Flink JIRA[2]. > >>>>>> I really appreciate developers involved in the discussion for the > >>>>> valuable > >>>>>> help and suggestions in advance. > >>>>>> > >>>>>> Please refer to the FLIP[1] wiki for more details about the proposed > >>>>> design > >>>>>> and implementation. > >>>>>> > >>>>>> Welcome any feedback and opinions on this proposal. > >>>>>> > >>>>>> [1] https://cwiki.apache.org/confluence/x/nYyzDw > >>>>>> [2] https://issues.apache.org/jira/browse/FLINK-31655 > >>>>>> > >>>>>> Best regards, > >>>>>> Yuepeng Pan > >>>>> > >>>> > >>> > >> > >
