Thanks everyone for the valuable discussion! Sounds make sense to only support rebalance and rescale in this FLIP.
Best, Rui On Fri, Jan 16, 2026 at 7:49 AM Rui Fan <[email protected]> wrote: > Thanks everyone for the valuable discussion! > > > > On Fri, Jan 16, 2026 at 7:08 AM yuanfeng hu <[email protected]> > wrote: > >> Thanks for the quick update and the ongoing discussion. >> >> I would like to +1 for focusing only on rebalance and rescale and >> excluding shuffle from the adaptive selection logic for now. >> >> My reasoning is that when a user explicitly chooses the shuffle() >> partitioner, it often implies a requirement for statistical randomness. >> Introducing adaptive logic based on downstream load would inherently break >> this randomness. In certain scenarios, such as Machine Learning (ML) data >> preparation or specific statistical sampling tasks, the uniform randomness >> provided by shuffle is a functional requirement rather than just a >> distribution strategy. >> >> By only supporting rebalance and rescale, we respect the user's explicit >> intent for those specific operators while avoiding potential side effects >> in workloads that rely on the stochastic nature of shuffling. >> >> >> >> Best regards, >> >> Yuanfeng >> >> >> > 2026年1月16日 11:30,Yuepeng Pan <[email protected]> 写道: >> > >> > Thanks Rui Ran and Zhanghao Chen for the comments. >> > >> >> Agree with Rui that we should exclude customPartition as it may bound >> to >> > the specific downstream subtask depending on the implementation. >> >> IIUC, the customPartition is bound to the specific downstream subtask, >> > right? >> > >> > Yes, You're right. Sorry for the dirt from the previous design. I >> deleted >> > the related description from the main design. >> > >> >> [From Hao]: I'm also hesitated on whether we should cover shuffle as >> > well. Shuffle partitioner just >> >> randomly chooses a downstream task, and the idea of adaptivity breaks >> > the randomness. >> >> I'm not sure if any one would ever rely on the randomness, looking >> > forward to community ideas on it. >> > >> >> [From Rui]: could you please explain in FLIP how to find the suitable >> > downstream >> >> subtask for different partitioners? >> > >> > In theory, the adaptive modes of RescalePartitioner and >> > RebalancePartitioner can share the same logic when searching for the >> > optimal partition, as shown in [1]. >> > >> > Regarding whether to support the adaptive shuffle() partitioning mode, I >> > take a neutral stance—either supporting it or not is acceptable. One of >> the >> > main reasons is that we should try to avoid introducing yet another >> > parameter to control whether adaptivity is enabled for shuffle(), since >> we >> > already have quite a few parameters. Introducing a new parameter only to >> > support a small number of scenarios without sufficient user feedback >> does >> > not seem necessary. >> > >> > If we do decide to support it, the strategy for finding the optimal >> > partition in adaptive shuffle() would be as follows: perform random >> > selection for the number of times specified by >> > taskmanager.network.adaptive-partitioner.max-traverse-size, record the >> list >> > of partition candidates, and then choose, as the final downstream task >> to >> > write data to, the task whose downstream data partition currently has >> the >> > least buffered data. >> > >> > Looking forward to your comments. >> > >> > [1] >> > >> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=263425181#FLIP339:SupportAdaptivePartitionSelectionforStreamPartitioner-IntroducetheAdaptiveLoadBasedRecordWriter >> > >> > >> > Best regards, >> > Yuepeng Pan >> > >> > >> > >> > >> > Zhanghao Chen <[email protected]> 于2026年1月16日周五 10:49写道: >> > >> >> Thanks Yuepeng and Rui for the discussion. Agree with Rui that we >> should >> >> exclude customPartition as it may bound to the specific downstream >> subtask >> >> depending on the implementation. I'm also hesitated on whether we >> should >> >> cover shuffle as well. Shuffle partitioner just randomly chooses a >> >> downstream task, and the idea of adaptivity breaks the randomness. I'm >> not >> >> sure if any one would ever rely on the randomness, looking forward to >> >> community ideas on it. >> >> >> >> Best, >> >> Zhanghao Chen >> >> ________________________________ >> >> From: Rui Fan <[email protected]> >> >> Sent: Thursday, January 15, 2026 16:31 >> >> To: [email protected] <[email protected]> >> >> Subject: Re: [DISCUSS] FLIP-339: Support Adaptive Partition Selection >> for >> >> StreamPartitioner >> >> >> >> Thanks Yuepeng for the update. >> >> >> >>> For [(Potential)Data not bound to subtask] scenarios (4 partition >> >> strategies: *Rebalance, Rescale, Shuffle, customPartition*), >> >>> adjusting dynamically the subtask of the received data according to >> the >> >> processing load of the downstream operator would be >> >>> good to achieve the effect of peak-shaving and valley-filling, and >> try to >> >> ensure the throughput of flink jobs. >> >> >> >> IIUC, the customPartition is bound to the specific downstream subtask, >> >> right? >> >> >> >> Also, could you please explain in FLIP how to find the suitable >> downstream >> >> subtask for different partitioners? >> >> Specifically >> >> >> >> Best, >> >> Rui >> >> >> >> On Thu, Jan 15, 2026 at 7:37 AM Yuepeng Pan <[email protected]> >> >> wrote: >> >> >> >>> Hi, devs. >> >>> >> >>> FYI about the review comments from Mang Zhang(Thanks): >> >>> >> >>>> Hi, Yuepeng, >> >>>> >> >>>> Thanks! >> >>>> >> >>>> Regarding FLIP-339, the current design is exactly what I originally >> >>> envisioned. >> >>>> Whether using Flink JAR or SQL, the cost to users is low, and it is >> more >> >>> versatile. >> >>>> LGTM >> >>>> >> >>> >> >>>> -- >> >>> >> >>>> Best regards, >> >>> >> >>>> Mang Zhang >> >>> >> >>> >> >>> Best regards, >> >>> >> >>> Yuepeng Pan >> >>> >> >>> >> >>> >> >>> Yuepeng Pan <[email protected]> 于2025年12月16日周二 12:02写道: >> >>> >> >>>> Thanks Mang for the review. >> >>>> >> >>>> Since the names of the subtitles are somewhat ambiguous and may lead >> to >> >>>> misunderstandings about not preserving the original interface, >> >>>> I have made some corresponding change[1]. >> >>>> >> >>>> Hope this helps. Please take a look if you had the free time. >> >>>> Thank you. >> >>>> >> >>>> [1] >> >>>> >> >>> >> >> >> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=263425181#FLIP339:SupportAdaptivePartitionSelectionforStreamPartitioner-IntroducethenewmethodsatAPIlevel >> >>>> : >> >>>> >> >>>> Best regards, >> >>>> Yuepeng Pan >> >>>> >> >>>> Mang Zhang <[email protected]> 于2025年12月16日周二 10:53写道: >> >>>> >> >>>>> hi Yuepeng >> >>>>> Thank you for continuing to drive this FLIP forward. >> >>>>> Regarding the changes to the DataStream API in FLIP, I haven't >> >> observed >> >>>>> any forward compatibility with the original API. Could you explain >> the >> >>>>> rationale behind this design choice? >> >>>>> Forward-compatible APIs reduce the cost for users to upgrade Flink >> >>>>> versions. >> >>>>> >> >>>>> >> >>>>> >> >>>>> >> >>>>> >> >>>>> -- >> >>>>> >> >>>>> Best regards, >> >>>>> Mang Zhang >> >>>>> >> >>>>> >> >>>>> >> >>>>> At 2025-12-15 12:40:00, "Pan Yuepeng" <[email protected]> >> wrote: >> >>>>>> Hi devs, >> >>>>>> >> >>>>>> I re-sorted out and supplemented the 'FLIP-339[1] Support Adaptive >> >>>>>> Partition Selection for StreamPartitioner' based on Flink JIRA[2]. >> >>>>>> >> >>>>>> Flink offers multiple partition strategies, some of which bind data >> >> to >> >>>>>> downstream subtasks, while others do not (e.g., shuffle, rescale, >> >>>>>> rebalance). >> >>>>>> For [Data not bound to subtasks] scenarios, overloaded >> sub-task-nodes >> >>> may >> >>>>>> slow down the processing of Flink jobs, leading to backpressure and >> >>> data >> >>>>>> lag. Dynamically adjusting the partition of data to subtasks based >> on >> >>> the >> >>>>>> processing load of downstream operators helps achieve a >> peak-shaving >> >>> and >> >>>>>> valley-filling effect, thereby striving to maintain the throughput >> of >> >>>>> Flink >> >>>>>> jobs. >> >>>>>> >> >>>>>> The raw discussions could be found in the Flink JIRA[2]. >> >>>>>> I really appreciate developers involved in the discussion for the >> >>>>> valuable >> >>>>>> help and suggestions in advance. >> >>>>>> >> >>>>>> Please refer to the FLIP[1] wiki for more details about the >> proposed >> >>>>> design >> >>>>>> and implementation. >> >>>>>> >> >>>>>> Welcome any feedback and opinions on this proposal. >> >>>>>> >> >>>>>> [1] https://cwiki.apache.org/confluence/x/nYyzDw >> >>>>>> [2] https://issues.apache.org/jira/browse/FLINK-31655 >> >>>>>> >> >>>>>> Best regards, >> >>>>>> Yuepeng Pan >> >>>>> >> >>>> >> >>> >> >> >> >>
