Re: [DISCUSS] FLIP-339: Support Adaptive Partition Selection for StreamPartitioner

Rui Fan Thu, 15 Jan 2026 22:50:08 -0800

Thanks everyone for the valuable discussion!



On Fri, Jan 16, 2026 at 7:08 AM yuanfeng hu <[email protected]> wrote:

> Thanks for the quick update and the ongoing discussion.
>
> I would like to +1 for focusing only on rebalance and rescale and
> excluding shuffle from the adaptive selection logic for now.
>
> My reasoning is that when a user explicitly chooses the shuffle()
> partitioner, it often implies a requirement for statistical randomness.
> Introducing adaptive logic based on downstream load would inherently break
> this randomness. In certain scenarios, such as Machine Learning (ML) data
> preparation or specific statistical sampling tasks, the uniform randomness
> provided by shuffle is a functional requirement rather than just a
> distribution strategy.
>
> By only supporting rebalance and rescale, we respect the user's explicit
> intent for those specific operators while avoiding potential side effects
> in workloads that rely on the stochastic nature of shuffling.
>
>
>
> Best regards,
>
> Yuanfeng
>
>
> > 2026年1月16日 11:30，Yuepeng Pan <[email protected]> 写道：
> >
> > Thanks Rui Ran and Zhanghao Chen for the comments.
> >
> >> Agree with Rui that we should exclude customPartition as it may bound to
> > the specific downstream subtask depending on the implementation.
> >> IIUC, the customPartition is bound to the specific downstream subtask,
> > right?
> >
> > Yes, You're right. Sorry for the dirt from the previous design. I deleted
> > the related description from the main design.
> >
> >> [From Hao]:  I'm also hesitated on whether we should cover shuffle as
> > well. Shuffle partitioner just
> >> randomly chooses a downstream task, and the idea of adaptivity breaks
> > the randomness.
> >> I'm not sure if any one would ever rely on the randomness, looking
> > forward to community ideas on it.
> >
> >> [From Rui]: could you please explain in FLIP how to find the suitable
> > downstream
> >> subtask for different partitioners?
> >
> > In theory, the adaptive modes of RescalePartitioner and
> > RebalancePartitioner can share the same logic when searching for the
> > optimal partition, as shown in [1].
> >
> > Regarding whether to support the adaptive shuffle() partitioning mode, I
> > take a neutral stance—either supporting it or not is acceptable. One of
> the
> > main reasons is that we should try to avoid introducing yet another
> > parameter to control whether adaptivity is enabled for shuffle(), since
> we
> > already have quite a few parameters. Introducing a new parameter only to
> > support a small number of scenarios without sufficient user feedback does
> > not seem necessary.
> >
> > If we do decide to support it, the strategy for finding the optimal
> > partition in adaptive shuffle() would be as follows: perform random
> > selection for the number of times specified by
> > taskmanager.network.adaptive-partitioner.max-traverse-size, record the
> list
> > of partition candidates, and then choose, as the final downstream task to
> > write data to, the task whose downstream data partition currently has the
> > least buffered data.
> >
> > Looking forward to your comments.
> >
> > [1]
> >
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=263425181#FLIP339:SupportAdaptivePartitionSelectionforStreamPartitioner-IntroducetheAdaptiveLoadBasedRecordWriter
> >
> >
> > Best regards,
> > Yuepeng Pan
> >
> >
> >
> >
> > Zhanghao Chen <[email protected]> 于2026年1月16日周五 10:49写道：
> >
> >> Thanks Yuepeng and Rui for the discussion. Agree with Rui that we should
> >> exclude customPartition as it may bound to the specific downstream
> subtask
> >> depending on the implementation. I'm also hesitated on whether we should
> >> cover shuffle as well. Shuffle partitioner just randomly chooses a
> >> downstream task, and the idea of adaptivity breaks the randomness. I'm
> not
> >> sure if any one would ever rely on the randomness, looking forward to
> >> community ideas on it.
> >>
> >> Best,
> >> Zhanghao Chen
> >> ________________________________
> >> From: Rui Fan <[email protected]>
> >> Sent: Thursday, January 15, 2026 16:31
> >> To: [email protected] <[email protected]>
> >> Subject: Re: [DISCUSS] FLIP-339: Support Adaptive Partition Selection
> for
> >> StreamPartitioner
> >>
> >> Thanks Yuepeng for the update.
> >>
> >>> For [(Potential)Data not bound to subtask] scenarios (4 partition
> >> strategies: *Rebalance, Rescale, Shuffle, customPartition*),
> >>> adjusting dynamically the subtask of the received data according to the
> >> processing load of the downstream operator would be
> >>> good to achieve the effect of peak-shaving and valley-filling, and try
> to
> >> ensure the throughput of flink jobs.
> >>
> >> IIUC, the customPartition is bound to the specific downstream subtask,
> >> right?
> >>
> >> Also, could you please explain in FLIP how to find the suitable
> downstream
> >> subtask for different partitioners?
> >> Specifically
> >>
> >> Best,
> >> Rui
> >>
> >> On Thu, Jan 15, 2026 at 7:37 AM Yuepeng Pan <[email protected]>
> >> wrote:
> >>
> >>> Hi, devs.
> >>>
> >>> FYI about the review comments from Mang Zhang(Thanks):
> >>>
> >>>> Hi, Yuepeng,
> >>>>
> >>>> Thanks!
> >>>>
> >>>> Regarding FLIP-339, the current design is exactly what I originally
> >>> envisioned.
> >>>> Whether using Flink JAR or SQL, the cost to users is low, and it is
> more
> >>> versatile.
> >>>> LGTM
> >>>>
> >>>
> >>>> --
> >>>
> >>>> Best regards,
> >>>
> >>>> Mang Zhang
> >>>
> >>>
> >>> Best regards,
> >>>
> >>> Yuepeng Pan
> >>>
> >>>
> >>>
> >>> Yuepeng Pan <[email protected]> 于2025年12月16日周二 12:02写道：
> >>>
> >>>> Thanks Mang for the review.
> >>>>
> >>>> Since the names of the subtitles are somewhat ambiguous and may lead
> to
> >>>> misunderstandings about not preserving the original interface,
> >>>> I have made some corresponding change[1].
> >>>>
> >>>> Hope this helps. Please take a look if you had the free time.
> >>>> Thank you.
> >>>>
> >>>> [1]
> >>>>
> >>>
> >>
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=263425181#FLIP339:SupportAdaptivePartitionSelectionforStreamPartitioner-IntroducethenewmethodsatAPIlevel
> >>>> :
> >>>>
> >>>> Best regards,
> >>>> Yuepeng Pan
> >>>>
> >>>> Mang Zhang <[email protected]> 于2025年12月16日周二 10:53写道：
> >>>>
> >>>>> hi Yuepeng
> >>>>> Thank you for continuing to drive this FLIP forward.
> >>>>> Regarding the changes to the DataStream API in FLIP, I haven't
> >> observed
> >>>>> any forward compatibility with the original API. Could you explain
> the
> >>>>> rationale behind this design choice?
> >>>>> Forward-compatible APIs reduce the cost for users to upgrade Flink
> >>>>> versions.
> >>>>>
> >>>>>
> >>>>>
> >>>>>
> >>>>>
> >>>>> --
> >>>>>
> >>>>> Best regards,
> >>>>> Mang Zhang
> >>>>>
> >>>>>
> >>>>>
> >>>>> At 2025-12-15 12:40:00, "Pan Yuepeng" <[email protected]>
> wrote:
> >>>>>> Hi devs,
> >>>>>>
> >>>>>> I re-sorted out and supplemented the 'FLIP-339[1] Support Adaptive
> >>>>>> Partition Selection for StreamPartitioner' based on Flink JIRA[2].
> >>>>>>
> >>>>>> Flink offers multiple partition strategies, some of which bind data
> >> to
> >>>>>> downstream subtasks, while others do not (e.g., shuffle, rescale,
> >>>>>> rebalance).
> >>>>>> For [Data not bound to subtasks] scenarios, overloaded
> sub-task-nodes
> >>> may
> >>>>>> slow down the processing of Flink jobs, leading to backpressure and
> >>> data
> >>>>>> lag. Dynamically adjusting the partition of data to subtasks based
> on
> >>> the
> >>>>>> processing load of downstream operators helps achieve a peak-shaving
> >>> and
> >>>>>> valley-filling effect, thereby striving to maintain the throughput
> of
> >>>>> Flink
> >>>>>> jobs.
> >>>>>>
> >>>>>> The raw discussions could be found in the Flink JIRA[2].
> >>>>>> I really appreciate developers involved in the discussion for the
> >>>>> valuable
> >>>>>> help and suggestions in advance.
> >>>>>>
> >>>>>> Please refer to the FLIP[1] wiki for more details about the proposed
> >>>>> design
> >>>>>> and implementation.
> >>>>>>
> >>>>>> Welcome any feedback and opinions on this proposal.
> >>>>>>
> >>>>>> [1] https://cwiki.apache.org/confluence/x/nYyzDw
> >>>>>> [2] https://issues.apache.org/jira/browse/FLINK-31655
> >>>>>>
> >>>>>> Best regards,
> >>>>>> Yuepeng Pan
> >>>>>
> >>>>
> >>>
> >>
>
>

Re: [DISCUSS] FLIP-339: Support Adaptive Partition Selection for StreamPartitioner

Reply via email to