Re: [DISCUSS] FLIP-339: Support Adaptive Partition Selection for StreamPartitioner

Yuepeng Pan Thu, 15 Jan 2026 19:33:29 -0800

Thanks Rui Ran and Zhanghao Chen for the comments.

> Agree with Rui that we should exclude customPartition as it may bound to
the specific downstream subtask depending on the implementation.
> IIUC, the customPartition is bound to the specific downstream subtask,
right?


Yes, You're right. Sorry for the dirt from the previous design. I deleted
the related description from the main design.

> [From Hao]:  I'm also hesitated on whether we should cover shuffle as
well. Shuffle partitioner just
>  randomly chooses a downstream task, and the idea of adaptivity breaks
the randomness.
>  I'm not sure if any one would ever rely on the randomness, looking
forward to community ideas on it.

> [From Rui]: could you please explain in FLIP how to find the suitable
downstream
> subtask for different partitioners?

In theory, the adaptive modes of RescalePartitioner and
RebalancePartitioner can share the same logic when searching for the
optimal partition, as shown in [1].

Regarding whether to support the adaptive shuffle() partitioning mode, I
take a neutral stance—either supporting it or not is acceptable. One of the
main reasons is that we should try to avoid introducing yet another
parameter to control whether adaptivity is enabled for shuffle(), since we
already have quite a few parameters. Introducing a new parameter only to
support a small number of scenarios without sufficient user feedback does
not seem necessary.

If we do decide to support it, the strategy for finding the optimal
partition in adaptive shuffle() would be as follows: perform random
selection for the number of times specified by
taskmanager.network.adaptive-partitioner.max-traverse-size, record the list
of partition candidates, and then choose, as the final downstream task to
write data to, the task whose downstream data partition currently has the
least buffered data.

Looking forward to your comments.

[1]
https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=263425181#FLIP339:SupportAdaptivePartitionSelectionforStreamPartitioner-IntroducetheAdaptiveLoadBasedRecordWriter


Best regards,
Yuepeng Pan




Zhanghao Chen <[email protected]> 于2026年1月16日周五 10:49写道：

> Thanks Yuepeng and Rui for the discussion. Agree with Rui that we should
> exclude customPartition as it may bound to the specific downstream subtask
> depending on the implementation. I'm also hesitated on whether we should
> cover shuffle as well. Shuffle partitioner just randomly chooses a
> downstream task, and the idea of adaptivity breaks the randomness. I'm not
> sure if any one would ever rely on the randomness, looking forward to
> community ideas on it.
>
> Best,
> Zhanghao Chen
> ________________________________
> From: Rui Fan <[email protected]>
> Sent: Thursday, January 15, 2026 16:31
> To: [email protected] <[email protected]>
> Subject: Re: [DISCUSS] FLIP-339: Support Adaptive Partition Selection for
> StreamPartitioner
>
> Thanks Yuepeng for the update.
>
> > For [(Potential)Data not bound to subtask] scenarios (4 partition
> strategies: *Rebalance, Rescale, Shuffle, customPartition*),
> > adjusting dynamically the subtask of the received data according to the
> processing load of the downstream operator would be
> > good to achieve the effect of peak-shaving and valley-filling, and try to
> ensure the throughput of flink jobs.
>
> IIUC, the customPartition is bound to the specific downstream subtask,
> right?
>
> Also, could you please explain in FLIP how to find the suitable downstream
> subtask for different partitioners?
> Specifically
>
> Best,
> Rui
>
> On Thu, Jan 15, 2026 at 7:37 AM Yuepeng Pan <[email protected]>
> wrote:
>
> > Hi, devs.
> >
> > FYI about the review comments from Mang Zhang(Thanks):
> >
> > >Hi, Yuepeng,
> > >
> > >Thanks!
> > >
> > >Regarding FLIP-339, the current design is exactly what I originally
> > envisioned.
> > >Whether using Flink JAR or SQL, the cost to users is low, and it is more
> > versatile.
> > >LGTM
> > >
> >
> > >--
> >
> > >Best regards,
> >
> > >Mang Zhang
> >
> >
> > Best regards,
> >
> > Yuepeng Pan
> >
> >
> >
> > Yuepeng Pan <[email protected]> 于2025年12月16日周二 12:02写道：
> >
> > > Thanks Mang for the review.
> > >
> > > Since the names of the subtitles are somewhat ambiguous and may lead to
> > > misunderstandings about not preserving the original interface,
> > > I have made some corresponding change[1].
> > >
> > > Hope this helps. Please take a look if you had the free time.
> > > Thank you.
> > >
> > > [1]
> > >
> >
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=263425181#FLIP339:SupportAdaptivePartitionSelectionforStreamPartitioner-IntroducethenewmethodsatAPIlevel
> > > :
> > >
> > > Best regards,
> > > Yuepeng Pan
> > >
> > > Mang Zhang <[email protected]> 于2025年12月16日周二 10:53写道：
> > >
> > >> hi Yuepeng
> > >> Thank you for continuing to drive this FLIP forward.
> > >> Regarding the changes to the DataStream API in FLIP, I haven't
> observed
> > >> any forward compatibility with the original API. Could you explain the
> > >> rationale behind this design choice?
> > >> Forward-compatible APIs reduce the cost for users to upgrade Flink
> > >> versions.
> > >>
> > >>
> > >>
> > >>
> > >>
> > >> --
> > >>
> > >> Best regards,
> > >> Mang Zhang
> > >>
> > >>
> > >>
> > >> At 2025-12-15 12:40:00, "Pan Yuepeng" <[email protected]> wrote:
> > >> >Hi devs,
> > >> >
> > >> >I re-sorted out and supplemented the 'FLIP-339[1] Support Adaptive
> > >> >Partition Selection for StreamPartitioner' based on Flink JIRA[2].
> > >> >
> > >> >Flink offers multiple partition strategies, some of which bind data
> to
> > >> >downstream subtasks, while others do not (e.g., shuffle, rescale,
> > >> >rebalance).
> > >> >For [Data not bound to subtasks] scenarios, overloaded sub-task-nodes
> > may
> > >> >slow down the processing of Flink jobs, leading to backpressure and
> > data
> > >> >lag. Dynamically adjusting the partition of data to subtasks based on
> > the
> > >> >processing load of downstream operators helps achieve a peak-shaving
> > and
> > >> >valley-filling effect, thereby striving to maintain the throughput of
> > >> Flink
> > >> >jobs.
> > >> >
> > >> >The raw discussions could be found in the Flink JIRA[2].
> > >> >I really appreciate developers involved in the discussion for the
> > >> valuable
> > >> >help and suggestions in advance.
> > >> >
> > >> >Please refer to the FLIP[1] wiki for more details about the proposed
> > >> design
> > >> >and implementation.
> > >> >
> > >> >Welcome any feedback and opinions on this proposal.
> > >> >
> > >> >[1] https://cwiki.apache.org/confluence/x/nYyzDw
> > >> >[2] https://issues.apache.org/jira/browse/FLINK-31655
> > >> >
> > >> >Best regards,
> > >> >Yuepeng Pan
> > >>
> > >
> >
>

Re: [DISCUSS] FLIP-339: Support Adaptive Partition Selection for StreamPartitioner

Reply via email to