Hi Timo,

Thanks for this FLIP. All three improvements address real production needs.
One question on the broadcast state design:
The FLIP describes NOTIFY_STATEFUL_SETS as re-invoking eval() for each
existing stateful set with a set key context. The collect() prohibition is
scoped to broadcast-state-only processing, so I assume collect() is
permitted during the per-key NOTIFY re-evaluation. Is that correct? If so,
it enables a useful pattern: broadcast a rule change and immediately
re-emit corrected results across all existing keys (accepting the cost of
full key iteration).
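
To make the pattern concrete, here is a minimal sketch in plain Java. It
only simulates the idea and does not use the Flink or FLIP-565 API: the
class NotifyReemitSketch, onEvent(), onRuleChange() and the threshold rule
are all hypothetical names, and the "emitted" list stands in for collect().
It also assumes collect() is allowed during the per-key re-evaluation,
which is exactly the open question above.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Plain-Java simulation of the pattern; none of these names are Flink API. */
public class NotifyReemitSketch {

    // Stand-in for broadcast state: one threshold rule shared by all keys.
    private int threshold;

    // Stand-in for per-key state (the "stateful sets"): last value per key.
    private final Map<String, Integer> lastValueByKey = new LinkedHashMap<>();

    // Stand-in for collect(): every emitted result is appended here.
    private final List<String> emitted = new ArrayList<>();

    public NotifyReemitSketch(int initialThreshold) {
        this.threshold = initialThreshold;
    }

    /** Regular keyed input: remember the value, emit under the current rule. */
    public void onEvent(String key, int value) {
        lastValueByKey.put(key, value);
        emit(key, value);
    }

    /** Broadcast input: update the rule, then revisit every existing key. */
    public void onRuleChange(int newThreshold) {
        threshold = newThreshold;
        // Full key iteration: this is the cost the pattern accepts.
        for (Map.Entry<String, Integer> entry : lastValueByKey.entrySet()) {
            emit(entry.getKey(), entry.getValue());
        }
    }

    private void emit(String key, int value) {
        emitted.add(key + "=" + (value >= threshold ? "ALERT" : "OK"));
    }

    public List<String> emitted() {
        return emitted;
    }

    public static void main(String[] args) {
        NotifyReemitSketch ptf = new NotifyReemitSketch(10);
        ptf.onEvent("a", 5);
        ptf.onEvent("b", 12);
        ptf.onRuleChange(4); // re-emits corrected results for "a" and "b"
        System.out.println(ptf.emitted());
    }
}
```

In this sketch the rule change causes one extra emission per existing key,
so downstream consumers would need to treat the re-emitted rows as updates
to the earlier results.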

Best,
Natea

On Fri, Mar 6, 2026 at 1:13 AM Timo Walther <[email protected]> wrote:

> Hi everyone,
>
> if there are no objections, I would start a VOTE on Monday.
>
> Thanks,
> Timo
>
> On 05.03.26 10:02, Gustavo de Morais wrote:
> > Hi Timo,
> >
> > Thank you for proposing these improvements. All address real pain
> > points, so +1. It's especially good to see BROADCAST_SEMANTIC_TABLE.
> > This unlocks a set of use cases involving small lookup tables that can
> > be considerably optimized. I'm also +1 on supporting ORDER BY instead
> > of an additional argument trait.
> >
> > Thanks for continuing to push PTFs forward - they are becoming really
> > powerful.
> >
> > Kind regards,
> > Gustavo
> >
> > On Wed, 4 Mar 2026 at 16:40, Ryan van Huuksloot via dev <
> > [email protected]> wrote:
> >
> >> That makes sense to me. First make it work; then, make it easy.
> >>
> >> Otherwise the FLIP looks good to me. Some great improvements! Thanks for
> >> putting this together.
> >>
> >> Ryan van Huuksloot
> >> Staff Engineer, Infrastructure | Streaming Platform
> >>
> >>
> >> On Wed, Mar 4, 2026 at 9:22 AM Timo Walther <[email protected]> wrote:
> >>
> >>> Hi Ryan,
> >>>
> >>> thanks for the great feedback. I agree that some parts might still be
> >>> too complex; usability is definitely a continuous effort. For now, the
> >>> main goal of PTFs was to unblock people when something cannot be
> >>> expressed with SQL or would lead to very inefficient query plans. Also,
> >>> they rather target a developer persona: usually, a platform team that
> >>> develops PTFs for SQL personas. In the mid-term, I hope that AI will
> >>> implement most of the PTFs. So exposing engine primitives / building
> >>> blocks for AI is crucial.
> >>>
> >>> Maybe we can also offer a SimpleProcessFunction at some point, once we
> >>> know better why and how people use PTFs. Also having more built-in PTFs
> >>> that address the most frequent tasks can be very helpful.
> >>>
> >>> Please continue sharing your experiences: What are frequent tasks? What
> >>> do users want to achieve with PTFs?
> >>>
> >>> Cheers,
> >>> Timo
> >>>
> >>> On 03.03.26 21:09, Ryan van Huuksloot via dev wrote:
> >>>> Hi Timo,
> >>>>
> >>>> Thanks for the FLIP.
> >>>>
> >>>> Internally, we've started using PTFs and are still figuring out how
> >>>> to best leverage them.
> >>>> The improvements you proposed in your FLIP are great.
> >>>> I wanted to mention the priority order for the 3 improvements you've
> >>>> recommended. I would prioritize them in the order you stated, based
> >>>> on our usage. So far I haven't had any broadcast requests but I'm
> >>>> sure they're coming. The late-arriving data handling will be very
> >>>> helpful.
> >>>>
> >>>> My primary concern with PTFs and large state is generally the
> >>>> complexity of the state decisions. Most of our SQL developers won't
> >>>> understand when to use a "[Map][List][Value]View" with a PTF.
> >>>> Specifically this area in the documentation:
> >>>>
> >>>> https://nightlies.apache.org/flink/flink-docs-release-2.2/docs/dev/table/functions/ptfs/#large-state
> >>>>
> >>>> You really need to understand Java concepts to grasp the intricacies
> >>>> of your decisions when choosing a state mechanism. I wonder if we can
> >>>> simplify this decision for engineers who may not be Flink and Java
> >>>> experts. It may not be possible.
> >>>>
> >>>> Ryan van Huuksloot
> >>>> Staff Engineer, Infrastructure | Streaming Platform
> >>>>
> >>>>
> >>>>
> >>>> On Tue, Mar 3, 2026 at 3:47 AM Timo Walther <[email protected]> wrote:
> >>>>
> >>>>> Hi everyone,
> >>>>>
> >>>>> Just bumping this thread again and happy to gather any feedback you
> >>>>> have.
> >>>>>
> >>>>> Thanks,
> >>>>> Timo
> >>>>>
> >>>>> On 16.02.26 09:35, Timo Walther wrote:
> >>>>>> Hi everyone,
> >>>>>>
> >>>>>> the ProcessTableFunction (PTF) feature has been well received by the
> >>>>>> Flink community and its adoption is increasing. Since FLIP-440 [1]
> >>>>>> introduced a lot of new API and new concepts, some design decisions
> >>>>>> need smaller adjustments around late data handling and lazy state
> >>>>>> access.
> >>>>>>
> >>>>>> Also, talking to community members at Current and Flink Forward
> >>>>>> conferences has shown that broadcast state is crucial to bridge the
> >>>>>> gap to DataStream API applications for broadcast joining and
> >>>>>> rule-based logic.
> >>>>>>
> >>>>>> I would like to propose "FLIP-565: Improve ProcessTableFunctions for
> >>>>>> late data handling and state access" [2].
> >>>>>>
> >>>>>> This FLIP proposes 3 important PTF improvements:
> >>>>>>
> >>>>>> 1) Don’t drop late data in ProcessFunction as data-loss is usually
> >>>>>> not intended; similar to DataStream API’s ProcessFunction
> >>>>>>
> >>>>>> 2) Introduce ValueView to enable a “supplier”-pattern for state
> >>>>>> access; similar to MapView and ListView
> >>>>>>
> >>>>>> 3) Introduce BROADCAST_SEMANTIC_TABLE as a new kind of argument to
> >>>>>> PTFs
> >>>>>>
> >>>>>> Regarding forward compatibility, all proposed items can be made
> >>>>>> available in batch mode eventually for a unified experience. From my
> >>>>>> point of view, these remaining adjustments should make PTF fully
> >>>>>> production ready. I don't expect any major additions in the
> >>>>>> mid-term.
> >>>>>>
> >>>>>> Looking forward to your feedback.
> >>>>>>
> >>>>>> Thanks,
> >>>>>> Timo
> >>>>>>
> >>>>>> [1] https://cwiki.apache.org/confluence/x/pQnPEQ
> >>>>>> [2] https://cwiki.apache.org/confluence/x/qIo8G
> >>>>>>
> >>>>>
> >>>>>
> >>>>
> >>>
> >>>
> >>
> >
>
>
