Hi Krzysztof,

I think you and Caizhi are using "stateful" in slightly different senses.
SQL UDFs can be used statefully; see AggregateFunction and
TableAggregateFunction for examples. You even have access to ListView and
MapView, which are backed by ListState and MapState respectively. The
aggregate state these functions hold does participate in checkpointing and
is strongly consistent.
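
To make that concrete, here is a minimal sketch of a stateful aggregate
function. The class name DistinctCount and its accumulator fields are made
up for illustration, and I have not run this against a specific Flink
release; it assumes a version where MapView has a public no-arg constructor
(1.13 or later) and flink-table-common on the classpath.

```java
import org.apache.flink.table.api.dataview.MapView;
import org.apache.flink.table.functions.AggregateFunction;

/** Hypothetical example: counts the distinct non-null strings seen per group. */
public class DistinctCount extends AggregateFunction<Long, DistinctCount.Acc> {

    /** Accumulator type; Flink keeps it as managed state for each group. */
    public static class Acc {
        // In a streaming job a MapView can be backed by MapState,
        // so the set of seen values does not have to fit in one heap object.
        public MapView<String, Boolean> seen = new MapView<>();
        public long count = 0L;
    }

    @Override
    public Acc createAccumulator() {
        return new Acc();
    }

    // Invoked once per input row. Flink finds this method by name and
    // signature, so there is no @Override here.
    public void accumulate(Acc acc, String value) throws Exception {
        if (value != null && !acc.seen.contains(value)) {
            acc.seen.put(value, true);
            acc.count++;
        }
    }

    @Override
    public Long getValue(Acc acc) {
        return acc.count;
    }
}
```

You would register it with something like
tEnv.createTemporarySystemFunction("DISTINCT_COUNT", DistinctCount.class)
and then call it in a GROUP BY query; the function name is again just a
placeholder.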

What is not supported are lower-level, ProcessFunction-style operations
(custom state registration, user access to timers, and broadcast state, as
you've discovered). There have been some discussions about how to add this
sort of functionality in a SQL-compliant manner, but nothing concrete yet.

In the meantime, Flink SQL has strong interop with the DataStream API. You
can always convert a Table into a DataStream, do some low-level
processing, and then convert it back into a Table to run further SQL.
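
A rough sketch of that round trip is below. The class name RoundTrip, the
view name "processed", and the inline VALUES data are placeholders I made
up, and the map step only stands in for whatever low-level logic you need.
It assumes Flink 1.13 or later (for toDataStream/fromDataStream) with the
table bridge, planner, and client modules on the classpath.

```java
import java.util.ArrayList;
import java.util.List;

import org.apache.flink.api.common.typeinfo.Types;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.table.api.Table;
import org.apache.flink.table.api.bridge.java.StreamTableEnvironment;
import org.apache.flink.types.Row;
import org.apache.flink.util.CloseableIterator;

public class RoundTrip {

    static List<Row> run() throws Exception {
        StreamExecutionEnvironment env =
                StreamExecutionEnvironment.getExecutionEnvironment();
        StreamTableEnvironment tEnv = StreamTableEnvironment.create(env);

        // 1. Start in SQL (placeholder data instead of a real source table).
        Table input = tEnv.sqlQuery(
                "SELECT * FROM (VALUES ('a', 1), ('b', 2)) AS t(name, score)");

        // 2. Drop down to the DataStream API. A plain map is used here; this
        //    is where a KeyedProcessFunction, or a connect() with a broadcast
        //    control stream, would go instead.
        DataStream<Row> stream = tEnv.toDataStream(input);
        DataStream<Row> processed = stream
                .map(row -> Row.of(row.getField(0), ((Integer) row.getField(1)) * 10))
                // Lambdas lose generic type info, so declare the row type explicitly.
                .returns(Types.ROW_NAMED(
                        new String[] {"name", "score"}, Types.STRING, Types.INT));

        // 3. Go back to a Table and continue in SQL.
        tEnv.createTemporaryView("processed", tEnv.fromDataStream(processed));
        List<Row> result = new ArrayList<>();
        try (CloseableIterator<Row> it = tEnv
                .executeSql("SELECT name, score FROM processed WHERE score > 10")
                .collect()) {
            it.forEachRemaining(result::add);
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        run().forEach(System.out::println);
    }
}
```

For your use case, step 2 is where the Broadcast State Pattern would live:
the control stream is broadcast and connected to the stream that came out
of SQL, and the result goes back into a Table.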

Seth

On Wed, Dec 15, 2021 at 3:52 AM Krzysztof Chmielewski <
krzysiek.chmielew...@gmail.com> wrote:

> Thank you,
> yes, I was thinking about simply running my own thread in the UDF and
> consuming some queue, something like that.
> Having some background with the DataStream API, I was hoping that I could
> reuse the same mechanisms (like the Broadcast State Pattern or
> CoProcessFunction) in Flink SQL.
> However, it seems there is quite a noticeable gap between what you can do
> and control with SQL compared to the DataStream API.
>
> Regarding UDFs being stateless: I assume you mean that a UDF does not
> participate in the checkpoint mechanism and that I cannot initialize Flink
> state in a UDF, right?
> I'm wondering why it is not possible. It seems like an "obligatory" feature
> for a stateful stream processing platform that supports SQL. At a time when
> there is huge interest in Flink SQL and everyone is talking about it, it is
> surprising that something like state support is not available in SQL UDFs.
>
> Are there any plans, maybe a FLIP, to change it?
>
> Regards,
> Krzysztof Chmielewski
>
> On Wed, Dec 15, 2021 at 02:36, Caizhi Weng <tsreape...@gmail.com> wrote:
>
>> Hi!
>>
>> Currently you can't use broadcast state in a Flink SQL UDF because UDFs
>> are all stateless.
>>
>> However, you mentioned that in your use case you want to control the logic
>> in the UDF with some information. If that is the case, you can just run a
>> thread in your UDF to read that information and change the behavior of the
>> eval method accordingly.
>>
>> On Wed, Dec 15, 2021 at 05:47, Krzysztof Chmielewski
>> <krzysiek.chmielew...@gmail.com> wrote:
>>
>>> Hi,
>>> Is there a way to build a UDF [1] for Flink SQL that can be used with
>>> the Broadcast State Pattern [2]?
>>>
>>> I have a use case where I would like to use a broadcast control stream
>>> to change the logic in a UDF.
>>>
>>> Regards,
>>> Krzysztof Chmielewski
>>>
>>> [1]
>>> https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/functions/udfs/#user-defined-functions
>>> [2]
>>> https://nightlies.apache.org/flink/flink-docs-master/docs/dev/datastream/fault-tolerance/broadcast_state/#the-broadcast-state-pattern
>>>
>>
