Re: A couple of question for Stateful Functions

Dan Pettersson Wed, 02 Sep 2020 07:49:05 -0700

Thanks for your quick reply.

/Dan


Den ons 2 sep. 2020 kl 12:24 skrev Igal Shilman <i...@ververica.com>:

> Hi Dan, let me try to answer your questions:
>
>
>> I guess my question is if one can
>> freely mix Flink core with SF's code with regards to performance,
>> fault-tolerance, and checkpointing?
>
>
> The main limitations at the moment is that, currently SF requires a
> processing time watermark semantics only, event time is not supported as it
> is difficult to reason about completion in the presence of loops.
> Other than that in respect to fault-tolerance and checkpointing StateFun
> is built on top of the DataStream API, so the same gurantines applies to SF
> as-well.
>
>
>> I'm unable to use Protobuf so POJO's is going to be processed. Is there a
>> large performance impact of using Pojos instead of protos in SF?
>
>
> The only supported message data type for SF is Protocol Buffers and I
> would highly recommend to use that,
> One option is to transform your Pojo into a Protobuf just before entering
> SF, and you can convert it back to your original Pojo when exiting from SF.
> If you absolutely have to use something else, you can fall back to kryo[1]
> or provide your own[2] but then schema evaluation of your messages and
> state is not guaranteed anymore, and you
> lose the ability to communicate with remote functions.
>
> Would the appendingBuffer be
>> de-/serialized for each function invocation?
>
>
> The appending buffer supports efficient appends (the buffer is *not*
> deserialized on every function invocation)
> In fact, it is backed by Flink's ListState[3]
>
> [1] -
> https://github.com/apache/flink-statefun/blob/master/statefun-examples/statefun-flink-datastream-example/src/main/java/org/apache/flink/statefun/examples/datastream/Example.java#L62
> [2] -
> https://github.com/apache/flink-statefun/blob/master/statefun-flink/statefun-flink-core/src/main/java/org/apache/flink/statefun/flink/core/message/MessagePayloadSerializer.java
> [3] -
> https://ci.apache.org/projects/flink/flink-docs-stable/dev/stream/state/state.html#using-keyed-state
>
> Thanks,
> Igal
>
> On Wed, Sep 2, 2020 at 10:51 AM Till Rohrmann <trohrm...@apache.org>
> wrote:
>
>> Hi Dan,
>>
>> thanks for reaching out to the Flink community. I'm pulling in Gordon and
>> Igal who will be able to answer your questions.
>>
>> Cheers,
>> Till
>>
>> On Wed, Sep 2, 2020 at 8:22 AM danp <dan.pettersso...@gmail.com> wrote:
>>
>>> Hi,
>>>
>>> Nice to see the progress of Stateful functions.
>>>
>>> I have a few questions that I hope you can reply to.
>>>
>>> My first question is regarding the newly implemented
>>> StatefulFunctionDataStreamBuilder.
>>> Is there anything to pay attention to if one first union a couple of
>>> streams
>>> and performs a sort via a keyBy and a KeyedProcessFunction before
>>> dispatching the messages to via RoutableMessage?
>>> In this sorting I'm using a mapstate and I guess my question is if one
>>> can
>>> freely mix Flink core with SF's code with regards to performance,
>>> fault-tolerance, and checkpointing?
>>>
>>> I'm unable to use Protobuf so POJO's is going to be processed. Is there a
>>> large performance impact of using Pojos instead of protos in SF?
>>>
>>> Also I wonder if there's going to be a severe performance penalty if I
>>> had a
>>> function that was called very often lets say 1000 a second and hold a
>>> PersistentAppendingBuffer with those objects appended for each message?
>>> Then
>>> when the 1001:st message comes or a timetrigger kicks in, I would write
>>> everything to disk and delete the state. Would the appendingBuffer be
>>> de-/serialized for each function invocation?
>>>
>>> If yes, is there any workaround for this so the data is just held in RAM?
>>>
>>> Thanks,
>>>
>>> Regards
>>> Dan
>>>
>>>
>>>
>>> --
>>> Sent from:
>>> http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/
>>>
>>

Re: A couple of question for Stateful Functions

Reply via email to