Hi

According to Tyler Akidau's presentation
https://docs.google.com/presentation/d/13YZy2trPugC8Zr9M8_TfSApSCZBGUDZdzi-WUz95JJw/present
Flink supports late arrivals in window processing. However, I've seen
several questions on the user list about late arrivals, and the answer was
something like "not for all use cases".
Can somebody clarify?

The interesting case for me: I process events by event time and want to
aggregate over tumbling windows. The events come from Kafka and might
arrive late. Currently we define the lateness threshold with a watermark
(e.g. 5 minutes).
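
To make the setup concrete, here is a minimal sketch in plain Python (not
Flink code) of how I think about window assignment and lateness. The
1-minute window size, the constant names and the helper functions are all
assumptions for illustration, and the lateness check is a simplification
of whatever Flink does internally.

```python
WINDOW_MS = 60_000            # assumed 1-minute tumbling windows
MAX_LATENESS_MS = 5 * 60_000  # the 5-minute threshold from our watermark


def window_start(event_ts_ms):
    """Start of the tumbling window that contains this event timestamp."""
    return event_ts_ms - (event_ts_ms % WINDOW_MS)


def watermark(max_seen_ts_ms):
    """Watermark trailing the highest event timestamp seen so far."""
    return max_seen_ts_ms - MAX_LATENESS_MS


def is_late(event_ts_ms, current_watermark_ms):
    """Simplified rule: late once the watermark has reached the window end."""
    return window_start(event_ts_ms) + WINDOW_MS <= current_watermark_ms
```

With these definitions, an event stamped 125_000 ms belongs to the window
starting at 120_000 ms, and it counts as late once events stamped 480_000 ms
or later have been seen.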

After the window triggers, I want to save the aggregated result to some
persistent storage (Redis/HBase) together with the window's start timestamp.

After this grace period, if I understand correctly, a late event is no
longer aggregated into the existing window. Instead, the trigger calls the
aggregation function with only 1 element inside (the late one).

So if my window function saves to persistent storage, it will overwrite the
aggregated result with a new one that contains only 1 element.
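
Here is a toy model in plain Python (not Flink code) of the behavior as I
understand it. The class `OverwritingPipeline`, its fields and the 1-minute
window size are hypothetical, and the dict `store` just stands in for
Redis/HBase. Whether Flink really behaves this way is exactly what I am
asking.

```python
WINDOW_MS = 60_000            # assumed 1-minute tumbling windows
MAX_LATENESS_MS = 5 * 60_000  # 5-minute lateness threshold


def window_start(ts_ms):
    return ts_ms - (ts_ms % WINDOW_MS)


class OverwritingPipeline:
    """Toy model: fired windows are purged, each firing overwrites the store."""

    def __init__(self):
        self.open_counts = {}  # window start -> count of buffered events
        self.store = {}        # stand-in for Redis/HBase, keyed by window start
        self.max_ts = 0

    def on_event(self, ts_ms):
        self.max_ts = max(self.max_ts, ts_ms)
        wm = self.max_ts - MAX_LATENESS_MS
        start = window_start(ts_ms)
        self.open_counts[start] = self.open_counts.get(start, 0) + 1
        # Fire and purge every window whose end the watermark has reached.
        for s in sorted(self.open_counts):
            if s + WINDOW_MS <= wm:
                self.store[s] = self.open_counts.pop(s)  # plain overwrite


pipeline = OverwritingPipeline()
# Three on-time visits in minute 0, one much later event that advances the
# watermark past minute 0, then one late visit that belongs to minute 0.
for ts in [1_000, 2_000, 3_000, 400_000, 4_000]:
    pipeline.on_event(ts)

print(pipeline.store[0])  # 1, although 4 visits fell into that minute
```

The late visit lands in a fresh window for minute 0 that fires immediately,
so the stored count of 3 is replaced by 1.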

What I want to achieve is that a late arrival triggers the window function
with all elements (the late one plus all the others), so that the
aggregated result is complete.
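
A minimal sketch of the behavior I am after, again in plain Python rather
than Flink code. The class `AccumulatingPipeline` and its fields are
hypothetical; the idea is only that the per-window count is kept after the
first firing, so a late event re-emits the full aggregate.

```python
WINDOW_MS = 60_000            # assumed 1-minute tumbling windows
MAX_LATENESS_MS = 5 * 60_000  # 5-minute lateness threshold


def window_start(ts_ms):
    return ts_ms - (ts_ms % WINDOW_MS)


class AccumulatingPipeline:
    """Toy model: window state survives firing, late events re-emit totals."""

    def __init__(self):
        self.counts = {}    # window start -> running count, kept after firing
        self.fired = set()  # windows that have emitted at least once
        self.store = {}     # stand-in for Redis/HBase, keyed by window start
        self.max_ts = 0

    def on_event(self, ts_ms):
        self.max_ts = max(self.max_ts, ts_ms)
        wm = self.max_ts - MAX_LATENESS_MS
        start = window_start(ts_ms)
        self.counts[start] = self.counts.get(start, 0) + 1
        for s, count in self.counts.items():
            passed = s + WINDOW_MS <= wm
            # Emit on the first firing, and again when a late event updates
            # a window that has already fired.
            if passed and (s not in self.fired or s == start):
                self.fired.add(s)
                self.store[s] = count  # overwrite with the complete count


pipeline = AccumulatingPipeline()
for ts in [1_000, 2_000, 3_000, 400_000, 4_000]:
    pipeline.on_event(ts)

print(pipeline.store[0])  # 4: the three on-time visits plus the late one
```

In this sketch the state is never dropped, so it grows without bound. A
real job would need some limit on how long a window's state is kept.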

You can think of a use case like page visit counts per minute, where due to
some problems page visit events might arrive late.

Thanks in advance
