Super cool stuff On Fri, Jun 3, 2016 at 10:55 AM, Kostas Kloudas <k.klou...@data-artisans.com > wrote:
> You are welcome! > > > On Jun 3, 2016, at 4:40 PM, Igor Berman <igor.ber...@gmail.com> wrote: > > thanks Kosta > > On 3 June 2016 at 16:47, Kostas Kloudas <k.klou...@data-artisans.com> > wrote: > >> Hi Igor, >> >> To handle late events in Flink you would have to implement you own custom >> trigger. >> >> To see a relatively more complex example of such a trigger and how to >> implement it, >> you can have a look at this implementation: >> https://github.com/dataArtisans/beam_comp/blob/master/src/main/java/com/dataartisans/beam_comparison/customTriggers/EventTimeTriggerWithEarlyAndLateFiring.java >> >> Which implements the trigger described in this article (before the >> conclusions section) >> http://data-artisans.com/why-apache-beam/ >> >> Thanks, >> Kostas >> >> On Jun 3, 2016, at 2:55 PM, Igor Berman <igor.ber...@gmail.com> wrote: >> >> Hi >> >> according to presentation of Tyler Akidau >> https://docs.google.com/presentation/d/13YZy2trPugC8Zr9M8_TfSApSCZBGUDZdzi-WUz95JJw/present >> Flink >> supports late arrivals for window processing, while I've seen several >> question in the userlist regarding late arrivals and answer was - sort of >> "not for all usecases" >> Can somebody clarify? >> >> The interesting case for me - I have event processing time, while I want >> to aggregate by tumbling window. The events come from kafka and might be >> late. Currently we define lateness threshold with watermark (e.g. 5 mins) >> >> After window triggers I want to save aggregated result at some persistent >> storage(redis/hbase) with start timestamp of window >> >> After this grace period - if I understand correctly - any event won't be >> aggregated into existing window, but rather the trigger will call >> aggregated function with only 1 element inside(the late one) >> >> so if my window method saves into persistent storage - it will override >> aggregated result with new one that has only 1 element inside >> >> what I want to achieve - is that late arrival will trigger window method >> with all elements (late + all other) so that aggregated result will be >> complete >> >> you can think about use case of page visits counts per minute, while due >> to some problems page visit events might arrive late >> >> thanks in advance >> >> >> >> > >