Hello to everybody! Sorry for the delay, but here comes a short introduction for the new features!
We are glad to announce that we've introduced newly developed and high flexible windowing semantics in flink-streaming. Flink-streaming allows now to define windows using policies. There are two types of policies, Trigger- and Eviction-policies. The trigger policy specifies when the reduce function gets executed on the current element buffer (respectively the current window) and the result gets emitted. The eviction policy specifies when elements get removed from the buffer (window size). We provide a wide range of pre-defined policies, such as count-based, time-based, punctuation-based and delta-based. Punctuation allows to specify windows externally by putting flags at some elements in the stream. Delta-policies can apply a distance measure to decide when to trigger or evict. Using the provided set of predefined policies users can already define a huge variate of windows. For example: - Hold elements of the last 5 seconds, while the user defined aggregation/reduce is executed on the windows every second (sliding the window by 1 second): - dataStream.window(Time.of(5, TimeUnit.SECONDS)).every(Time.of(1, TimeUnit.SECONDS)) - Downsample our stream: Take the latest 100 elements of our stream every minute: - dataStream.window(Count.of(100)).every(Time.of(1, TimeUnit.MINUTES)) - Using combinations of multiple different policies, we can go even further: - Every 5 minutes give me the aggregations of the last 1 mio. elements but never include elements older than 10 minutes. - Emit every 5 minutes, but immediately in case the current value differs more than X% from the last one. Additionally, policies can be implemented as UDFs, which gives the user the highest possible flexibility for the specification of windows. The new policy based windowing can also be used for grouped streams. For example: To get the maximal value by key on the last 100 elements we use the following approach: - dataStream.window(Count.of(100)).every(…).groupBy(groupingField).max(field); To create fixed size windows for every key we need to reverse the order of the groupBy call. So to take the max for the last 100 elements in Each group: - dataStream.groupBy(groupingField).window(Count.of(100)).every(…).max(field); This will create separate windows for different keys and apply the trigger and eviction policies on a per group basis. We extended the streaming guide[1] with a documentation about the new features. Additionally you can find several examples in the windowing package[2]. Kind Regards Jonas Traub, Gyula Fóra, Paris Carbone [1] https://github.com/apache/incubator-flink/blob/master/docs/streaming_guide.md [2] https://github.com/apache/incubator-flink/tree/master/flink-addons/flink-streaming/flink-streaming-examples/src/main/java/org/apache/flink/streaming/examples/windowing 2014-12-06 0:56 GMT+01:00 Henry Saputra <henry.sapu...@gmail.com>: > We actually need to bring these work come to light early so we could > give feedback before getting to deep. > > Marton, could you encourage Paris and Jonas to publish or share their > thoughts about the work they are doing? > > Would love to give some feedbacks. > > - Henry > > On Fri, Dec 5, 2014 at 12:38 PM, Kostas Tzoumas <ktzou...@apache.org> > wrote: > > Super! > > > > Paris and Jonas have been doing a lot of "silent" work (= not very > visible > > in the mailing list) recently. Looking forward to seeing this coming in. > > > > On Fri, Dec 5, 2014 at 8:00 PM, Henry Saputra <henry.sapu...@gmail.com> > > wrote: > > > >> Nice! Looking forward to it > >> > >> On Fri, Dec 5, 2014 at 9:34 AM, Márton Balassi < > balassi.mar...@gmail.com> > >> wrote: > >> > Hey, > >> > > >> > As you may know in some European countries Santa Claus comes on the > 6th > >> of > >> > December, Saint Nicolas Day - from Finland. > >> > > >> > As Flink is not yet very active in Finland our Sweden team stepped in > and > >> > gave us a big present with a huge rework on the windowing semantics - > >> this > >> > is the implementation of the ideas that were posed the Stockholm > >> hackathon > >> > in October. > >> > > >> > Big thanks to Jonas, Paris and Gyula. Jonas is posting a detailed > intro > >> to > >> > the new features here soon. > >> > > >> > Marton > >> >