Hello everybody!

Sorry for the delay, but here comes a short introduction to the new
features!

We are glad to announce that we've introduced newly developed, highly
flexible windowing semantics in flink-streaming.

Flink-streaming now allows you to define windows using policies. There are
two types of policies: trigger policies and eviction policies. The trigger
policy specifies when the reduce function is executed on the current element
buffer (that is, the current window) and the result is emitted. The eviction
policy specifies when elements are removed from the buffer, and so determines
the window size.
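
To make the interplay of the two policy types concrete, here is a minimal
plain-Java sketch. It does not use the flink-streaming classes: the names
TriggerPolicy, EvictionPolicy, countTrigger and countEviction are
hypothetical, and the order of operations (evict, then buffer the element,
then check the trigger) is an assumption made for illustration only.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.function.BinaryOperator;

// Illustrative sketch only; none of these types are part of flink-streaming.
class PolicyWindowSketch {

    /** Decides, for each arriving element, whether the reduce result is emitted. */
    interface TriggerPolicy<T> {
        boolean shouldTrigger(T element);
    }

    /** Decides, for each arriving element, how many of the oldest elements to drop. */
    interface EvictionPolicy<T> {
        int elementsToEvict(T element, int bufferSize);
    }

    /** Runs the input through a buffer governed by one trigger and one eviction policy. */
    static <T> List<T> run(List<T> input, TriggerPolicy<T> trigger,
                           EvictionPolicy<T> eviction, BinaryOperator<T> reducer) {
        Deque<T> buffer = new ArrayDeque<>();
        List<T> emitted = new ArrayList<>();
        for (T element : input) {
            // Assumed order: evict first, then buffer the element, then check the trigger.
            int evict = eviction.elementsToEvict(element, buffer.size());
            for (int i = 0; i < evict && !buffer.isEmpty(); i++) {
                buffer.removeFirst();
            }
            buffer.addLast(element);
            if (trigger.shouldTrigger(element)) {
                buffer.stream().reduce(reducer).ifPresent(emitted::add);
            }
        }
        return emitted;
    }

    /** Count-based trigger: fires on every n-th element. */
    static <T> TriggerPolicy<T> countTrigger(int n) {
        int[] seen = {0};
        return element -> ++seen[0] % n == 0;
    }

    /** Count-based eviction: keeps at most `size` elements, including the new one. */
    static <T> EvictionPolicy<T> countEviction(int size) {
        return (element, bufferSize) -> Math.max(0, bufferSize + 1 - size);
    }

    public static void main(String[] args) {
        // Sum over the last 3 elements, emitted on every 2nd element.
        List<Integer> sums = run(List.of(1, 2, 3, 4, 5, 6),
                countTrigger(2), countEviction(3), Integer::sum);
        System.out.println(sums); // prints [3, 9, 15]
    }
}
```

The point of the sketch is that the trigger and the eviction policy are
independent of each other: the trigger controls how often results appear,
while the eviction policy controls how much history each result covers.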

We provide a wide range of pre-defined policies, such as count-based,
time-based, punctuation-based and delta-based.
Punctuation policies allow windows to be specified externally by placing
flags on some elements in the stream. Delta policies apply a distance
measure to decide when to trigger or evict.
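
As a rough illustration of the delta idea, the following plain-Java sketch
fires whenever the distance between the current element and the element at
which the trigger last fired exceeds a threshold. The class and method names
are hypothetical, and comparing against the last firing point (rather than,
say, the previous element) is an assumption of this sketch, not a statement
about the flink-streaming implementation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.ToDoubleBiFunction;

// Illustrative sketch only; not the flink-streaming delta policy.
class DeltaTriggerSketch {

    /** Returns the elements at which a delta-based trigger would fire. */
    static <T> List<T> triggerPoints(List<T> input, T initial,
                                     ToDoubleBiFunction<T, T> distance,
                                     double threshold) {
        List<T> fired = new ArrayList<>();
        T reference = initial; // assumed: the reference is the last firing point
        for (T element : input) {
            if (distance.applyAsDouble(reference, element) > threshold) {
                fired.add(element);
                reference = element;
            }
        }
        return fired;
    }

    public static void main(String[] args) {
        // Distance measure: absolute difference between two numbers.
        List<Double> fired = triggerPoints(
                List.of(1.0, 1.5, 3.5, 4.0, 7.0), 1.0,
                (a, b) -> Math.abs(b - a), 2.0);
        System.out.println(fired); // prints [3.5, 7.0]
    }
}
```

Because the distance measure is passed in as a function, the same mechanism
can cover numeric differences, Euclidean distances, or any other
user-defined notion of "far enough".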

Using the provided set of predefined policies, users can already define a
huge variety of windows. For example:

   - Hold the elements of the last 5 seconds, while the user-defined
   aggregation/reduce is executed on the window every second (sliding the
   window by 1 second):
      - dataStream.window(Time.of(5, TimeUnit.SECONDS)).every(Time.of(1, TimeUnit.SECONDS))


   - Downsample our stream: Take the latest 100 elements of our stream
   every minute:
      - dataStream.window(Count.of(100)).every(Time.of(1, TimeUnit.MINUTES))


   - Using combinations of multiple different policies, we can go even
   further:
      - Every 5 minutes, give me the aggregations of the last 1 million
      elements, but never include elements older than 10 minutes.
      - Emit every 5 minutes, but immediately if the current value
      differs by more than X% from the last one.
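
The last combination (emit every 5 minutes, but immediately on a jump of
more than X%) can be sketched in plain Java as two trigger conditions joined
by a logical OR. This is only an illustration under stated assumptions: the
Reading type and fireTimes method are hypothetical, timestamps are carried
explicitly by the elements, "the last one" is read as the previous element,
and the 5-minute timer is assumed to restart after any emission.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only; not the flink-streaming multi-policy implementation.
class CombinedTriggerSketch {

    /** A value with an explicit timestamp in milliseconds (hypothetical type). */
    static final class Reading {
        final long timestampMillis;
        final double value;

        Reading(long timestampMillis, double value) {
            this.timestampMillis = timestampMillis;
            this.value = value;
        }
    }

    /** Returns the timestamps at which the time trigger or the percentage trigger fires. */
    static List<Long> fireTimes(List<Reading> input, long intervalMillis,
                                double maxChangePercent) {
        List<Long> fired = new ArrayList<>();
        if (input.isEmpty()) {
            return fired;
        }
        long lastFire = input.get(0).timestampMillis;
        double lastValue = input.get(0).value;
        for (Reading r : input.subList(1, input.size())) {
            boolean timeElapsed = r.timestampMillis - lastFire >= intervalMillis;
            boolean jumped = Math.abs(r.value - lastValue)
                    > Math.abs(lastValue) * maxChangePercent / 100.0;
            if (timeElapsed || jumped) {
                fired.add(r.timestampMillis);
                lastFire = r.timestampMillis; // assumed: the timer restarts on any emission
            }
            lastValue = r.value;
        }
        return fired;
    }

    public static void main(String[] args) {
        List<Reading> input = List.of(
                new Reading(0, 100), new Reading(60_000, 102),
                new Reading(120_000, 120), new Reading(300_000, 121),
                new Reading(420_000, 122), new Reading(480_000, 123));
        // Every 5 minutes, or immediately on a change of more than 10%.
        System.out.println(fireTimes(input, 300_000, 10.0)); // prints [120000, 420000]
    }
}
```

In the sample run, the first emission is caused by the jump from 102 to 120,
and the second by five minutes passing since that emission.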

Additionally, policies can be implemented as UDFs, which gives the user the
highest possible flexibility for the specification of windows.

The new policy-based windowing can also be used for grouped streams.
For example, to get the maximum value by key over the last 100 elements, we
use the following approach:

   -
   dataStream.window(Count.of(100)).every(…).groupBy(groupingField).max(field);

To create fixed-size windows for every key, we need to move the groupBy
call in front of the window definition. So, to take the max of the last 100
elements in each group:

   -
   dataStream.groupBy(groupingField).window(Count.of(100)).every(…).max(field);

This will create separate windows for different keys and apply the trigger
and eviction policies on a per-group basis.
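
The difference between the two call orders can be illustrated with a small
plain-Java sketch over a finite list. It does not use flink-streaming; the
method names windowThenGroup and groupThenWindow are hypothetical, and only
the count-based eviction is modelled (the result is computed once, at the
end of the input).

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Illustrative sketch only; contrasts one shared window with per-key windows.
class GroupedWindowSketch {

    /** Max per key over the last `size` elements of the whole stream. */
    static Map<String, Integer> windowThenGroup(
            List<Map.Entry<String, Integer>> stream, int size) {
        List<Map.Entry<String, Integer>> window =
                stream.subList(Math.max(0, stream.size() - size), stream.size());
        Map<String, Integer> max = new TreeMap<>();
        for (Map.Entry<String, Integer> e : window) {
            max.merge(e.getKey(), e.getValue(), Integer::max);
        }
        return max;
    }

    /** Max per key over the last `size` elements of each key (size must be at least 1). */
    static Map<String, Integer> groupThenWindow(
            List<Map.Entry<String, Integer>> stream, int size) {
        Map<String, Deque<Integer>> windows = new TreeMap<>();
        for (Map.Entry<String, Integer> e : stream) {
            Deque<Integer> w = windows.computeIfAbsent(e.getKey(), k -> new ArrayDeque<>());
            if (w.size() == size) {
                w.removeFirst(); // evict the oldest element of this key only
            }
            w.addLast(e.getValue());
        }
        Map<String, Integer> max = new TreeMap<>();
        windows.forEach((key, w) -> max.put(key, Collections.max(w)));
        return max;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Integer>> stream = List.of(
                Map.entry("a", 9), Map.entry("b", 5),
                Map.entry("a", 2), Map.entry("a", 3));
        System.out.println(windowThenGroup(stream, 2)); // prints {a=3}
        System.out.println(groupThenWindow(stream, 2)); // prints {a=3, b=5}
    }
}
```

With one shared window, key "b" disappears as soon as its elements are
evicted by newer elements of other keys; with per-key windows, every key
keeps its own last elements.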

We extended the streaming guide[1] with documentation on the new
features. Additionally, you can find several examples in the windowing
package[2].

Kind Regards
Jonas Traub, Gyula Fóra, Paris Carbone

[1]
https://github.com/apache/incubator-flink/blob/master/docs/streaming_guide.md
[2]
https://github.com/apache/incubator-flink/tree/master/flink-addons/flink-streaming/flink-streaming-examples/src/main/java/org/apache/flink/streaming/examples/windowing

2014-12-06 0:56 GMT+01:00 Henry Saputra <henry.sapu...@gmail.com>:

> We actually need to bring this work to light early so we can
> give feedback before getting too deep.
>
> Marton, could you encourage Paris and Jonas to publish or share their
> thoughts about the work they are doing?
>
> Would love to give some feedback.
>
> - Henry
>
> On Fri, Dec 5, 2014 at 12:38 PM, Kostas Tzoumas <ktzou...@apache.org>
> wrote:
> > Super!
> >
> > Paris and Jonas have been doing a lot of "silent" work (= not very visible
> > in the mailing list) recently. Looking forward to seeing this coming in.
> >
> > On Fri, Dec 5, 2014 at 8:00 PM, Henry Saputra <henry.sapu...@gmail.com>
> > wrote:
> >
> >> Nice! Looking forward to it
> >>
> >> On Fri, Dec 5, 2014 at 9:34 AM, Márton Balassi <balassi.mar...@gmail.com>
> >> wrote:
> >> > Hey,
> >> >
> >> > As you may know, in some European countries Santa Claus comes on the 6th of
> >> > December, Saint Nicolas Day - from Finland.
> >> >
> >> > As Flink is not yet very active in Finland, our Sweden team stepped in and
> >> > gave us a big present with a huge rework of the windowing semantics - this
> >> > is the implementation of the ideas that were posed at the Stockholm hackathon
> >> > in October.
> >> >
> >> > Big thanks to Jonas, Paris and Gyula. Jonas is posting a detailed intro to
> >> > the new features here soon.
> >> >
> >> > Marton
> >>
>
