Adding the streaming project to the main repository

Gyula Fóra Fri, 08 Aug 2014 06:54:07 -0700

Hey All,

Quick weekly update on the streaming project:


It was a good week: we implemented a lot of new features and made
considerable progress on the API too. Most notably:

- Cluster performance was measured against Storm on both a simple streaming
wordcount and an iterative algorithm (pagerank), and Flink Streaming was about
4 times faster (partly because of the output buffers).

- The API now supports simple types instead of Tuple1s (so instead of
DataStream<Tuple1<String>> you can use DataStream<String>)

- The API was updated to match the new function interfaces in the main
project, with both the standard and the Rich functions

- The directed emit API has been updated to include SplitDataStreams and a
.select(name) method to direct tuples to named outputs.

- We have added a .groupBy(..) operator to use with .reduce, .batchReduce
and .windowReduce allowing a streaming group-reduce on the whole stream,
sliding batches, and sliding time windows.

- We have also started refactoring our tests to run in much less time to
avoid travis errors :)
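
To give a feel for what the streaming group-reduce on the whole stream means,
here is a rough sketch in plain Java. It does not use the Flink Streaming API
at all: the class StreamingGroupReduceSketch and the helpers groupReduce and
runningWordCount are made-up names for illustration only, and it assumes the
reduce emits an updated aggregate for every incoming element, which may not be
exactly what the real operator does.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BinaryOperator;
import java.util.function.Function;

public class StreamingGroupReduceSketch {

    // Hypothetical helper (not a Flink call): folds each element into the
    // running aggregate kept for its key and records the updated aggregate.
    static <T, K> List<T> groupReduce(Iterable<T> stream,
                                      Function<T, K> keyOf,
                                      BinaryOperator<T> reducer) {
        Map<K, T> state = new HashMap<>();   // one running aggregate per key
        List<T> emitted = new ArrayList<>();
        for (T element : stream) {
            K key = keyOf.apply(element);
            // First element for a key is stored as-is; later ones are reduced.
            T updated = state.merge(key, element, reducer);
            emitted.add(updated);
        }
        return emitted;
    }

    // Running wordcount: group (word, 1) pairs by word and sum the counts.
    static List<String> runningWordCount(List<String> words) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String word : words) {
            pairs.add(new SimpleImmutableEntry<String, Integer>(word, 1));
        }
        List<Map.Entry<String, Integer>> updates = groupReduce(
                pairs,
                Map.Entry::getKey,
                (a, b) -> new SimpleImmutableEntry<String, Integer>(
                        a.getKey(), a.getValue() + b.getValue()));
        List<String> out = new ArrayList<>();
        for (Map.Entry<String, Integer> e : updates) {
            out.add(e.getKey() + "=" + e.getValue());
        }
        return out;
    }

    public static void main(String[] args) {
        // prints [to=1, be=1, or=1, not=1, to=2, be=2]
        System.out.println(runningWordCount(
                Arrays.asList("to", "be", "or", "not", "to", "be")));
    }
}
```

The batch and window variants would presumably keep the same per-key state but
only over the last few elements or the last time interval; that part is left
out here.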

You can check all these changes on git of course:

https://github.com/mbalassi/incubator-flink/commits/streaming-ready


Cheers,
Gyula
