Hey All,

Quick weekly update on the streaming project:
It was a good week: we implemented a lot of new features and made considerable progress on the API too. Most notably:

- We measured cluster performance against Storm on both a simple streaming WordCount and an iterative algorithm (PageRank), and Flink Streaming was about 4 times faster (partly because of the output buffers).
- The API now supports simple types instead of Tuple1s (so instead of DataStream<Tuple1<String>> you can use DataStream<String>).
- The API was updated to match the new function interfaces in the main project, with both the standard and the Rich functions.
- The directed emit API has been updated to include SplitDataStreams and a .select(name) method to direct tuples to named outputs.
- We have added a .groupBy(..) operator to use with .reduce, .batchReduce and .windowReduce, allowing a streaming group-reduce on the whole stream, on sliding batches, and on sliding time windows.
- We have also started refactoring our tests to run in much less time, to avoid Travis errors :)

You can check all these changes on git of course:
https://github.com/mbalassi/incubator-flink/commits/streaming-ready

Cheers,
Gyula