To extend the functionality of Flink, a separate development branch was dedicated to low-latency, distributed stream processing support. Development started in March 2014 and is approaching a state where it could be considered a candidate for becoming part of the main repository.
As of today, a WordCount <https://github.com/stratosphere/stratosphere-streaming/blob/master/src/main/java/eu/stratosphere/streaming/examples/wordcount/WordCountLocal.java#L30-41> example streaming program looks fairly similar to the one that the batch API provides:

    StreamExecutionEnvironment env = new StreamExecutionEnvironment();

    DataStream<Tuple2<String, Integer>> dataStream = env
        .readTextFile("src/test/resources/testdata/hamlet.txt")
        .flatMap(new WordCountSplitter())
        .partitionBy(0)
        .map(new WordCountCounter());

    dataStream.print();
    env.execute();

The user-defined functions extend the same classes as in the batch case (e.g. a FlatMapFunction for a flatMap, see WordCountSplitter <https://github.com/stratosphere/stratosphere-streaming/blob/master/src/main/java/eu/stratosphere/streaming/examples/wordcount/WordCountSplitter.java>), so code can be reused between the two approaches.

As for performance, the 0.1 version <https://github.com/stratosphere/stratosphere-streaming/tree/release-0.1>, released at the beginning of June, performed slightly better on a single core than Apache Storm, one of the major players in the field. Cluster performance needs further optimization. This version provided a lower-level API, fairly similar to the one Storm has. For a deeper dive into that state of the development and the challenges faced, please refer to the slides <http://info.ilab.sztaki.hu/~mbalassi/dw_forum_2014/dw_forum_streaming.pdf> of a talk from the early days of June.

The 0.2 release is coming soon with the new API demonstrated above and improved single-core performance. To complete the release, cluster performance is being measured, and the code is being decomposed into three subprojects separating core, example and addon functionality.

As for the future, fault tolerance is an unresolved issue, and an intern is working on iterative stream processing as part of a Google Summer of Code project.
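As a rough illustration of what the two user-defined steps in the WordCount program above do, here is a minimal plain-Java sketch. It does not use the streaming API at all: the class WordCountSketch, the method split and the nested class Counter are hypothetical stand-ins for WordCountSplitter and WordCountCounter, and the tokenization rule (lowercase, split on non-word characters) is an assumption, not taken from the repository.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Plain-Java sketch of the two word-count steps; no streaming library is used. */
public class WordCountSketch {

    /** Hypothetical stand-in for the flatMap step: one line in, zero or more words out. */
    static List<String> split(String line) {
        List<String> words = new ArrayList<>();
        // Assumed tokenization: lowercase, then split on runs of non-word characters.
        for (String word : line.toLowerCase().split("\\W+")) {
            if (!word.isEmpty()) {
                words.add(word);
            }
        }
        return words;
    }

    /** Hypothetical stand-in for the stateful map step: keeps a running count per word. */
    static class Counter {
        private final Map<String, Integer> counts = new HashMap<>();

        /** Increments and returns the running count for the given word. */
        int count(String word) {
            return counts.merge(word, 1, Integer::sum);
        }
    }

    public static void main(String[] args) {
        Counter counter = new Counter();
        // Each word is printed together with its running count so far.
        for (String word : split("To be, or not to be")) {
            System.out.println(word + " " + counter.count(word));
        }
    }
}
```

A counter that keeps its state locally like this only gives correct totals if every occurrence of a word reaches the same counter instance, which is presumably why the streaming program calls partitionBy(0) on the word field before the map step.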
The project is mainly developed in Budapest by three members employed by the Hungarian Academy of Sciences and Eötvös Loránd University, together with Frank Wu, our Google Summer of Code student from Singapore. This summer the Hungarian Academy of Sciences also dedicated four interns to the project.

The proposed 0.2 release still depends on the 0.5 release of Stratosphere; however, on the branch snapshot-0.6 <https://github.com/stratosphere/stratosphere-streaming/tree/snapshot-0.6> the dependencies are updated to 0.6-snapshot, so the codebase is ready to become part of the main project, preferably as a part of addons until it becomes stable.

Looking forward to your suggestions.

Cheers,
Márton, Gyula & Gábor