Flink Runner - Current State & Roadmap

Maximilian Michels Fri, 12 Feb 2016 11:07:16 -0800

Hi Beamers,

Now that things are getting started and we discuss the technical
vision of Beam, we would like to contribute the Flink runner and start
by sharing some details about the status and the roadmap.


The Flink Runner integrates deeply and naturally with the Dataflow SDK
(the Beam precursor), because the Flink DataStream API shares many
concepts with the Dataflow model.
Based on whether the program input is bounded or unbounded, the
program goes against Flink's DataStream or DataSet API.

A quick preview at some of the nice features of the runner:

  - Support for stream transformations, event time, watermarks
  - The full Dataflow windowing semantics, including fixed/sliding
time windows, and session windows
  - Integration with Flink's streaming sources (Kafka, RabbitMQ, ...)

  - Batch (bounded sources) integrates fully with Flink's managed
memory techniques and out-of-core algorithms, supporting huge joins
and aggregations.
  - Integration with Flink's batch API sources (plain text, CSV, Avro,
JDBC, HBase, ...)

  - Integration with Flink's fault tolerance - both batch and
streaming program recover from failures
  - After upgrading the dependency to Flink 1.0, one could even use
the Flink Savepoints feature (save streaming state for later resuming)
with the Dataflow programs.

Attached you can find the document we drew up with more information
about the current state of the Runner and the roadmap for its upcoming
features:

https://docs.google.com/document/d/1QM_X70VvxWksAQ5C114MoAKb1d9Vzl2dLxEZM4WYogo/edit?usp=sharing

The Runner executes the quasi complete Beam streaming model (well,
Dataflow, actually, because Beam is not there, yet).

Given the current excitement and buzz around Beam, we could add this
runner to the Beam repository and link it as a "preview" for the
people that want to get a feeling of what it will be like to write and
run streaming (unbounded) Beam programs. That would give people
something tangible until the actual Beam code is available.

What do you think?

Best,
Max

Flink Runner - Current State & Roadmap

Reply via email to