Re: Flink Runner - Current State & Roadmap

James Malone Fri, 12 Feb 2016 17:06:32 -0800

I have the signed SGAs from data Artisans and Cloudera. They were both
awesomely quick (thank you!). I am waiting on the signed Google agreement
and then we should be set. :)


On Fri, Feb 12, 2016 at 5:04 PM, Henry Saputra <[email protected]>
wrote:

> I am +1 on adding the Flink runner early as a preview.
>
> Just a small reminder to get the data Artisans software grant for the
> runner contribution ;)
>
> - Henry
>
> On Friday, February 12, 2016, Maximilian Michels <[email protected]> wrote:
>
> > Hi Beamers,
> >
> > Now that things are getting started and we are discussing the technical
> > vision of Beam, we would like to contribute the Flink runner and start
> > by sharing some details about the status and the roadmap.
> >
> > The Flink Runner integrates deeply and naturally with the Dataflow SDK
> > (the Beam precursor), because the Flink DataStream API shares many
> > concepts with the Dataflow model.
> > Depending on whether the program input is bounded or unbounded, the
> > program is translated to Flink's DataSet or DataStream API, respectively.
> >
> > A quick preview of some of the nice features of the runner:
> >
> >   - Support for stream transformations, event time, watermarks
> >   - The full Dataflow windowing semantics, including fixed/sliding
> > time windows, and session windows
> >   - Integration with Flink's streaming sources (Kafka, RabbitMQ, ...)
> >
> >   - Batch (bounded sources) integrates fully with Flink's managed
> > memory techniques and out-of-core algorithms, supporting huge joins
> > and aggregations.
> >   - Integration with Flink's batch API sources (plain text, CSV, Avro,
> > JDBC, HBase, ...)
> >
> >   - Integration with Flink's fault tolerance - both batch and
> > streaming programs recover from failures
> >   - After upgrading the dependency to Flink 1.0, one could even use
> > the Flink Savepoints feature (save streaming state for later resuming)
> > with the Dataflow programs.
> >
> > Attached you can find the document we drew up with more information
> > about the current state of the Runner and the roadmap for its upcoming
> > features:
> >
> > https://docs.google.com/document/d/1QM_X70VvxWksAQ5C114MoAKb1d9Vzl2dLxEZM4WYogo/edit?usp=sharing
> >
> > The Runner executes nearly the complete Beam streaming model (well,
> > Dataflow, actually, because Beam is not there yet).
> >
> > Given the current excitement and buzz around Beam, we could add this
> > runner to the Beam repository and link it as a "preview" for the
> > people that want to get a feeling for what it will be like to write and
> > run streaming (unbounded) Beam programs. That would give people
> > something tangible until the actual Beam code is available.
> >
> > What do you think?
> >
> > Best,
> > Max
> >
>
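
For readers skimming the thread: the quoted message says the runner targets
Flink's DataSet or DataStream API depending on whether the program input is
bounded or unbounded. A minimal sketch of that kind of dispatch follows. It is
not the runner's actual code; every class, enum, and method name is
hypothetical, and the rule that a single unbounded input selects streaming
execution is an assumption made only for illustration.

```java
import java.util.Arrays;
import java.util.List;

/** Illustrative only: hypothetical names, not classes from the Flink Runner. */
final class RunnerDispatchSketch {

    /** Whether an input is finite (bounded) or potentially infinite (unbounded). */
    enum Boundedness { BOUNDED, UNBOUNDED }

    /** The two Flink APIs named in the quoted message. */
    enum FlinkApi { DATASET, DATASTREAM }

    /**
     * Picks a target API for a whole program.
     * Assumption: any unbounded input selects the streaming API;
     * otherwise (including no inputs at all) the batch API is used.
     */
    static FlinkApi chooseApi(List<Boundedness> inputs) {
        for (Boundedness b : inputs) {
            if (b == Boundedness.UNBOUNDED) {
                return FlinkApi.DATASTREAM;
            }
        }
        return FlinkApi.DATASET;
    }

    public static void main(String[] args) {
        // A purely bounded program, e.g. reading plain text files.
        System.out.println(chooseApi(Arrays.asList(Boundedness.BOUNDED)));
        // A program mixing a bounded input with an unbounded one, e.g. Kafka.
        System.out.println(chooseApi(
                Arrays.asList(Boundedness.BOUNDED, Boundedness.UNBOUNDED)));
    }
}
```

The decision is made once per program rather than per transform in this
sketch, which matches the message's wording that "the program" goes against
one API or the other.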
