On Mon, Nov 21, 2016 at 6:27 PM Joey Frazee <joey.fra...@icloud.com> wrote:
> I'm in favor of this for a few reasons:
>
> - There are enough stream processing frameworks out there that it makes it
> hard for us to offer much on that front. I don't think streams fills a gap
> for this internally so we have more to contribute in creating something
> that people can use with Beam.
>
> - It should help make the story clearer to outsiders on "how to run
> streams".
>
> - While it may be immature, as Trevor and Suneel mention, I think they can
> probably do as good a job keeping the interfaces stable as we can in
> maintaining runtimes and interfaces internally. We'll give up some control
> but they'll do a good job too.
>

I am +1 for finding another (de facto) standard for stream processing APIs,
but I also worry about a hard dependency on Beam. It would be nice if there
were another alternative similar to ReactiveStreams[1] but with a more
flexible model.

[1] http://www.reactive-streams.org/

> Now there will for sure be some drawbacks. We'll be beholden to someone
> else and probably have to scramble to stay up to date sometimes. And it's
> naive to think it's ever going to provide for every feature of the
> underlying runner, so we might find ourselves in situations where something
> that should be easy is hard.
>
> -joey
>
> > On Nov 21, 2016, at 3:43 PM, sblackmon <sblack...@apache.org> wrote:
> >
> >> On November 21, 2016 at 2:19:11 PM, Suneel Marthi
> >> (suneel.mar...@gmail.com) wrote:
> >>
> >> I agree too. I have been playing with Beam for a few months now without a
> >> runner and the API is still immature, but nevertheless keep it on the radar
> >> since it's gonna be a TLP soon.
> >>
> >> From the Streams perspective, how do we see the project using Beam (similar
> >> to Spark/Flink now)? If so, we can do a preliminary version of Beam support
> >> with the Local Dataflow runner.
> >>
> >
> > Hypothesis expanded:
> >
> > We could implement all the components in the project (providers,
> > persisters, and processors) directly against Beam APIs (Source, Sink,
> > DoFn, etc.) and support two primary execution models for project
> > capabilities:
> >
> > 1) Direct instantiation of a single instance of a component: call the Beam
> > equivalents of setup, process, and teardown yourself. This is already
> > common throughout the project's unit and integration tests.
> > 2) Compose a Beam Pipeline combining Streams and non-Streams components,
> > and run it with your preferred Beam runner(s).
> >
> > In this scenario I think streams-runtimes would either go away entirely
> > or only contain helper methods (no classes with a static main).
> >
> >> On Mon, Nov 21, 2016 at 3:14 PM, Trevor Grant wrote:
> >>
> >>> IMHO, Beam is too immature and the API is too unstable at this time to
> >>> integrate; however, I am in favor of watching the Beam project develop and
> >>> starting to think through what an integration might look like.
> >>>
> >>> Just my .02, based on some fairly lackluster experiences with Apache Beam.
> >>>
> >>> tg
> >>>
> >>> Trevor Grant
> >>> Data Scientist
> >>> https://github.com/rawkintrevo
> >>> http://stackexchange.com/users/3002022/rawkintrevo
> >>> http://trevorgrant.org
> >>>
> >>> *"Fortunate is he, who is able to know the causes of things." -Virgil*
> >>>
> >>>> On Mon, Nov 21, 2016 at 11:36 AM, sblackmon wrote:
> >>>>
> >>>> Beam appears to be on its way to being the de facto standard for data
> >>>> pipelines.
> >>>>
> >>>> I’d like to start a real discussion about whether and how to align
> >>>> streams interfaces with Beam interfaces.
> >>>>
> >>>> To pose a straw-man theory for discussion:
> >>>>
> >>>> Hypothesis: Streams would benefit by replacing the interfaces in
> >>>> streams-core entirely with Beam interfaces.
> >>>>
> >>>> a) Do we agree that the flexibility and performance gains from doing so,
> >>>> presuming it’s possible, would be significant?
> >>>> b) Are there any inevitable flexibility, performance, complexity, or
> >>>> other blockers or compromises we should discuss?
> >>>> c) What arguments are there for retaining our interfaces and providing
> >>>> Beam compatibility in a runtime module binding (within streams) vs.
> >>>> deprecating our existing interfaces and switching over completely?
> >>>> d) Obviously doing this would be a lot of work. What level of commitment
> >>>> is there from the group to work on this?
> >>>>
> >>>> Steve
> >>>> On October 25, 2016 at 3:47:11 PM, sblackmon (sblack...@apache.org)
> >>>> wrote:
> >>>>
> >>>> Regarding Beam, there have been a number of ideas and theories floated on
> >>>> the list, but nothing concrete has been proposed or discussed in depth.
> >>>>
> >>>> Steve
> >>>> On October 25, 2016 at 10:21:52 AM, Suneel Marthi
> >>>> (suneel.mar...@gmail.com) wrote:
> >>>>
> >>>> Is support for Kafka Streams and Apache Beam on the roadmap?
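
For anyone trying to picture the two execution models in the quoted "Hypothesis expanded" message, here is a rough sketch. It is only an illustration, written in Python with hypothetical stand-in classes: `Component`, `UppercaseProcessor`, `run_directly`, and `run_pipeline` are names made up for this note and are not Beam or Streams APIs. Real code would use Beam's own DoFn and Pipeline types and a real runner.

```python
from typing import Any, Iterable, List


class Component:
    """Hypothetical stand-in for a Beam-style step; NOT the real Beam DoFn."""

    def setup(self) -> None:
        pass

    def process(self, element: Any) -> Iterable[Any]:
        raise NotImplementedError

    def teardown(self) -> None:
        pass


class UppercaseProcessor(Component):
    """Hypothetical processor that upper-cases each string it receives."""

    def __init__(self) -> None:
        self.ready = False

    def setup(self) -> None:
        self.ready = True

    def process(self, element: str) -> Iterable[str]:
        yield element.upper()

    def teardown(self) -> None:
        self.ready = False


def run_directly(component: Component, inputs: Iterable[Any]) -> List[Any]:
    """Model 1: drive setup/process/teardown by hand, as a unit test might."""
    component.setup()
    try:
        outputs: List[Any] = []
        for element in inputs:
            outputs.extend(component.process(element))
        return outputs
    finally:
        # Teardown runs even if process() raises.
        component.teardown()


def run_pipeline(components: List[Component], inputs: Iterable[Any]) -> List[Any]:
    """Model 2 (toy version): chain components in order.

    In the real proposal a Beam runner would do this; here the chaining is
    done locally just to show the shape of the idea.
    """
    data = list(inputs)
    for component in components:
        data = run_directly(component, data)
    return data
```

With these stand-ins, `run_directly(UppercaseProcessor(), ["a", "b"])` returns `["A", "B"]`, and `run_pipeline` applies each component in turn to the output of the previous one. The point of the sketch is that the same component class serves both models, which is why streams-runtimes could shrink to helper methods under this hypothesis.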