Hi All,

Thanks for all the feedback. As suggested, I will start working directly on a portable runner and will update the JIRA as I make progress.

Best Regards,
Pulasthi

On Wed, May 15, 2019 at 8:13 AM Maximilian Michels <[email protected]> wrote:

> +1 Portability is the way forward. If you have to choose between the
> two, go for the portable one. For educational purposes, I'd still
> suggest checking out the "legacy" Runners. Actually, a new Runner could
> implement both Runner styles with most of the code shared between the two.
>
> -Max
>
> On 15.05.19 11:47, Robert Bradshaw wrote:
> > I would strongly suggest new runners adopt the portability runner from
> > the start, which will be more forward compatible and more flexible
> > (e.g. supporting other languages). The primary difference is that
> > rather than wrapping individual DoFns, one wraps a "fused" bundle of
> > DoFns (called an ExecutableStage). As it looks like Twister2 is
> > written in Java, you can take advantage of much of the existing Java
> > libraries that already do this that are shared among the other Java
> > runners.
> >
> > On Tue, May 14, 2019 at 7:55 PM Pulasthi Supun Wickramasinghe
> > <[email protected]> wrote:
> >>
> >> Hi,
> >>
> >> Thanks Kenn and Max for the information. Will read up a little more and
> >> discuss with the Twister2 team before deciding on which route to take. I
> >> also created an issue in BEAM JIRA[1], but I cannot assign this to myself.
> >> Would someone be able to assign the issue to me? Thanks in advance.
> >>
> >> [1] https://issues.apache.org/jira/browse/BEAM-7304
> >>
> >> Best Regards
> >> Pulasthi
> >>
> >> On Tue, May 14, 2019 at 6:19 AM Maximilian Michels <[email protected]> wrote:
> >>>
> >>> Hi Pulasthi,
> >>>
> >>> Great to hear you're planning to implement a Twister2 Runner.
> >>>
> >>> If you have limited time, you probably want to decide whether to build a
> >>> "legacy" Java Runner or a portable one. They are not fundamentally
> >>> different but there are some tricky implementation details for the
> >>> portable Runner related to the asynchronous communication with the SDK
> >>> Harness.
> >>>
> >>> If you have enough time, first implementing a "legacy" Runner might be a
> >>> good way to learn the Beam model and subsequently creating a portable
> >>> Runner should not be hard then.
> >>>
> >>> To get an idea of the differences, check out the Flink source code:
> >>> - FlinkStreamingTransformTranslators (Java "legacy")
> >>> - FlinkStreamingPortablePipelineTranslator (portable)
> >>>
> >>> Feel free to ask questions here or on Slack.
> >>>
> >>> Cheers,
> >>> Max
> >>>
> >>> On 14.05.19 05:11, Kenneth Knowles wrote:
> >>>> Welcome! This is very cool to hear about.
> >>>>
> >>>> A major caveat about https://beam.apache.org/contribute/runner-guide/ is
> >>>> that it was written when Beam's portability framework was more of a
> >>>> sketch. The conceptual descriptions are mostly fine, but the pointers to
> >>>> Java helper code will lead you to build a "legacy" runner when it is
> >>>> better to build a portable runner from the start*.
> >>>>
> >>>> We now have four portable runners in various levels of completeness:
> >>>> Spark, Flink, Samza, and Dataflow. I have added some relevant people to
> >>>> the CC for emphasis. You might also join
> >>>> https://the-asf.slack.com/#beam-portability though I prefer the dev list
> >>>> since it gives visibility to a much greater portion of the community.
> >>>>
> >>>> Kenn
> >>>>
> >>>> *volunteers welcome to update the guide to emphasize portability first
> >>>>
> >>>> *From: *Pulasthi Supun Wickramasinghe <[email protected]>
> >>>> *Date: *Mon, May 13, 2019 at 11:03 AM
> >>>> *To: * <[email protected]>
> >>>>
> >>>>     Hi All,
> >>>>
> >>>>     I am Pulasthi, a Ph.D. student at Indiana University. We are planning
> >>>>     to develop a Beam runner for our project Twister2 [1] [2]. Twister2
> >>>>     is a big data framework which supports both batch and stream
> >>>>     processing.
> >>>>     If you are interested you can find more information on
> >>>>     [2] or read some of our publications [3].
> >>>>
> >>>>     I wanted to share our intent and get some guidance from the Beam
> >>>>     developer community before starting on the project. I was planning
> >>>>     on going through the code for Apache Spark and Apache Flink runners
> >>>>     to get a better understanding of what I need to do. It would be
> >>>>     great if I can get any pointers on how I should approach this
> >>>>     project. I am currently reading through the runner-guide
> >>>>     <https://beam.apache.org/contribute/runner-guide/>.
> >>>>
> >>>>     Finally, I assume that I need to create a JIRA issue to track the
> >>>>     progress of this project, right? I can create the issue but from
> >>>>     what I read from the contribute section I would need some permission
> >>>>     to assign it to myself. I hope someone would be able to help me
> >>>>     with that. Looking forward to working with the Beam community.
> >>>>
> >>>>     [1] https://github.com/DSC-SPIDAL/twister2
> >>>>     [2] https://twister2.gitbook.io/twister2/
> >>>>     [3] https://twister2.gitbook.io/twister2/publications
> >>>>
> >>>>     Best Regards,
> >>>>     Pulasthi
> >>>>     --
> >>>>     Pulasthi S. Wickramasinghe
> >>>>     PhD Candidate | Research Assistant
> >>>>     School of Informatics and Computing | Digital Science Center
> >>>>     Indiana University, Bloomington
> >>>>     cell: 224-386-9035
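
Robert's point in the thread, that a portable runner wraps a "fused" bundle of DoFns rather than individual DoFns, can be illustrated with a small sketch. This is plain Python and not Beam API: `run_unfused`, `fuse`, `run_fused`, and the three example functions are hypothetical names chosen for illustration, and a real portable runner hands the fused stage to the SDK harness instead of calling functions in-process.

```python
from typing import Callable, Iterable, List

# Hypothetical stand-in for a DoFn: one input element -> zero or more outputs.
ElementFn = Callable[[object], Iterable[object]]


def run_unfused(fns: List[ElementFn], bundle: Iterable[object]) -> List[object]:
    """'Legacy'-style sketch: the runner wraps each function as its own step
    and materializes the intermediate collection between steps."""
    current = list(bundle)
    for fn in fns:
        current = [out for element in current for out in fn(element)]
    return current


def fuse(fns: List[ElementFn]) -> ElementFn:
    """Combine a chain of functions into one unit (a stand-in for the idea
    behind an ExecutableStage); the runner only sees the fused function."""
    def fused(element: object) -> Iterable[object]:
        pending = [element]
        for fn in fns:
            pending = [out for item in pending for out in fn(item)]
        return pending
    return fused


def run_fused(stage: ElementFn, bundle: Iterable[object]) -> List[object]:
    """Portable-style sketch: the runner pushes a bundle through one stage."""
    return [out for element in bundle for out in stage(element)]


# Illustrative per-element functions (assumptions, not from the thread).
def split_words(line):
    return line.split()


def drop_short(word):
    return [word] if len(word) > 3 else []


def upper(word):
    return [word.upper()]


chain = [split_words, drop_short, upper]
lines = ["beam on twister2", "portable runner"]

unfused_result = run_unfused(chain, lines)
fused_result = run_fused(fuse(chain), lines)
```

Both paths produce the same elements; the difference is only in what the runner has to wrap, which is why much of the translation code can be shared between the two runner styles, as Max suggests.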
