Hi,

Thanks Kenn and Max for the information. I will read up a little more and
discuss with the Twister2 team before deciding which route to take. I
also created an issue in the Beam JIRA [1], but I cannot assign it to
myself. Would someone be able to assign the issue to me? Thanks in advance.

[1] https://issues.apache.org/jira/browse/BEAM-7304

Best Regards
Pulasthi

On Tue, May 14, 2019 at 6:19 AM Maximilian Michels <m...@apache.org> wrote:

> Hi Pulasthi,
>
> Great to hear you're planning to implement a Twister2 Runner.
>
> If you have limited time, you probably want to decide whether to build a
> "legacy" Java Runner or a portable one. They are not fundamentally
> different but there are some tricky implementation details for the
> portable Runner related to the asynchronous communication with the SDK
> Harness.
>
> If you have enough time, first implementing a "legacy" Runner might be a
> good way to learn the Beam model; creating a portable Runner afterwards
> should then not be hard.
>
> To get an idea of the differences, check out the Flink source code:
> - FlinkStreamingTransformTranslators (Java "legacy")
> - FlinkStreamingPortablePipelineTranslator (portable)
>
> Feel free to ask questions here or on Slack.
>
> Cheers,
> Max
>
> On 14.05.19 05:11, Kenneth Knowles wrote:
> > Welcome! This is very cool to hear about.
> >
> > A major caveat about https://beam.apache.org/contribute/runner-guide/
> > is that it was written when Beam's portability framework was more of a
> > sketch. The conceptual descriptions are mostly fine, but the pointers to
> > Java helper code will lead you to build a "legacy" runner when it is
> > better to build a portable runner from the start*.
> >
> > We now have four portable runners in various levels of completeness:
> > Spark, Flink, Samza, and Dataflow. I have added some relevant people to
> > the CC for emphasis. You might also join
> > https://the-asf.slack.com/#beam-portability though I prefer the dev
> > list since it gives visibility to a much greater portion of the
> > community.
> >
> > Kenn
> >
> > *volunteers welcome to update the guide to emphasize portability first
> >
> > *From:* Pulasthi Supun Wickramasinghe <pulasthi...@gmail.com>
> > *Date:* Mon, May 13, 2019 at 11:03 AM
> > *To:* <dev@beam.apache.org>
> >
> >     Hi All,
> >
> >     I am Pulasthi, a Ph.D. student at Indiana University. We are planning
> >     to develop a Beam runner for our project Twister2 [1] [2]. Twister2
> >     is a big data framework which supports both batch and stream
> >     processing. If you are interested, you can find more information at
> >     [2] or read some of our publications [3].
> >
> >     I wanted to share our intent and get some guidance from the Beam
> >     developer community before starting on the project. I was planning
> >     on going through the code for Apache Spark and Apache Flink runners
> >     to get a better understanding of what I need to do. It would be
> >     great if I can get any pointers on how I should approach this
> >     project. I am currently reading through the runner-guide
> >     <https://beam.apache.org/contribute/runner-guide/>.
> >
> >     Finally, I assume that I need to create a JIRA issue to track the
> >     progress of this project, right? I can create the issue, but from
> >     what I read in the contribute section I would need some permission
> >     to assign it to myself. I hope someone will be able to help me
> >     with that. Looking forward to working with the Beam community.
> >
> >     [1] https://github.com/DSC-SPIDAL/twister2
> >     [2] https://twister2.gitbook.io/twister2/
> >     [3] https://twister2.gitbook.io/twister2/publications
> >
> >     Best Regards,
> >     Pulasthi
> >     --
> >     Pulasthi S. Wickramasinghe
> >     PhD Candidate  | Research Assistant
> >     School of Informatics and Computing | Digital Science Center
> >     Indiana University, Bloomington
> >     cell: 224-386-9035
> >
>


-- 
Pulasthi S. Wickramasinghe
PhD Candidate  | Research Assistant
School of Informatics and Computing | Digital Science Center
Indiana University, Bloomington
cell: 224-386-9035
