+1 Portability is the way forward. If you have to choose between the two, go for the portable one. For educational purposes, I'd still suggest checking out the "legacy" Runners. Actually, a new Runner could implement both Runner styles with most of the code shared between the two.

-Max

On 15.05.19 11:47, Robert Bradshaw wrote:
I would strongly suggest that new runners adopt the portable runner
style from the start, which will be more forward compatible and more
flexible (e.g. supporting other languages). The primary difference is
that rather than wrapping individual DoFns, one wraps a "fused" bundle
of DoFns (called an ExecutableStage). As it looks like Twister2 is
written in Java, you can take advantage of many of the existing Java
libraries that already do this and are shared among the other Java
runners.
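
[Editor's illustration] The difference Robert describes can be sketched in a few lines of plain Java. This is a toy model under loud assumptions: FusedStageSketch, runWrappedIndividually, FusedStage and processBundle are hypothetical names invented for this note, not Beam APIs, and a real ExecutableStage is executed in an SDK harness rather than in-process. The sketch only shows the shape of the two styles: one runner-level operator per function, versus one operator for a whole fused chain.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

/** Toy illustration only; none of these types are Beam APIs. */
public class FusedStageSketch {

    /**
     * "Legacy" style (hypothetical model): the runner wraps each function as
     * its own operator and materializes the data between operators.
     */
    public static List<Integer> runWrappedIndividually(
            List<Function<Integer, Integer>> fns, List<Integer> input) {
        List<Integer> current = input;
        for (Function<Integer, Integer> fn : fns) {
            List<Integer> next = new ArrayList<>();
            for (Integer element : current) {
                next.add(fn.apply(element));
            }
            current = next; // intermediate collection visible to the runner
        }
        return current;
    }

    /**
     * Portable style (hypothetical model): a chain of functions is fused into
     * one stage, and the runner wraps the stage as a single operator.
     */
    public static final class FusedStage {
        private final List<Function<Integer, Integer>> fns;

        public FusedStage(List<Function<Integer, Integer>> fns) {
            this.fns = new ArrayList<>(fns);
        }

        /** Processes one bundle; the runner sees only bundle input and output. */
        public List<Integer> processBundle(List<Integer> bundle) {
            List<Integer> out = new ArrayList<>();
            for (Integer element : bundle) {
                Integer value = element;
                for (Function<Integer, Integer> fn : fns) {
                    value = fn.apply(value); // no runner involvement between steps
                }
                out.add(value);
            }
            return out;
        }
    }

    public static void main(String[] args) {
        List<Function<Integer, Integer>> fns = new ArrayList<>();
        fns.add(x -> x + 1);
        fns.add(x -> x * 2);

        List<Integer> input = new ArrayList<>();
        input.add(1);
        input.add(2);
        input.add(3);

        // Both styles compute the same result; only the wrapping granularity differs.
        System.out.println(runWrappedIndividually(fns, input));
        System.out.println(new FusedStage(fns).processBundle(input));
    }
}
```

In this model both calls produce the same elements. What changes is how many units the runner has to translate and schedule: two in the first style, one in the second.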

On Tue, May 14, 2019 at 7:55 PM Pulasthi Supun Wickramasinghe
<[email protected]> wrote:

Hi,

Thanks Kenn and Max for the information. I will read up a little more and
discuss with the Twister2 team before deciding on which route to take. I also
created an issue in the Beam JIRA [1], but I cannot assign it to myself. Would
someone be able to assign the issue to me? Thanks in advance.

[1] https://issues.apache.org/jira/browse/BEAM-7304

Best Regards
Pulasthi

On Tue, May 14, 2019 at 6:19 AM Maximilian Michels <[email protected]> wrote:

Hi Pulasthi,

Great to hear you're planning to implement a Twister2 Runner.

If you have limited time, you probably want to decide whether to build a
"legacy" Java Runner or a portable one. They are not fundamentally
different but there are some tricky implementation details for the
portable Runner related to the asynchronous communication with the SDK
Harness.

If you have enough time, first implementing a "legacy" Runner might be a
good way to learn the Beam model; creating a portable Runner afterwards
should then not be hard.

To get an idea of the differences, check out the Flink source code:
- FlinkStreamingTransformTranslators (Java "legacy")
- FlinkStreamingPortablePipelineTranslator (portable)

Feel free to ask questions here or on Slack.

Cheers,
Max

On 14.05.19 05:11, Kenneth Knowles wrote:
Welcome! This is very cool to hear about.

A major caveat about https://beam.apache.org/contribute/runner-guide/ is
that it was written when Beam's portability framework was more of a
sketch. The conceptual descriptions are mostly fine, but the pointers to
Java helper code will lead you to build a "legacy" runner when it is
better to build a portable runner from the start*.

We now have four portable runners in various levels of completeness:
Spark, Flink, Samza, and Dataflow. I have added some relevant people to
the CC for emphasis. You might also join
https://the-asf.slack.com/#beam-portability though I prefer the dev list
since it gives visibility to a much greater portion of the community.

Kenn

*volunteers welcome to update the guide to emphasize portability first

From: Pulasthi Supun Wickramasinghe <[email protected]>
Date: Mon, May 13, 2019 at 11:03 AM
To: <[email protected]>

     Hi All,

     I am Pulasthi, a Ph.D. student at Indiana University. We are planning
     to develop a Beam runner for our project Twister2 [1] [2]. Twister2
     is a big data framework which supports both batch and stream
     processing. If you are interested, you can find more information at
     [2] or read some of our publications [3].

     I wanted to share our intent and get some guidance from the Beam
     developer community before starting on the project. I was planning
     on going through the code for Apache Spark and Apache Flink runners
     to get a better understanding of what I need to do. It would be
     great if I can get any pointers on how I should approach this
     project. I am currently reading through the runner-guide
     <https://beam.apache.org/contribute/runner-guide/>.

     Finally, I assume that I need to create a JIRA issue to track the
     progress of this project, right? I can create the issue, but from
     what I read in the contribute section I would need some permission
     to assign it to myself. I hope someone will be able to help me
     with that. Looking forward to working with the Beam community.

     [1] https://github.com/DSC-SPIDAL/twister2
     [2] https://twister2.gitbook.io/twister2/
     [3] https://twister2.gitbook.io/twister2/publications

     Best Regards,
     Pulasthi
     --
     Pulasthi S. Wickramasinghe
     PhD Candidate  | Research Assistant
     School of Informatics and Computing | Digital Science Center
     Indiana University, Bloomington
     cell: 224-386-9035

