Re: Thoughts on a reference runner to invest in?

Kenneth Knowles Mon, 11 Feb 2019 18:53:51 -0800

Interesting silence here. You've got it right that the reason we initially
chose Java was because of the cross-runner sharing. The reference runner
could be the first target runner for any new feature and then its work
could be used directly (or indirectly via copy/paste/modify if it works
better) in other runners. Examples:


 - The implementations of (pre-portability) state & timers in
runners/core-java and prototyped in the Java DirectRunner made it a matter
of a couple of days to implement on other runners, and they saw pretty
quick adoption.
 - Probably the same could be said for the first drafts of the runners,
which re-used a bunch of runners/core-java and had each other's translation
code as a reference.

I'm interested in whether anyone would be willing to confirm that the
silence is because the FlinkRunner has forged ahead and the Dataflow worker
is open source. It
makes sense that the code from a distributed runner is an even better
reference point if you are building another distributed runner. From the
look of it, the SamzaRunner had no trouble getting started on portability.

Kenn

On Mon, Feb 11, 2019 at 6:04 PM Daniel Oliveira <[email protected]>
wrote:

> Yeah, the FnApiRunner is what I'm leaning towards too. I wasn't sure how
> much demand there was for an actual reference implementation in Java
> though, so I was hoping there were runner authors that would want to chime
> in.
>
> On the other hand, the Flink runner could serve as a reference
> implementation for portable features since it's further along, so maybe
> it's not an issue regardless.
>
> On Mon, Feb 11, 2019 at 1:09 PM Sam Rohde <[email protected]> wrote:
>
>> Thanks for starting this thread. If I had to guess, I would say there is
>> more of a demand for Python, as it's more widely used by data scientists
>> and in analytics. Being pragmatic, the FnApiRunner already has more feature
>> work than the Java runner, so we should go with that.
>>
>> -Sam
>>
>> On Fri, Feb 8, 2019 at 10:07 AM Daniel Oliveira <[email protected]>
>> wrote:
>>
>>> Hello Beam dev community,
>>>
>>> For those who don't know me, I work for Google and I've been working on
>>> the Java reference runner, which is a portable, local Java runner (it's
>>> basically the direct runner with the portability APIs implemented). Our
>>> goal in working on this was to have a portable runner which ran locally so
>>> it could be used by users for testing portable pipelines, by devs for testing
>>> new features with portability, and by runner authors as a simple
>>> reference implementation of a portable runner.
>>>
>>> Due to various circumstances though, progress on the Java reference
>>> runner has been pretty slow, and a Python runner which does pretty much the
>>> same things was made to aid portability development in Python (called the
>>> FnApiRunner). This runner is currently further along in feature work than
>>> the Java reference runner, so we've been reevaluating if we should switch
>>> to investing in it instead.
>>>
>>> My question to the community is: Which runner do you think would be more
>>> valuable to the dev community and Beam users? For those of you who are
>>> runner authors, do you have a preference for what language you'd like to
>>> see a reference implementation in?
>>>
>>> Thanks,
>>> Daniel Oliveira
>>>
>>
