Re: Thoughts on a reference runner to invest in?

Kenneth Knowles Thu, 14 Feb 2019 21:06:24 -0800

Interesting point about community and the fact that it didn't build a
Java-based ULR even though it has been a possibility for a long time.


It makes sense to me. A non-Java SDK needs portability to run on Beam's
distributed runners, so building the portable SDK harness is key, unlike
for Java. And to build it, a local portability-based runner is a great help
(can't really imagine doing it without one). And of course building it in
Python makes sense if you are steeped in Python.

Joking-but-not-joking, the best reference runner would probably be in some
less popular but very readable functional language so it is different from
every SDK :-). I've looked into it and discovered that gRPC support is not
great...

Kenn

On Thu, Feb 14, 2019 at 5:47 AM Robert Bradshaw <[email protected]> wrote:

> I think it's good to distinguish between direct runners (which would
> be good to have in every language, and can grow in sophistication with
> the userbase) and a fully universal reference runner. We should of
> course continue to grow and maintain the java-runners-core shared
> library, possibly as driven by the various production runners which
> has been the most productive to date. (The point about community is a
> good one. Unfortunately over the past 1.5 years the bigger Java
> community has not resulted in a more complete Java ULR (in terms of
> number of contributors or features/maturity), and it's unclear what
> would change that in the future.)
>
> It would be really great to have (at least) two completely separate
> implementations, but (at the moment at least) I see that as lower
> value than accelerating the efforts to get existing production runners
> onto portability.
>
> On Thu, Feb 14, 2019 at 2:01 PM Ismaël Mejía <[email protected]> wrote:
> >
> > This is a really interesting and important discussion. Having multiple
> > reference runners can have its pros and cons. It is all about
> > tradeoffs. From the end user point of view it can feel weird to deal
> > with tools and packaging of a different ecosystem, e.g. Python devs
> > dealing with all the quirkiness of Java packaging, or vice versa,
> > Java developers dealing with pip and friends. So having a reference
> > runner per language would be more natural and would also help validate
> > the portability concept; however, having multiple reference runners
> > sounds harder from the maintenance point of view.
> >
> > Most of the software in the domain of Beam has traditionally been
> > written in Java, so there is a BIG advantage in ready-to-use (and
> > mature) libraries and reusable components (the reference runner
> > may also profit from the libraries that Thomas and others in the
> > community have developed for multiple runners). This is a big win, but
> > more importantly, we can have more eyes looking at it and contributing
> > improvements and fixes that will benefit the reference runner and others.
> >
> > Having a reference runner per language would be nice but if we must
> > choose only one language I prefer it to be Java just because we have a
> > bigger community that can contribute and improve it. We may work on
> > making the distribution of such a runner easier or friendlier for
> > users of different languages.
> >
> > On Wed, Feb 13, 2019 at 3:47 AM Robert Bradshaw <[email protected]>
> wrote:
> > >
> > > I agree, it's useful for runners that are used for tests (including
> testing SDKs) to push into the dark corners of what's allowed by the spec.
> I think this can be added (where they don't already exist) to existing
> non-production runners. (Whether a direct runner should be considered
> production or not depends on who you ask...)
> > >
> > > On Wed, Feb 13, 2019 at 2:49 AM Daniel Oliveira <
> [email protected]> wrote:
> > >>
> > >> +1 to Kenn's point. Regardless of whether we go with a Python runner
> or a Java runner, I think we should have at least one portable runner that
> isn't a production runner for the reasons he outlined.
> > >>
> > >> As for the rest of the discussion, it sounds like people are
> generally supportive of having the Python FnApiRunner as that runner, and
> using Flink as a reference implementation for portability in Java.
> > >>
> > >> On Tue, Feb 12, 2019 at 4:37 PM Kenneth Knowles <[email protected]>
> wrote:
> > >>>
> > >>>
> > >>> On Tue, Feb 12, 2019 at 8:59 AM Thomas Weise <[email protected]> wrote:
> > >>>>
> > >>>> The Java ULR initially provided some value for the portability
> effort as Max mentions. It helped to develop the shared library for all
> Java runners and the job server functionality.
> > >>>>
> > >>>> However, I think the same could have been accomplished by
> developing the Flink runner instead of the Java ULR from the get go. This
> is also what happened later last year when support for state, timers and
> metrics was added to the portable Flink runner first and the ULR still does
> not support those features [1].
> > >>>>
> > >>>> Since all (or most) Java based runners that are based on another
> ASF project support embedded execution, I think it might make sense to
> discontinue separate direct runners for Java and instead focus efforts on
> making the runners that folks would also use in production better?
> > >>>
> > >>>
> > >>> Caveat: if people only test using embedded execution of a production
> runner, they are quite likely to depend on quirks of that runner, such as
> bundle size, fusion, whether shuffle is also checkpoint, etc. I think
> there's a lot of value in an antagonistic testing runner, which is
> something the Java DirectRunner tried to do with GBK random ordering,
> checking illegal mutations, checking encodability. These were all driven by
> real user needs and each caught a lot of user bugs. That said, I wouldn't
> want to maintain an extra runner, but would like to put these into a
> portable runner, whichever it is.
> > >>>
> > >>> Kenn
> > >>>
> > >>>>
> > >>>>
> > >>>> As for Python (and hopefully soon Go), it makes a lot of sense to
> have a simple to use and stable runner that can be used for local
> development. At the moment, the Py FnApiRunner seems the best candidate to
> serve as reference for portability.
> > >>>>
> > >>>> On a related note, we should probably also consider making pure
> Java pipeline execution via portability framework on a Java runner simpler
> and more efficient. We already use embedded environment for testing. If we
> also inline/embed the job server and this becomes readily available and
> easy to use, it might improve chances of other runners migrating to
> portability sooner.
> > >>>>
> > >>>> Thomas
> > >>>>
> > >>>> [1] https://s.apache.org/apache-beam-portability-support-table
> > >>>>
> > >>>>
> > >>>>
> > >>>> On Tue, Feb 12, 2019 at 3:34 AM Maximilian Michels <[email protected]>
> wrote:
> > >>>>>
> > >>>>> Do you consider job submission and artifact staging part of the
> > >>>>> ReferenceRunner? If so, these parts have been reused or served as a
> > >>>>> model for the portable FlinkRunner. So they had some value.
> > >>>>>
> > >>>>> A reference implementation helps Runner authors to understand and
> > >>>>> reuse the code. However, I agree that the Flink implementation is
> > >>>>> more helpful to Runner authors than a ReferenceRunner which was
> > >>>>> designed for single node testing.
> > >>>>>
> > >>>>> I think there are three parts which help to push forward
> portability:
> > >>>>>
> > >>>>> 1) Good library support for new portable Runners (Java)
> > >>>>> 2) A reference implementation of a distributed Runner (Flink)
> > >>>>> 3) An easy way for users to run/test portable Pipelines (Python via
> > >>>>> FnApiRunner)
> > >>>>>
> > >>>>> The main motivation for the portability layer is supporting
> > >>>>> languages in addition to Java. Most users will be using Python, so
> > >>>>> focusing on a good reference Runner in Python is key.
> > >>>>>
> > >>>>> -Max
> > >>>>>
> > >>>>> On 12.02.19 10:11, Robert Bradshaw wrote:
> > >>>>> > This is certainly an interesting question, and I definitely have
> my
> > >>>>> > opinions, but am curious as to what others think as well.
> > >>>>> >
> > >>>>> > One thing that I think wasn't as clear from the outset is
> distinguishing
> > >>>>> > between the development of runners/core-java and development of
> a Java
> > >>>>> > reference runner itself. With the work on moving Flink to
> > >>>>> > portability, it turned out that work on the latter was not a
> > >>>>> > prerequisite for work on the former, and runners/core-java is the
> > >>>>> > artifact that other runners want to build on. I think that it is
> also
> > >>>>> > the case, as suggested, that a distributed runner's use of this
> shared
> > >>>>> > library is a better reference point (for other distributed
> runners) than
> > >>>>> > one using the direct runner (e.g. there is a much more obvious
> > >>>>> > delineation between the runner's responsibility and Beam code
> than in
> > >>>>> > the direct runner where the boundaries between orchestration,
> execution,
> > >>>>> > and other concerns are not as clear).
> > >>>>> >
> > >>>>> > As well as serving as a reference to runner implementers, the
> reference
> > >>>>> > runner can also be useful for prototyping (here I think Python
> holds an
> > >>>>> > advantage, but we're getting into subjective areas now),
> documenting (or
> > >>>>> > ideally augmenting the documentation of) the spec (here I'd say a
> > >>>>> > smaller advantage to Python, but neither runner is clean, straightforward,
> > >>>>> > and documented enough to serve this purpose well yet), and
> serving as a
> > >>>>> > lightweight universal local runner against which to develop (and,
> > >>>>> > possibly use long term in place of a direct runner) new SDKs
> (here
> > >>>>> > you'll get a wide variety of answers whether Python or Java is
> easier to
> > >>>>> > take on as a dependency for a third language, or we could just
> package
> > >>>>> > it up in a docker image and take docker as a dependency).
> > >>>>> >
> > >>>>> > Another, more pragmatic note: one thing that helped move both the
> > >>>>> > Flink runner and the FnApiRunner forward is that they were driven by
> > >>>>> > actual use cases--Lyft has actual Python (necessitating portable)
> > >>>>> > pipelines they
> > >>>>> > want to run on Flink, and the FnApiRunner is the direct runner
> for
> > >>>>> > Python. The Java ULR (at least where it is now) sits in an
> awkward place
> > >>>>> > where its only role is to be a reference rather than be used,
> which (in
> > >>>>> > a world of limited resources) makes it harder to justify
> investment.
> > >>>>> >
> > >>>>> > - Robert
> > >>>>> >
> > >>>>> >
> > >>>>> >
> > >>>>> > On Tue, Feb 12, 2019 at 3:53 AM Kenneth Knowles <[email protected]
> > >>>>> > <mailto:[email protected]>> wrote:
> > >>>>> >
> > >>>>> >     Interesting silence here. You've got it right that the
> reason we
> > >>>>> >     initially chose Java was because of the cross-runner
> sharing. The
> > >>>>> >     reference runner could be the first target runner for any new
> > >>>>> >     feature and then its work could be directly (or indirectly, via
> > >>>>> >     copy/paste/modify if it works better) used in other runners.
> > >>>>> >     Examples:
> > >>>>> >
> > >>>>> >       - The implementations of (pre-portability) state & timers
> in
> > >>>>> >     runners/core-java and prototyped in the Java DirectRunner
> made it a
> > >>>>> >     matter of a couple of days to implement on other runners,
> and they
> > >>>>> >     saw pretty quick adoption.
> > >>>>> >       - Probably the same could be said for the first drafts of
> the
> > >>>>> >     runners, which re-used a bunch of runners/core-java and had
> each
> > >>>>> >     others' translation code as a reference.
> > >>>>> >
> > >>>>> >     I'm interested if anyone would be willing to confirm if it is
> > >>>>> >     because the FlinkRunner has forged ahead and the Dataflow
> worker is
> > >>>>> >     open source. It makes sense that the code from a distributed
> runner
> > >>>>> >     is an even better reference point if you are building another
> > >>>>> >     distributed runner. From the look of it, the SamzaRunner had
> no
> > >>>>> >     trouble getting started on portability.
> > >>>>> >
> > >>>>> >     Kenn
> > >>>>> >
> > >>>>> >     On Mon, Feb 11, 2019 at 6:04 PM Daniel Oliveira
> > >>>>> >     <[email protected] <mailto:[email protected]>>
> wrote:
> > >>>>> >
> > >>>>> >         Yeah, the FnApiRunner is what I'm leaning towards too. I
> wasn't
> > >>>>> >         sure how much demand there was for an actual reference
> > >>>>> >         implementation in Java though, so I was hoping there
> were runner
> > >>>>> >         authors that would want to chime in.
> > >>>>> >
> > >>>>> >         On the other hand, the Flink runner could serve as a
> reference
> > >>>>> >         implementation for portable features since it's further
> along,
> > >>>>> >         so maybe it's not an issue regardless.
> > >>>>> >
> > >>>>> >         On Mon, Feb 11, 2019 at 1:09 PM Sam Rohde <
> [email protected]
> > >>>>> >         <mailto:[email protected]>> wrote:
> > >>>>> >
> > >>>>> >             Thanks for starting this thread. If I had to guess,
> I would
> > >>>>> >             say there is more of a demand for Python as it's more
> > >>>>> >             widely used for data science/analytics. Being pragmatic,
> > >>>>> >             the FnApiRunner already has more feature work than the
> > >>>>> >             Java reference runner, so we should go with that.
> > >>>>> >
> > >>>>> >             -Sam
> > >>>>> >
> > >>>>> >             On Fri, Feb 8, 2019 at 10:07 AM Daniel Oliveira
> > >>>>> >             <[email protected] <mailto:
> [email protected]>> wrote:
> > >>>>> >
> > >>>>> >                 Hello Beam dev community,
> > >>>>> >
> > >>>>> >                 For those who don't know me, I work for Google
> and I've
> > >>>>> >                 been working on the Java reference runner, which
> is a
> > >>>>> >                 portable, local Java runner (it's basically the
> direct
> > >>>>> >                 runner with the portability APIs implemented).
> Our goal
> > >>>>> >                 in working on this was to have a portable runner
> which
> > >>>>> >                 ran locally so it could be used by users for
> testing
> > >>>>> >                 portable pipelines, devs for testing new
> features with
> > >>>>> >                 portability, and to provide runner authors with a simple
> > >>>>> >                 reference implementation of a portable runner.
> > >>>>> >
> > >>>>> >                 Due to various circumstances though, progress on
> the
> > >>>>> >                 Java reference runner has been pretty slow, and
> a Python
> > >>>>> >                 runner which does pretty much the same things
> was made
> > >>>>> >                 to aid portability development in Python (called
> the
> > >>>>> >                 FnApiRunner). This runner is currently further
> along in
> > >>>>> >                 feature work than the Java reference runner, so
> we've
> > >>>>> >                 been reevaluating if we should switch to
> investing in it
> > >>>>> >                 instead.
> > >>>>> >
> > >>>>> >                 My question to the community is: Which runner do
> you
> > >>>>> >                 think would be more valuable to the dev
> community and
> > >>>>> >                 Beam users? For those of you who are runner
> authors, do
> > >>>>> >                 you have a preference for what language you'd
> like to
> > >>>>> >                 see a reference implementation in?
> > >>>>> >
> > >>>>> >                 Thanks,
> > >>>>> >                 Daniel Oliveira
> > >>>>> >
>
