Re: [Discuss] Upgrade story for Beam's execution engines

Thomas Weise Thu, 13 Sep 2018 10:35:09 -0700

On Thu, Sep 13, 2018 at 9:49 AM Maximilian Michels <[email protected]> wrote:


> Thank you for your comments. Let me try to summarize what has been
> discussed so far:
>
> 1. The Beam LTS version will ensure a stable execution engine for as
> long as the LTS life span.
>

If I understand the LTS proposal correctly, then it will be a release line
that continues to receive patches (as in semantic versioning), but no new
features as that would defeat the purpose (stability).

If so, then I don't think LTS matters for this discussion.

2. We agree that pushing updates to the execution engine for the Runners
> is only desirable if it results in a better integration with the Beam
> model or if it is necessary due security or performance reasons.
>
> 3. We might have to consider adding additional build targets for a
> Runner for whenever the execution engine gets upgraded. This might be
> really easy if the engine's API remains stable. It might also be
> desirable if the upgrade path is not easy and not completely
> foreseeable, e.g. Etienne mentioned Spark 1.x vs Spark 2.x Runner. The
> Beam feature set could vary depending on the version.
>

To limit the pain of dealing with incompatible runner changes and copies
within Beam, we should probably also work with the respective community to
improve the compatibility story.


> 4. In the long run, we want a stable abstraction layer for each Runner
> that, ideally, is maintained by the upstream of the execution engine. In
> the short run, this is probably not realistic, as the shared libraries
> of Beam are not stable enough.
>

Yes, that will only become an option once we reach interface stability.
Similar to how the runner projects maintain their IO connectors.


> On 13.09.18 14:39, Robert Bradshaw wrote:
> > The ideal long-term solution is, as Romain mentions, pushing the
> > runner-specific code up to be maintained by each runner with a stable
> > API to use to talk to Beam. Unfortunately, I think we're still a long
> > way from having this Stable API, or having the clout for
> > non-beam-developers to maintain these bindings externally (though
> > hopefully we'll get there).
> >
> > In the short term, we're stuck with either hurting users that want to
> > stick with Flink 1.5, hurting users that want to upgrade to Flink 1.6,
> > or supporting both. Is Beam's interaction with Flink such that we can't
> > simply have separate targets linking the same Beam code against one or
> > the other? (I.e. are code changes needed?) If so, we'll probably need a
> > flink-runner-1.5 module, a flink-runner-1.6, and a flink-runner-common
> > module. Or we hope that all users are happy with 1.5 until a certain
> > point in time when they all want to simultaneously jump to 1.6 and Beam
> > at the same time. Maybe that's enough in the short term, but longer term
> > we need a more sustainable solution.
> >
> >
> > On Thu, Sep 13, 2018 at 7:13 AM Romain Manni-Bucau
> > <[email protected] <mailto:[email protected]>> wrote:
> >
> >     Hi guys,
> >
> >     Isnt the issue "only" that beam has this code instead of engines?
> >
> >     Assuming beam runner facing api is stable - which must be the case
> >     anyway - and that each engine has its integration (flink-beam
> >     instead of beam-runners-flink), then this issue disappears by
> >     construction.
> >
> >     It also has the advantage to have a better maintenance.
> >
> >     Side note: this is what happent which arquillian, originally the
> >     community did all adapters impl then each vendor took it back in
> >     house to make it better.
> >
> >     Any way to work in that direction maybe?
> >
> >     Le jeu. 13 sept. 2018 00:49, Thomas Weise <[email protected]
> >     <mailto:[email protected]>> a écrit :
> >
> >         The main problem here is that users are forced to upgrade
> >         infrastructure to obtain new features in Beam, even when those
> >         features actually don't require such changes. As an example,
> >         another update to Flink 1.6.0 was proposed (without supporting
> >         new functionality in Beam) and we already know that it breaks
> >         compatibility (again).
> >
> >         I think that upgrading to a Flink X.Y.0 version isn't a good
> >         idea to start with. But besides that, if we want to grow
> >         adoption, then we need to focus on stability and delivering
> >         improvements to Beam without disrupting users.
> >
> >         In the specific case, ideally the surface of Flink would be
> >         backward compatible, allowing us to stick to a minimum version
> >         and be able to submit pipelines to Flink endpoints of higher
> >         versions. Some work in that direction is underway (like
> >         versioning the REST API). FYI, lowest common version is what
> >         most projects that depend on Hadoop 2.x follow.
> >
> >         Since Beam with Flink 1.5.x client won't talk to Flink 1.6 and
> >         there are code changes required to make it compile, we would
> >         need to come up with a more involved strategy to support
> >         multiple Flink versions. Till then, I would prefer we favor
> >         existing users over short lived experiments, which would mean
> >         stick with 1.5.x and not support 1.6.0.
> >
> >         Thanks,
> >         Thomas
> >
> >
> >         On Wed, Sep 12, 2018 at 1:15 PM Lukasz Cwik <[email protected]
> >         <mailto:[email protected]>> wrote:
> >
> >             As others have already suggested, I also believe LTS
> >             releases is the best we can do as a community right now
> >             until portability allows us to decouple what a user writes
> >             with and how it runs (the SDK and the SDK environment) from
> >             the runner (job service + shared common runner libs +
> >             Flink/Spark/Dataflow/Apex/Samza/...).
> >
> >             Dataflow would be highly invested in having the appropriate
> >             tooling within Apache Beam to support multiple SDK versions
> >             against a runner. This in turn would allow people to use any
> >             SDK with any runner and as Robert had mentioned, certain
> >             optimizations and features would be disabled depending on
> >             the capabilities of the runner and the capabilities of the
> SDK.
> >
> >
> >
> >             On Wed, Sep 12, 2018 at 6:38 AM Robert Bradshaw
> >             <[email protected] <mailto:[email protected]>> wrote:
> >
> >                 The target audience is people who want to use the latest
> >                 Beam but do not want to use the latest version of the
> >                 runner, right?
> >
> >                 I think this will be somewhat (though not entirely)
> >                 addressed by Beam LTS releases, where those not wanting
> >                 to upgrade the runner at least have a well-supported
> >                 version of Beam. In the long term, we have the division
> >
> >                      Runner <-> BeamRunnerSpecificCode <->
> >                 CommonBeamRunnerLibs <-> SDK.
> >
> >                 (which applies to the job submission as well as
> execution).
> >
> >                 Insomuch as the BeamRunnerSpecificCode uses the public
> >                 APIs of the runner, hopefully upgrading the runner for
> >                 minor versions should be a no-op, and we can target the
> >                 lowest version of the runner that makes sense, allowing
> >                 the user to link against higher versions at his or her
> >                 discretion. We should provide built targets that allow
> >                 this. For major versions, it may make sense to have two
> >                 distinct BeamRunnerSpecificCode libraries (which may or
> >                 may not share some common code). I hope these wrappers
> >                 are not too thick.
> >
> >                 There is a tight coupling at the BeamRunnerSpecificCode
> >                 <-> CommonBeamRunnerLibs layer, but hopefully the bulk
> >                 of the code lives on the right hand side and can be
> >                 updated as needed independent of the runner. There may
> >                 be code of the form "if the runner supports X, do this
> >                 fast path, otherwise, do this slow path (or reject the
> >                 pipeline).
> >
> >                 I hope the CommonBeamRunnerLibs <-> SDK coupling is
> >                 fairly loose, to the point that one could use SDKs from
> >                 different versions of Beam (or even developed outside of
> >                 Beam) with an older/newer runner. We may need to add
> >                 versioning to the Fn/Runner/Job API itself to support
> >                 this. Right now of course we're still in a pre-1.0,
> >                 rapid-development phase wrt this API.
> >
> >
> >
> >
> >                 On Wed, Sep 12, 2018 at 2:10 PM Etienne Chauchot
> >                 <[email protected] <mailto:[email protected]>>
> wrote:
> >
> >                     Hi Max,
> >
> >                     I totally agree with your points especially the
> >                     users priorities (stick to the already working
> >                     version) , and the need to leverage important new
> >                     features. It is indeed a difficult balance to find .
> >
> >                     I can talk for a part I know: for the Spark runner,
> >                     the aim was to support Dataset native spark API (in
> >                     place of RDD). For that we needed to upgrade to
> >                     spark 2.x (and we will probably leverage Beam Row as
> >                     well).
> >                     But such an upgrade is a good amount of work which
> >                     makes it difficult to commit on a schedule such as
> >                     "if there is a major new feature on an execution
> >                     engine that we want to leverage, then the upgrade in
> >                     Beam will be done within x months".
> >
> >                     Regarding your point on portability : decoupling SDK
> >                     from runner with runner harness and SDK harness
> >                     might make pipeline authors work easy regarding
> >                     pipeline maintenance. But, still, if we upgrade
> >                     runner libs, then the users might have their runner
> >                     harness not work with their engine version.
> >                     If such SDK/runner decoupling is 100% functional,
> >                     then we could imaging having multiple runner
> >                     harnesses shipping different versions of the runner
> >                     libs to solve this problem.
> >                     But we would need to support more than one version
> >                     of the runner libs. We chose not to do this on spark
> >                     runner.
> >
> >                     WDYT ?
> >
> >                     Best
> >                     Etienne
> >
> >
> >                     Le mardi 11 septembre 2018 à 15:42 +0200, Maximilian
> >                     Michels a écrit :
> >>                     Hi Beamers,
> >>
> >>                     In the light of the discussion about Beam LTS
> releases, I'd like to kick
> >>                     off a thread about how often we upgrade the
> execution engine of each
> >>                     Runner. By upgrade, I mean major/minor versions
> which typically break
> >>                     the binary compatibility of Beam pipelines.
> >>
> >>                     For the Flink Runner, we try to track the latest
> stable version. Some
> >>                     users reported that this can be problematic, as it
> requires them to
> >>                     potentially upgrade their Flink cluster with a new
> version of Beam.
> >>
> >>                       From a developer's perspective, it makes sense to
> migrate as early as
> >>                     possible to the newest version of the execution
> engine, e.g. to leverage
> >>                     the newest features. From a user's perspective, you
> don't care about the
> >>                     latest features if your use case still works with
> Beam.
> >>
> >>                     We have to please both parties. So I'd suggest to
> upgrade the execution
> >>                     engine whenever necessary (e.g. critical new
> features, end of life of
> >>                     current version). On the other hand, the upcoming
> Beam LTS releases will
> >>                     contain a longer-supported version.
> >>
> >>                     Maybe we don't need to discuss much about this but
> I wanted to hear what
> >>                     the community has to say about it. Particularly,
> I'd be interested in
> >>                     how the other Runner authors intend to do it.
> >>
> >>                     As far as I understand, with the portability being
> stable, we could
> >>                     theoretically upgrade the SDK without upgrading the
> runtime components.
> >>                     That would allow us to defer the upgrade for a
> longer time.
> >>
> >>                     Best,
> >>                     Max
> >>
>

Re: [Discuss] Upgrade story for Beam's execution engines

Reply via email to