Do we currently maintain a finer-grained list of compatibility between execution engine/runner versions and Beam versions? Is this only really a concern with recent Flink (it sounded like at least the Spark jump, too)? I see the capability matrix: https://beam.apache.org/documentation/runners/capability-matrix/, but some sort of compatibility listing between runner versions and Beam releases might be useful.
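For illustration, even a tiny machine-readable mapping would go a long way; something like this sketch (the class name and version pairs below are placeholders for illustration, not verified claims about actual Beam/Flink compatibility):

```java
import java.util.Map;

/**
 * Sketch of a machine-readable runner-compatibility lookup.
 * The version pairs are hypothetical examples, not verified data.
 */
public class RunnerCompatibility {

  // Hypothetical "Beam release -> supported Flink line" table.
  private static final Map<String, String> BEAM_TO_FLINK =
      Map.of(
          "2.6.0", "1.5.x",
          "2.7.0", "1.5.x");

  /** Returns the Flink line a Beam release targets, or null if unknown. */
  public static String supportedFlinkLine(String beamVersion) {
    return BEAM_TO_FLINK.get(beamVersion);
  }
}
```

Published as a table in the docs (or queryable like this), it would let a user check up front whether their cluster version is known to work.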
I see the capability matrix covering Beam features, but nothing for the underlying runners. For example, such a listing would save a user from trying to get Beam working on a recent Flink 1.6 and subsequently hitting a (potentially not well documented) wall due to known issues.

On Sun, Sep 16, 2018 at 3:59 AM Maximilian Michels <[email protected]> wrote:

> > If I understand the LTS proposal correctly, then it will be a release
> > line that continues to receive patches (as in semantic versioning), but no
> > new features as that would defeat the purpose (stability).
>
> It matters insofar as execution engine upgrades could be performed in
> the master but the LTS version won't receive them. So LTS is the go-to
> if you want to ensure compatibility with your existing setup.
>
> > To limit the pain of dealing with incompatible runner changes and copies
> > within Beam, we should probably also work with the respective community to
> > improve the compatibility story.
>
> Absolutely. If we find that we can improve compatibility with upstream
> changes, we should go that path. Even if we don't have a dedicated
> compatibility layer upstream yet.
>
> On 13.09.18 19:34, Thomas Weise wrote:
> > On Thu, Sep 13, 2018 at 9:49 AM Maximilian Michels <[email protected]> wrote:
> > >
> > > Thank you for your comments. Let me try to summarize what has been
> > > discussed so far:
> > >
> > > 1. The Beam LTS version will ensure a stable execution engine for as
> > > long as the LTS life span.
> >
> > If I understand the LTS proposal correctly, then it will be a release
> > line that continues to receive patches (as in semantic versioning), but
> > no new features as that would defeat the purpose (stability).
> >
> > If so, then I don't think LTS matters for this discussion.
> >
> > > 2.
> > > We agree that pushing updates to the execution engine for the Runners
> > > is only desirable if it results in a better integration with the Beam
> > > model or if it is necessary due to security or performance reasons.
> > >
> > > 3. We might have to consider adding additional build targets for a
> > > Runner whenever the execution engine gets upgraded. This might be
> > > really easy if the engine's API remains stable. It might also be
> > > desirable if the upgrade path is not easy and not completely
> > > foreseeable, e.g. Etienne mentioned Spark 1.x vs Spark 2.x Runner. The
> > > Beam feature set could vary depending on the version.
> >
> > To limit the pain of dealing with incompatible runner changes and copies
> > within Beam, we should probably also work with the respective community
> > to improve the compatibility story.
> >
> > > 4. In the long run, we want a stable abstraction layer for each Runner
> > > that, ideally, is maintained by the upstream of the execution engine. In
> > > the short run, this is probably not realistic, as the shared libraries
> > > of Beam are not stable enough.
> >
> > Yes, that will only become an option once we reach interface stability.
> > Similar to how the runner projects maintain their IO connectors.
> >
> > > On 13.09.18 14:39, Robert Bradshaw wrote:
> > > > The ideal long-term solution is, as Romain mentions, pushing the
> > > > runner-specific code up to be maintained by each runner with a stable
> > > > API to use to talk to Beam. Unfortunately, I think we're still a long
> > > > way from having this stable API, or having the clout for
> > > > non-Beam-developers to maintain these bindings externally (though
> > > > hopefully we'll get there).
> > > >
> > > > In the short term, we're stuck with either hurting users that want to
> > > > stick with Flink 1.5, hurting users that want to upgrade to Flink 1.6,
> > > > or supporting both.
> > > > Is Beam's interaction with Flink such that we can't
> > > > simply have separate targets linking the same Beam code against one or
> > > > the other? (I.e. are code changes needed?) If so, we'll probably need a
> > > > flink-runner-1.5 module, a flink-runner-1.6, and a flink-runner-common
> > > > module. Or we hope that all users are happy with 1.5 until a certain
> > > > point in time when they all want to simultaneously jump to 1.6 and Beam
> > > > at the same time. Maybe that's enough in the short term, but longer term
> > > > we need a more sustainable solution.
> > > >
> > > > On Thu, Sep 13, 2018 at 7:13 AM Romain Manni-Bucau
> > > > <[email protected]> wrote:
> > > > >
> > > > > Hi guys,
> > > > >
> > > > > Isn't the issue "only" that Beam has this code instead of the engines?
> > > > >
> > > > > Assuming the Beam runner-facing API is stable (which must be the case
> > > > > anyway) and that each engine has its own integration (flink-beam
> > > > > instead of beam-runners-flink), then this issue disappears by
> > > > > construction.
> > > > >
> > > > > It also has the advantage of better maintenance.
> > > > >
> > > > > Side note: this is what happened with Arquillian; originally the
> > > > > community did all the adapter implementations, then each vendor took
> > > > > them back in house to make them better.
> > > > >
> > > > > Any way to work in that direction maybe?
> > > > >
> > > > > On Thu, Sep 13, 2018, 00:49 Thomas Weise <[email protected]> wrote:
> > > > > >
> > > > > > The main problem here is that users are forced to upgrade
> > > > > > infrastructure to obtain new features in Beam, even when those
> > > > > > features actually don't require such changes. As an example,
> > > > > > another update to Flink 1.6.0 was proposed (without supporting
> > > > > > new functionality in Beam) and we already know that it breaks
> > > > > > compatibility (again).
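As a concrete illustration of Robert's flink-runner-1.5 / flink-runner-1.6 / flink-runner-common split quoted above: the common module could code against a small shim interface, with one implementation per Flink line. All interface and class names here are made up for the sketch, not actual Beam code:

```java
/**
 * Hypothetical shim interface that a flink-runner-common module would
 * code against; per-version modules provide the implementation.
 */
interface FlinkApiShim {
  String flinkVersion();
  // Methods wrapping the API calls that changed between 1.5 and 1.6
  // would be declared here.
}

/** Would live in a hypothetical flink-runner-1.5 module. */
class Flink15Shim implements FlinkApiShim {
  @Override public String flinkVersion() { return "1.5"; }
}

/** Would live in a hypothetical flink-runner-1.6 module. */
class Flink16Shim implements FlinkApiShim {
  @Override public String flinkVersion() { return "1.6"; }
}

/** Version-independent runner code, parameterized by the shim. */
public class FlinkRunnerCommon {
  private final FlinkApiShim shim;

  public FlinkRunnerCommon(FlinkApiShim shim) { this.shim = shim; }

  public String describe() {
    return "Beam Flink runner targeting Flink " + shim.flinkVersion();
  }
}
```

The user would then pick the module matching their cluster, and the common code stays identical across both.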
> > > > > > I think that upgrading to a Flink X.Y.0 version isn't a good
> > > > > > idea to start with. But besides that, if we want to grow
> > > > > > adoption, then we need to focus on stability and delivering
> > > > > > improvements to Beam without disrupting users.
> > > > > >
> > > > > > In the specific case, ideally the surface of Flink would be
> > > > > > backward compatible, allowing us to stick to a minimum version
> > > > > > and be able to submit pipelines to Flink endpoints of higher
> > > > > > versions. Some work in that direction is underway (like
> > > > > > versioning the REST API). FYI, lowest common version is what
> > > > > > most projects that depend on Hadoop 2.x follow.
> > > > > >
> > > > > > Since Beam with a Flink 1.5.x client won't talk to Flink 1.6 and
> > > > > > there are code changes required to make it compile, we would
> > > > > > need to come up with a more involved strategy to support
> > > > > > multiple Flink versions. Till then, I would prefer we favor
> > > > > > existing users over short-lived experiments, which would mean
> > > > > > sticking with 1.5.x and not supporting 1.6.0.
> > > > > >
> > > > > > Thanks,
> > > > > > Thomas
> > > > > >
> > > > > > On Wed, Sep 12, 2018 at 1:15 PM Lukasz Cwik <[email protected]> wrote:
> > > > > > >
> > > > > > > As others have already suggested, I also believe LTS
> > > > > > > releases are the best we can do as a community right now,
> > > > > > > until portability allows us to decouple what a user writes
> > > > > > > with and how it runs (the SDK and the SDK environment) from
> > > > > > > the runner (job service + shared common runner libs +
> > > > > > > Flink/Spark/Dataflow/Apex/Samza/...).
> > > > > > >
> > > > > > > Dataflow would be highly invested in having the appropriate
> > > > > > > tooling within Apache Beam to support multiple SDK versions
> > > > > > > against a runner.
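The "lowest common version" submission policy Thomas describes above could be sketched as a simple client-side gate: the client is built against a minimum engine version and accepts any endpoint at or above it. This helper is hypothetical, not part of Beam:

```java
/**
 * Hypothetical client-side version gate implementing a
 * "lowest common version" policy: build against a minimum engine
 * version, accept any cluster at or above it.
 */
public class VersionGate {

  /** Compares dotted versions, e.g. "1.5.2" vs "1.6.0"; negative if a < b. */
  public static int compare(String a, String b) {
    String[] as = a.split("\\.");
    String[] bs = b.split("\\.");
    int n = Math.max(as.length, bs.length);
    for (int i = 0; i < n; i++) {
      int ai = i < as.length ? Integer.parseInt(as[i]) : 0;
      int bi = i < bs.length ? Integer.parseInt(bs[i]) : 0;
      if (ai != bi) {
        return Integer.compare(ai, bi);
      }
    }
    return 0;
  }

  /** Submit only if the cluster is at least our supported baseline. */
  public static boolean canSubmit(String clusterVersion, String minimumSupported) {
    return compare(clusterVersion, minimumSupported) >= 0;
  }
}
```

This only works once the engine's submission surface (e.g. a versioned REST API) is itself backward compatible, which is exactly the upstream work referenced above.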
> > > > > > > This in turn would allow people to use any
> > > > > > > SDK with any runner and, as Robert mentioned, certain
> > > > > > > optimizations and features would be disabled depending on
> > > > > > > the capabilities of the runner and the capabilities of the SDK.
> > > > > > >
> > > > > > > On Wed, Sep 12, 2018 at 6:38 AM Robert Bradshaw <[email protected]> wrote:
> > > > > > > >
> > > > > > > > The target audience is people who want to use the latest
> > > > > > > > Beam but do not want to use the latest version of the
> > > > > > > > runner, right?
> > > > > > > >
> > > > > > > > I think this will be somewhat (though not entirely)
> > > > > > > > addressed by Beam LTS releases, where those not wanting
> > > > > > > > to upgrade the runner at least have a well-supported
> > > > > > > > version of Beam. In the long term, we have the division
> > > > > > > >
> > > > > > > > Runner <-> BeamRunnerSpecificCode <-> CommonBeamRunnerLibs <-> SDK
> > > > > > > >
> > > > > > > > (which applies to job submission as well as execution).
> > > > > > > >
> > > > > > > > Insomuch as the BeamRunnerSpecificCode uses the public
> > > > > > > > APIs of the runner, hopefully upgrading the runner for
> > > > > > > > minor versions should be a no-op, and we can target the
> > > > > > > > lowest version of the runner that makes sense, allowing
> > > > > > > > the user to link against higher versions at his or her
> > > > > > > > discretion. We should provide build targets that allow
> > > > > > > > this. For major versions, it may make sense to have two
> > > > > > > > distinct BeamRunnerSpecificCode libraries (which may or
> > > > > > > > may not share some common code). I hope these wrappers
> > > > > > > > are not too thick.
> > > > > > > >
> > > > > > > > There is a tight coupling at the BeamRunnerSpecificCode
> > > > > > > > <-> CommonBeamRunnerLibs layer, but hopefully the bulk
> > > > > > > > of the code lives on the right-hand side and can be
> > > > > > > > updated as needed independent of the runner.
> > > > > > > > There may
> > > > > > > > be code of the form "if the runner supports X, do this
> > > > > > > > fast path; otherwise, do this slow path (or reject the
> > > > > > > > pipeline)".
> > > > > > > >
> > > > > > > > I hope the CommonBeamRunnerLibs <-> SDK coupling is
> > > > > > > > fairly loose, to the point that one could use SDKs from
> > > > > > > > different versions of Beam (or even developed outside of
> > > > > > > > Beam) with an older/newer runner. We may need to add
> > > > > > > > versioning to the Fn/Runner/Job API itself to support
> > > > > > > > this. Right now, of course, we're still in a pre-1.0,
> > > > > > > > rapid-development phase wrt this API.
> > > > > > > >
> > > > > > > > On Wed, Sep 12, 2018 at 2:10 PM Etienne Chauchot <[email protected]> wrote:
> > > > > > > > >
> > > > > > > > > Hi Max,
> > > > > > > > >
> > > > > > > > > I totally agree with your points, especially the
> > > > > > > > > users' priority (stick to the already-working
> > > > > > > > > version) and the need to leverage important new
> > > > > > > > > features. It is indeed a difficult balance to find.
> > > > > > > > >
> > > > > > > > > I can talk for a part I know: for the Spark runner,
> > > > > > > > > the aim was to support the Dataset native Spark API (in
> > > > > > > > > place of RDD). For that we needed to upgrade to
> > > > > > > > > Spark 2.x (and we will probably leverage Beam Row as
> > > > > > > > > well).
> > > > > > > > > But such an upgrade is a good amount of work, which
> > > > > > > > > makes it difficult to commit to a schedule such as
> > > > > > > > > "if there is a major new feature on an execution
> > > > > > > > > engine that we want to leverage, then the upgrade in
> > > > > > > > > Beam will be done within x months".
> > > > > > > > >
> > > > > > > > > Regarding your point on portability: decoupling the SDK
> > > > > > > > > from the runner with a runner harness and an SDK harness
> > > > > > > > > might make pipeline authors' work easy regarding
> > > > > > > > > pipeline maintenance. But, still, if we upgrade the
> > > > > > > > > runner libs, then users might have their runner
> > > > > > > > > harness not work with their engine version.
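Robert's "if the runner supports X" branching quoted above might look like the following in practice. The capability name and class are illustrative only, not taken from Beam:

```java
import java.util.Set;

/**
 * Sketch of capability-based dispatch: take a fast path when the runner
 * advertises a capability, otherwise fall back (or reject if the feature
 * is required). The capability string is a made-up example.
 */
public class CapabilityDispatch {

  public static String plan(Set<String> runnerCapabilities, boolean required) {
    if (runnerCapabilities.contains("SPLITTABLE_DOFN")) {
      return "fast-path";          // runner supports the feature natively
    } else if (required) {
      // No fallback exists: reject the pipeline up front.
      throw new UnsupportedOperationException("runner lacks required capability");
    } else {
      return "slow-path";          // generic fallback implementation
    }
  }
}
```

The appeal is that the decision lives entirely on the Beam side of the boundary, so adding a capability to a runner upgrades the plan without changing user code.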
> > > > > > > > > If such SDK/runner decoupling is 100% functional,
> > > > > > > > > then we could imagine having multiple runner
> > > > > > > > > harnesses shipping different versions of the runner
> > > > > > > > > libs to solve this problem.
> > > > > > > > > But we would need to support more than one version
> > > > > > > > > of the runner libs. We chose not to do this on the
> > > > > > > > > Spark runner.
> > > > > > > > >
> > > > > > > > > WDYT?
> > > > > > > > >
> > > > > > > > > Best
> > > > > > > > > Etienne
> > > > > > > > >
> > > > > > > > > On Tuesday, September 11, 2018 at 15:42 +0200, Maximilian Michels wrote:
> > > > > > > > > >
> > > > > > > > > > Hi Beamers,
> > > > > > > > > >
> > > > > > > > > > In the light of the discussion about Beam LTS releases, I'd like to kick
> > > > > > > > > > off a thread about how often we upgrade the execution engine of each
> > > > > > > > > > Runner. By upgrade, I mean major/minor versions which typically break
> > > > > > > > > > the binary compatibility of Beam pipelines.
> > > > > > > > > >
> > > > > > > > > > For the Flink Runner, we try to track the latest stable version. Some
> > > > > > > > > > users reported that this can be problematic, as it requires them to
> > > > > > > > > > potentially upgrade their Flink cluster with a new version of Beam.
> > > > > > > > > >
> > > > > > > > > > From a developer's perspective, it makes sense to migrate as early as
> > > > > > > > > > possible to the newest version of the execution engine, e.g. to leverage
> > > > > > > > > > the newest features. From a user's perspective, you don't care about the
> > > > > > > > > > latest features if your use case still works with Beam.
> > > > > > > > > >
> > > > > > > > > > We have to please both parties. So I'd suggest upgrading the execution
> > > > > > > > > > engine whenever necessary (e.g. critical new features, end of life of the
> > > > > > > > > > current version). On the other hand, the upcoming Beam LTS releases will
> > > > > > > > > > contain a longer-supported version.
> > > > > > > > > >
> > > > > > > > > > Maybe we don't need to discuss much about this, but I wanted to hear what
> > > > > > > > > > the community has to say about it. Particularly, I'd be interested in
> > > > > > > > > > how the other Runner authors intend to do it.
> > > > > > > > > > As far as I understand, with the portability being stable, we could
> > > > > > > > > > theoretically upgrade the SDK without upgrading the runtime components.
> > > > > > > > > > That would allow us to defer the upgrade for a longer time.
> > > > > > > > > >
> > > > > > > > > > Best,
> > > > > > > > > > Max
