Re: [Discuss] Upgrade story for Beam's execution engines

Maximilian Michels Mon, 17 Sep 2018 06:30:37 -0700

> To clarify, there are two classes of OldRunner users here: those thatwant new features in Beam, and those that simply want to run a supportedversion of Beam. The LTS proposal helps the latter, which is going to bebiased towards those not upgrading their runners. The former are worthsupporting as well.

Agree. Apart from the LTS, we want to minimize upgrade pain for new Beamversions as much as possible.


On 17.09.18 09:30, Robert Bradshaw wrote:

On Sun, Sep 16, 2018 at 12:59 PM Maximilian Michels <[email protected]<mailto:[email protected]>> wrote:


     > If I understand the LTS proposal correctly, then it will be a
    release line that continues to receive patches (as in semantic
    versioning), but no new features as that would defeat the purpose
    (stability).

    It matters insofar, as execution engine upgrades could be performed in
    the master but the LTS version won't receive them. So LTS is the go-to
    if you want to ensure compatibility with your existing setup.

To clarify, there are two classes of OldRunner users here: those thatwant new features in Beam, and those that simply want to run a supportedversion of Beam. The LTS proposal helps the latter, which is going to bebiased towards those not upgrading their runners. The former are worthsupporting as well.


     > To limit the pain of dealing with incompatible runner changes and
    copies within Beam, we should probably also work with the respective
    community to improve the compatibility story.

    Absolutely. If we find that we can improve compatibility with upstream
    changes, we should go that path. Even if we don't have a dedicated
    compatibility layer upstream yet.

    On 13.09.18 19:34, Thomas Weise wrote:
     >
     > On Thu, Sep 13, 2018 at 9:49 AM Maximilian Michels
    <[email protected] <mailto:[email protected]>
     > <mailto:[email protected] <mailto:[email protected]>>> wrote:
     >
     >     Thank you for your comments. Let me try to summarize what has
    been
     >     discussed so far:
     >
     >     1. The Beam LTS version will ensure a stable execution engine
    for as
     >     long as the LTS life span.
     >
     >
     > If I understand the LTS proposal correctly, then it will be a
    release
     > line that continues to receive patches (as in semantic
    versioning), but
     > no new features as that would defeat the purpose (stability).
     >
     > If so, then I don't think LTS matters for this discussion.
     >
     >     2. We agree that pushing updates to the execution engine for the
     >     Runners
     >     is only desirable if it results in a better integration with
    the Beam
     >     model or if it is necessary due security or performance reasons.
     >
     >     3. We might have to consider adding additional build targets
    for a
     >     Runner for whenever the execution engine gets upgraded. This
    might be
     >     really easy if the engine's API remains stable. It might also be
     >     desirable if the upgrade path is not easy and not completely
     >     foreseeable, e.g. Etienne mentioned Spark 1.x vs Spark 2.x
    Runner. The
     >     Beam feature set could vary depending on the version.
     >
     >
     > To limit the pain of dealing with incompatible runner changes and
    copies
     > within Beam, we should probably also work with the respective
    community
     > to improve the compatibility story.
     >
     >
     >     4. In the long run, we want a stable abstraction layer for
    each Runner
     >     that, ideally, is maintained by the upstream of the execution
     >     engine. In
     >     the short run, this is probably not realistic, as the shared
    libraries
     >     of Beam are not stable enough.
     >
     >
     > Yes, that will only become an option once we reach interface
    stability.
     > Similar to how the runner projects maintain their IO connectors.
     >
     >     On 13.09.18 14:39, Robert Bradshaw wrote:
     >      > The ideal long-term solution is, as Romain mentions,
    pushing the
     >      > runner-specific code up to be maintained by each runner with a
     >     stable
     >      > API to use to talk to Beam. Unfortunately, I think we're
    still a
     >     long
     >      > way from having this Stable API, or having the clout for
     >      > non-beam-developers to maintain these bindings externally
    (though
     >      > hopefully we'll get there).
     >      >
     >      > In the short term, we're stuck with either hurting users that
     >     want to
     >      > stick with Flink 1.5, hurting users that want to upgrade
    to Flink
     >     1.6,
     >      > or supporting both. Is Beam's interaction with Flink such
    that we
     >     can't
     >      > simply have separate targets linking the same Beam code
    against
     >     one or
     >      > the other? (I.e. are code changes needed?) If so, we'll
    probably
     >     need a
     >      > flink-runner-1.5 module, a flink-runner-1.6, and a
     >     flink-runner-common
     >      > module. Or we hope that all users are happy with 1.5 until
    a certain
     >      > point in time when they all want to simultaneously jump to 1.6
     >     and Beam
     >      > at the same time. Maybe that's enough in the short term, but
     >     longer term
     >      > we need a more sustainable solution.
     >      >
     >      >
     >      > On Thu, Sep 13, 2018 at 7:13 AM Romain Manni-Bucau
     >      > <[email protected] <mailto:[email protected]>
    <mailto:[email protected] <mailto:[email protected]>>
     >     <mailto:[email protected] <mailto:[email protected]>
    <mailto:[email protected] <mailto:[email protected]>>>> wrote:
     >      >
     >      >     Hi guys,
     >      >
     >      >     Isnt the issue "only" that beam has this code instead
    of engines?
     >      >
     >      >     Assuming beam runner facing api is stable - which must
    be the
     >     case
     >      >     anyway - and that each engine has its integration
    (flink-beam
     >      >     instead of beam-runners-flink), then this issue
    disappears by
     >      >     construction.
     >      >
     >      >     It also has the advantage to have a better maintenance.
     >      >
     >      >     Side note: this is what happent which arquillian,
    originally the
     >      >     community did all adapters impl then each vendor took
    it back in
     >      >     house to make it better.
     >      >
     >      >     Any way to work in that direction maybe?
     >      >
     >      >     Le jeu. 13 sept. 2018 00:49, Thomas Weise
    <[email protected] <mailto:[email protected]>
     >     <mailto:[email protected] <mailto:[email protected]>>
     >      >     <mailto:[email protected] <mailto:[email protected]>
    <mailto:[email protected] <mailto:[email protected]>>>> a écrit :
     >      >
     >      >         The main problem here is that users are forced to
    upgrade
     >      >         infrastructure to obtain new features in Beam,
    even when
     >     those
     >      >         features actually don't require such changes. As
    an example,
     >      >         another update to Flink 1.6.0 was proposed (without
     >     supporting
     >      >         new functionality in Beam) and we already know
    that it breaks
     >      >         compatibility (again).
     >      >
     >      >         I think that upgrading to a Flink X.Y.0 version
    isn't a good
     >      >         idea to start with. But besides that, if we want
    to grow
     >      >         adoption, then we need to focus on stability and
    delivering
     >      >         improvements to Beam without disrupting users.
     >      >
     >      >         In the specific case, ideally the surface of Flink
    would be
     >      >         backward compatible, allowing us to stick to a minimum
     >     version
     >      >         and be able to submit pipelines to Flink endpoints
    of higher
     >      >         versions. Some work in that direction is underway
    (like
     >      >         versioning the REST API). FYI, lowest common
    version is what
     >      >         most projects that depend on Hadoop 2.x follow.
     >      >
     >      >         Since Beam with Flink 1.5.x client won't talk to Flink
     >     1.6 and
     >      >         there are code changes required to make it
    compile, we would
     >      >         need to come up with a more involved strategy to
    support
     >      >         multiple Flink versions. Till then, I would prefer
    we favor
     >      >         existing users over short lived experiments, which
    would mean
     >      >         stick with 1.5.x and not support 1.6.0.
     >      >
     >      >         Thanks,
     >      >         Thomas
     >      >
     >      >
     >      >         On Wed, Sep 12, 2018 at 1:15 PM Lukasz Cwik
     >     <[email protected] <mailto:[email protected]>
    <mailto:[email protected] <mailto:[email protected]>>
     >      >         <mailto:[email protected] <mailto:[email protected]>
    <mailto:[email protected] <mailto:[email protected]>>>> wrote:
     >      >
     >      >             As others have already suggested, I also
    believe LTS
     >      >             releases is the best we can do as a community
    right now
     >      >             until portability allows us to decouple what a
    user
     >     writes
     >      >             with and how it runs (the SDK and the SDK
     >     environment) from
     >      >             the runner (job service + shared common runner
    libs +
     >      >             Flink/Spark/Dataflow/Apex/Samza/...).
     >      >
     >      >             Dataflow would be highly invested in having the
     >     appropriate
     >      >             tooling within Apache Beam to support multiple SDK
     >     versions
     >      >             against a runner. This in turn would allow
    people to
     >     use any
     >      >             SDK with any runner and as Robert had
    mentioned, certain
     >      >             optimizations and features would be disabled
    depending on
     >      >             the capabilities of the runner and the
    capabilities
     >     of the SDK.
     >      >
     >      >
     >      >
     >      >             On Wed, Sep 12, 2018 at 6:38 AM Robert Bradshaw
     >      >             <[email protected]
    <mailto:[email protected]> <mailto:[email protected]
    <mailto:[email protected]>>
     >     <mailto:[email protected] <mailto:[email protected]>
    <mailto:[email protected] <mailto:[email protected]>>>> wrote:
     >      >
     >      >                 The target audience is people who want to
    use the
     >     latest
     >      >                 Beam but do not want to use the
    latest version of the
     >      >                 runner, right?
     >      >
     >      >                 I think this will be somewhat (though not
    entirely)
     >      >                 addressed by Beam LTS releases, where
    those not
     >     wanting
     >      >                 to upgrade the runner at least have a
    well-supported
     >      >                 version of Beam. In the long term, we have the
     >     division
     >      >
     >      >                      Runner <-> BeamRunnerSpecificCode <->
     >      >                 CommonBeamRunnerLibs <-> SDK.
     >      >
     >      >                 (which applies to the job submission as
    well as
     >     execution).
     >      >
     >      >                 Insomuch as the BeamRunnerSpecificCode
    uses the
     >     public
     >      >                 APIs of the runner, hopefully upgrading the
     >     runner for
     >      >                 minor versions should be a no-op, and we can
     >     target the
     >      >                 lowest version of the runner that makes sense,
     >     allowing
     >      >                 the user to link against higher versions
    at his
     >     or her
     >      >                 discretion. We should provide built
    targets that
     >     allow
     >      >                 this. For major versions, it may make sense to
     >     have two
     >      >                 distinct BeamRunnerSpecificCode libraries
    (which
     >     may or
     >      >                 may not share some common code). I hope these
     >     wrappers
     >      >                 are not too thick.
     >      >
     >      >                 There is a tight coupling at
     >     the BeamRunnerSpecificCode
     >      >                 <-> CommonBeamRunnerLibs layer, but
    hopefully the
     >     bulk
     >      >                 of the code lives on the right hand side
    and can be
     >      >                 updated as needed independent of the runner.
     >     There may
     >      >                 be code of the form "if the runner
    supports X, do
     >     this
     >      >                 fast path, otherwise, do this slow path (or
     >     reject the
     >      >                 pipeline).
     >      >
     >      >                 I hope the CommonBeamRunnerLibs <-> SDK
    coupling is
     >      >                 fairly loose, to the point that one could use
     >     SDKs from
     >      >                 different versions of Beam (or even developed
     >     outside of
     >      >                 Beam) with an older/newer runner. We may
    need to add
     >      >                 versioning to the Fn/Runner/Job API itself
    to support
     >      >                 this. Right now of course we're still in a
    pre-1.0,
     >      >                 rapid-development phase wrt this API.
     >      >
     >      >
     >      >
     >      >
     >      >                 On Wed, Sep 12, 2018 at 2:10 PM Etienne
    Chauchot
     >      >                 <[email protected]
    <mailto:[email protected]>
     >     <mailto:[email protected] <mailto:[email protected]>>
    <mailto:[email protected] <mailto:[email protected]>
     >     <mailto:[email protected] <mailto:[email protected]>>>>
    wrote:
     >      >
     >      >                     Hi Max,
     >      >
     >      >                     I totally agree with your points
    especially the
     >      >                     users priorities (stick to the already
    working
     >      >                     version) , and the need to leverage
    important new
     >      >                     features. It is indeed a difficult
    balance to
     >     find .
     >      >
     >      >                     I can talk for a part I know: for the
    Spark
     >     runner,
     >      >                     the aim was to support Dataset native
    spark
     >     API (in
     >      >                     place of RDD). For that we needed to
    upgrade to
     >      >                     spark 2.x (and we will probably
    leverage Beam
     >     Row as
     >      >                     well).
     >      >                     But such an upgrade is a good amount
    of work
     >     which
     >      >                     makes it difficult to commit on a schedule
     >     such as
     >      >                     "if there is a major new feature on an
    execution
     >      >                     engine that we want to leverage, then the
     >     upgrade in
     >      >                     Beam will be done within x months".
     >      >
     >      >                     Regarding your point on portability :
     >     decoupling SDK
     >      >                     from runner with runner harness and
    SDK harness
     >      >                     might make pipeline authors work easy
    regarding
     >      >                     pipeline maintenance. But, still, if
    we upgrade
     >      >                     runner libs, then the users might have
    their
     >     runner
     >      >                     harness not work with their engine
    version.
     >      >                     If such SDK/runner decoupling is 100%
    functional,
     >      >                     then we could imaging having multiple
    runner
     >      >                     harnesses shipping different versions
    of the
     >     runner
     >      >                     libs to solve this problem.
     >      >                     But we would need to support more than one
     >     version
     >      >                     of the runner libs. We chose not to do
    this
     >     on spark
     >      >                     runner.
     >      >
     >      >                     WDYT ?
     >      >
     >      >                     Best
     >      >                     Etienne
     >      >
     >      >
     >      >                     Le mardi 11 septembre 2018 à 15:42 +0200,
     >     Maximilian
     >      >                     Michels a écrit :
     >      >>                     Hi Beamers,
     >      >>
     >      >>                     In the light of the discussion about Beam
     >     LTS releases, I'd like to kick
     >      >>                     off a thread about how often we
    upgrade the
     >     execution engine of each
     >      >>                     Runner. By upgrade, I mean major/minor
     >     versions which typically break
     >      >>                     the binary compatibility of Beam
    pipelines.
     >      >>
     >      >>                     For the Flink Runner, we try to track the
     >     latest stable version. Some
     >      >>                     users reported that this can be
    problematic,
     >     as it requires them to
     >      >>                     potentially upgrade their Flink
    cluster with
     >     a new version of Beam.
     >      >>
     >      >>                       From a developer's perspective, it
    makes
     >     sense to migrate as early as
     >      >>                     possible to the newest version of the
     >     execution engine, e.g. to leverage
     >      >>                     the newest features. From a user's
     >     perspective, you don't care about the
     >      >>                     latest features if your use case
    still works
     >     with Beam.
     >      >>
     >      >>                     We have to please both parties. So I'd
     >     suggest to upgrade the execution
     >      >>                     engine whenever necessary (e.g.
    critical new
     >     features, end of life of
     >      >>                     current version). On the other hand, the
     >     upcoming Beam LTS releases will
     >      >>                     contain a longer-supported version.
     >      >>
     >      >>                     Maybe we don't need to discuss much about
     >     this but I wanted to hear what
     >      >>                     the community has to say about it.
     >     Particularly, I'd be interested in
     >      >>                     how the other Runner authors intend
    to do it.
     >      >>
     >      >>                     As far as I understand, with the
    portability
     >     being stable, we could
     >      >>                     theoretically upgrade the SDK without
     >     upgrading the runtime components.
     >      >>                     That would allow us to defer the
    upgrade for
     >     a longer time.
     >      >>
     >      >>                     Best,
     >      >>                     Max
     >      >>
     >

Re: [Discuss] Upgrade story for Beam's execution engines

Reply via email to