Do we currently maintain a finer-grained list of compatibility between execution engine/runner versions and Beam versions? Is this only really a concern with recent Flink (it sounded like at least the Spark jump, too)? I see the capability matrix: https://beam.apache.org/documentation/runners/capability-matrix/, but some sort of compatibility listing between runner versions and Beam releases might be useful.
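For illustration, even a tiny machine-readable mapping would go a long way; something like this sketch (the class name and version pairs below are placeholders for illustration, not verified claims about actual Beam/Flink compatibility):

```java
import java.util.Map;

/**
 * Sketch of a machine-readable runner-compatibility lookup.
 * The version pairs are hypothetical examples, not verified data.
 */
public class RunnerCompatibility {

  // Hypothetical "Beam release -> supported Flink line" table.
  private static final Map<String, String> BEAM_TO_FLINK =
      Map.of(
          "2.6.0", "1.5.x",
          "2.7.0", "1.5.x");

  /** Returns the Flink line a Beam release targets, or null if unknown. */
  public static String supportedFlinkLine(String beamVersion) {
    return BEAM_TO_FLINK.get(beamVersion);
  }
}
```

Published as a table in the docs (or queryable like this), it would let a user check up front whether their cluster version is known to work.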
I see the capability matrix covering Beam features, but nothing for the underlying runners. For example, such a listing would save a user from trying to get Beam working on a recent Flink 1.6 and subsequently hitting a (potentially not well documented) wall due to known issues.

On Sun, Sep 16, 2018 at 3:59 AM Maximilian Michels <[email protected]> wrote:

> > If I understand the LTS proposal correctly, then it will be a release
> > line that continues to receive patches (as in semantic versioning), but no
> > new features as that would defeat the purpose (stability).
>
> It matters insofar as execution engine upgrades could be performed in
> the master but the LTS version won't receive them. So LTS is the go-to
> if you want to ensure compatibility with your existing setup.
>
> > To limit the pain of dealing with incompatible runner changes and copies
> > within Beam, we should probably also work with the respective community to
> > improve the compatibility story.
>
> Absolutely. If we find that we can improve compatibility with upstream
> changes, we should go that path. Even if we don't have a dedicated
> compatibility layer upstream yet.
>
> On 13.09.18 19:34, Thomas Weise wrote:
> > On Thu, Sep 13, 2018 at 9:49 AM Maximilian Michels <[email protected]> wrote:
> > >
> > > Thank you for your comments. Let me try to summarize what has been
> > > discussed so far:
> > >
> > > 1. The Beam LTS version will ensure a stable execution engine for as
> > > long as the LTS life span.
> >
> > If I understand the LTS proposal correctly, then it will be a release
> > line that continues to receive patches (as in semantic versioning), but
> > no new features as that would defeat the purpose (stability).
> >
> > If so, then I don't think LTS matters for this discussion.
> >
> > > 2.
> > > We agree that pushing updates to the execution engine for the Runners
> > > is only desirable if it results in a better integration with the Beam
> > > model or if it is necessary due to security or performance reasons.
> > >
> > > 3. We might have to consider adding additional build targets for a
> > > Runner whenever the execution engine gets upgraded. This might be
> > > really easy if the engine's API remains stable. It might also be
> > > desirable if the upgrade path is not easy and not completely
> > > foreseeable, e.g. Etienne mentioned Spark 1.x vs Spark 2.x Runner. The
> > > Beam feature set could vary depending on the version.
> >
> > To limit the pain of dealing with incompatible runner changes and copies
> > within Beam, we should probably also work with the respective community
> > to improve the compatibility story.
> >
> > > 4. In the long run, we want a stable abstraction layer for each Runner
> > > that, ideally, is maintained by the upstream of the execution engine. In
> > > the short run, this is probably not realistic, as the shared libraries
> > > of Beam are not stable enough.
> >
> > Yes, that will only become an option once we reach interface stability.
> > Similar to how the runner projects maintain their IO connectors.
> >
> > > On 13.09.18 14:39, Robert Bradshaw wrote:
> > > > The ideal long-term solution is, as Romain mentions, pushing the
> > > > runner-specific code up to be maintained by each runner with a stable
> > > > API to use to talk to Beam. Unfortunately, I think we're still a long
> > > > way from having this stable API, or having the clout for
> > > > non-Beam-developers to maintain these bindings externally (though
> > > > hopefully we'll get there).
> > > >
> > > > In the short term, we're stuck with either hurting users that want to
> > > > stick with Flink 1.5, hurting users that want to upgrade to Flink 1.6,
> > > > or supporting both.
> > > > Is Beam's interaction with Flink such that we can't
> > > > simply have separate targets linking the same Beam code against one or
> > > > the other? (I.e. are code changes needed?) If so, we'll probably need a
> > > > flink-runner-1.5 module, a flink-runner-1.6, and a flink-runner-common
> > > > module. Or we hope that all users are happy with 1.5 until a certain
> > > > point in time when they all want to simultaneously jump to 1.6 and Beam
> > > > at the same time. Maybe that's enough in the short term, but longer term
> > > > we need a more sustainable solution.
> > > >
> > > > On Thu, Sep 13, 2018 at 7:13 AM Romain Manni-Bucau
> > > > <[email protected]> wrote:
> > > > >
> > > > > Hi guys,
> > > > >
> > > > > Isn't the issue "only" that Beam has this code instead of the engines?
> > > > >
> > > > > Assuming the Beam runner-facing API is stable (which must be the case
> > > > > anyway) and that each engine has its own integration (flink-beam
> > > > > instead of beam-runners-flink), then this issue disappears by
> > > > > construction.
> > > > >
> > > > > It also has the advantage of better maintenance.
> > > > >
> > > > > Side note: this is what happened with Arquillian; originally the
> > > > > community did all the adapter implementations, then each vendor took
> > > > > them back in house to make them better.
> > > > >
> > > > > Any way to work in that direction maybe?
> > > > >
> > > > > On Thu, Sep 13, 2018, 00:49 Thomas Weise <[email protected]> wrote:
> > > > > >
> > > > > > The main problem here is that users are forced to upgrade
> > > > > > infrastructure to obtain new features in Beam, even when those
> > > > > > features actually don't require such changes. As an example,
> > > > > > another update to Flink 1.6.0 was proposed (without supporting
> > > > > > new functionality in Beam) and we already know that it breaks
> > > > > > compatibility (again).
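As a concrete illustration of Robert's flink-runner-1.5 / flink-runner-1.6 / flink-runner-common split quoted above: the common module could code against a small shim interface, with one implementation per Flink line. All interface and class names here are made up for the sketch, not actual Beam code:

```java
/**
 * Hypothetical shim interface that a flink-runner-common module would
 * code against; per-version modules provide the implementation.
 */
interface FlinkApiShim {
  String flinkVersion();
  // Methods wrapping the API calls that changed between 1.5 and 1.6
  // would be declared here.
}

/** Would live in a hypothetical flink-runner-1.5 module. */
class Flink15Shim implements FlinkApiShim {
  @Override public String flinkVersion() { return "1.5"; }
}

/** Would live in a hypothetical flink-runner-1.6 module. */
class Flink16Shim implements FlinkApiShim {
  @Override public String flinkVersion() { return "1.6"; }
}

/** Version-independent runner code, parameterized by the shim. */
public class FlinkRunnerCommon {
  private final FlinkApiShim shim;

  public FlinkRunnerCommon(FlinkApiShim shim) { this.shim = shim; }

  public String describe() {
    return "Beam Flink runner targeting Flink " + shim.flinkVersion();
  }
}
```

The user would then pick the module matching their cluster, and the common code stays identical across both.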
> > > > > > I think that upgrading to a Flink X.Y.0 version isn't a good
> > > > > > idea to start with. But besides that, if we want to grow
> > > > > > adoption, then we need to focus on stability and delivering
> > > > > > improvements to Beam without disrupting users.
> > > > > >
> > > > > > In the specific case, ideally the surface of Flink would be
> > > > > > backward compatible, allowing us to stick to a minimum version
> > > > > > and be able to submit pipelines to Flink endpoints of higher
> > > > > > versions. Some work in that direction is underway (like
> > > > > > versioning the REST API). FYI, lowest common version is what
> > > > > > most projects that depend on Hadoop 2.x follow.
> > > > > >
> > > > > > Since Beam with a Flink 1.5.x client won't talk to Flink 1.6 and
> > > > > > there are code changes required to make it compile, we would
> > > > > > need to come up with a more involved strategy to support
> > > > > > multiple Flink versions. Till then, I would prefer we favor
> > > > > > existing users over short-lived experiments, which would mean
> > > > > > sticking with 1.5.x and not supporting 1.6.0.
> > > > > >
> > > > > > Thanks,
> > > > > > Thomas
> > > > > >
> > > > > > On Wed, Sep 12, 2018 at 1:15 PM Lukasz Cwik <[email protected]> wrote:
> > > > > > >
> > > > > > > As others have already suggested, I also believe LTS
> > > > > > > releases are the best we can do as a community right now,
> > > > > > > until portability allows us to decouple what a user writes
> > > > > > > with and how it runs (the SDK and the SDK environment) from
> > > > > > > the runner (job service + shared common runner libs +
> > > > > > > Flink/Spark/Dataflow/Apex/Samza/...).
> > > > > > >
> > > > > > > Dataflow would be highly invested in having the appropriate
> > > > > > > tooling within Apache Beam to support multiple SDK versions
> > > > > > > against a runner.
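The "lowest common version" submission policy Thomas describes above could be sketched as a simple client-side gate: the client is built against a minimum engine version and accepts any endpoint at or above it. This helper is hypothetical, not part of Beam:

```java
/**
 * Hypothetical client-side version gate implementing a
 * "lowest common version" policy: build against a minimum engine
 * version, accept any cluster at or above it.
 */
public class VersionGate {

  /** Compares dotted versions, e.g. "1.5.2" vs "1.6.0"; negative if a < b. */
  public static int compare(String a, String b) {
    String[] as = a.split("\\.");
    String[] bs = b.split("\\.");
    int n = Math.max(as.length, bs.length);
    for (int i = 0; i < n; i++) {
      int ai = i < as.length ? Integer.parseInt(as[i]) : 0;
      int bi = i < bs.length ? Integer.parseInt(bs[i]) : 0;
      if (ai != bi) {
        return Integer.compare(ai, bi);
      }
    }
    return 0;
  }

  /** Submit only if the cluster is at least our supported baseline. */
  public static boolean canSubmit(String clusterVersion, String minimumSupported) {
    return compare(clusterVersion, minimumSupported) >= 0;
  }
}
```

This only works once the engine's submission surface (e.g. a versioned REST API) is itself backward compatible, which is exactly the upstream work referenced above.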
> > > > > > > This in turn would allow people to use any
> > > > > > > SDK with any runner and, as Robert mentioned, certain
> > > > > > > optimizations and features would be disabled depending on
> > > > > > > the capabilities of the runner and the capabilities of the SDK.
> > > > > > >
> > > > > > > On Wed, Sep 12, 2018 at 6:38 AM Robert Bradshaw <[email protected]> wrote:
> > > > > > > >
> > > > > > > > The target audience is people who want to use the latest
> > > > > > > > Beam but do not want to use the latest version of the
> > > > > > > > runner, right?
> > > > > > > >
> > > > > > > > I think this will be somewhat (though not entirely)
> > > > > > > > addressed by Beam LTS releases, where those not wanting
> > > > > > > > to upgrade the runner at least have a well-supported
> > > > > > > > version of Beam. In the long term, we have the division
> > > > > > > >
> > > > > > > > Runner <-> BeamRunnerSpecificCode <-> CommonBeamRunnerLibs <-> SDK
> > > > > > > >
> > > > > > > > (which applies to job submission as well as execution).
> > > > > > > >
> > > > > > > > Insomuch as the BeamRunnerSpecificCode uses the public
> > > > > > > > APIs of the runner, hopefully upgrading the runner for
> > > > > > > > minor versions should be a no-op, and we can target the
> > > > > > > > lowest version of the runner that makes sense, allowing
> > > > > > > > the user to link against higher versions at his or her
> > > > > > > > discretion. We should provide build targets that allow
> > > > > > > > this. For major versions, it may make sense to have two
> > > > > > > > distinct BeamRunnerSpecificCode libraries (which may or
> > > > > > > > may not share some common code). I hope these wrappers
> > > > > > > > are not too thick.
> > > > > > > >
> > > > > > > > There is a tight coupling at the BeamRunnerSpecificCode
> > > > > > > > <-> CommonBeamRunnerLibs layer, but hopefully the bulk
> > > > > > > > of the code lives on the right-hand side and can be
> > > > > > > > updated as needed independent of the runner.
> > > > > > > > There may
> > > > > > > > be code of the form "if the runner supports X, do this
> > > > > > > > fast path; otherwise, do this slow path (or reject the
> > > > > > > > pipeline)".
> > > > > > > >
> > > > > > > > I hope the CommonBeamRunnerLibs <-> SDK coupling is
> > > > > > > > fairly loose, to the point that one could use SDKs from
> > > > > > > > different versions of Beam (or even developed outside of
> > > > > > > > Beam) with an older/newer runner. We may need to add
> > > > > > > > versioning to the Fn/Runner/Job API itself to support
> > > > > > > > this. Right now, of course, we're still in a pre-1.0,
> > > > > > > > rapid-development phase wrt this API.
> > > > > > > >
> > > > > > > > On Wed, Sep 12, 2018 at 2:10 PM Etienne Chauchot <[email protected]> wrote:
> > > > > > > > >
> > > > > > > > > Hi Max,
> > > > > > > > >
> > > > > > > > > I totally agree with your points, especially the
> > > > > > > > > users' priority (stick to the already-working
> > > > > > > > > version) and the need to leverage important new
> > > > > > > > > features. It is indeed a difficult balance to find.
> > > > > > > > >
> > > > > > > > > I can talk for a part I know: for the Spark runner,
> > > > > > > > > the aim was to support the Dataset native Spark API (in
> > > > > > > > > place of RDD). For that we needed to upgrade to
> > > > > > > > > Spark 2.x (and we will probably leverage Beam Row as
> > > > > > > > > well).
> > > > > > > > > But such an upgrade is a good amount of work, which
> > > > > > > > > makes it difficult to commit to a schedule such as
> > > > > > > > > "if there is a major new feature on an execution
> > > > > > > > > engine that we want to leverage, then the upgrade in
> > > > > > > > > Beam will be done within x months".
> > > > > > > > >
> > > > > > > > > Regarding your point on portability: decoupling the SDK
> > > > > > > > > from the runner with a runner harness and an SDK harness
> > > > > > > > > might make pipeline authors' work easy regarding
> > > > > > > > > pipeline maintenance. But, still, if we upgrade the
> > > > > > > > > runner libs, then users might have their runner
> > > > > > > > > harness not work with their engine version.
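Robert's "if the runner supports X" branching quoted above might look like the following in practice. The capability name and class are illustrative only, not taken from Beam:

```java
import java.util.Set;

/**
 * Sketch of capability-based dispatch: take a fast path when the runner
 * advertises a capability, otherwise fall back (or reject if the feature
 * is required). The capability string is a made-up example.
 */
public class CapabilityDispatch {

  public static String plan(Set<String> runnerCapabilities, boolean required) {
    if (runnerCapabilities.contains("SPLITTABLE_DOFN")) {
      return "fast-path";          // runner supports the feature natively
    } else if (required) {
      // No fallback exists: reject the pipeline up front.
      throw new UnsupportedOperationException("runner lacks required capability");
    } else {
      return "slow-path";          // generic fallback implementation
    }
  }
}
```

The appeal is that the decision lives entirely on the Beam side of the boundary, so adding a capability to a runner upgrades the plan without changing user code.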
> > > > > > > > > If such SDK/runner decoupling is 100% functional,
> > > > > > > > > then we could imagine having multiple runner
> > > > > > > > > harnesses shipping different versions of the runner
> > > > > > > > > libs to solve this problem.
> > > > > > > > > But we would need to support more than one version
> > > > > > > > > of the runner libs. We chose not to do this on the
> > > > > > > > > Spark runner.
> > > > > > > > >
> > > > > > > > > WDYT?
> > > > > > > > >
> > > > > > > > > Best
> > > > > > > > > Etienne
> > > > > > > > >
> > > > > > > > > On Tuesday, September 11, 2018 at 15:42 +0200, Maximilian Michels wrote:
> > > > > > > > > >
> > > > > > > > > > Hi Beamers,
> > > > > > > > > >
> > > > > > > > > > In the light of the discussion about Beam LTS releases, I'd like to kick
> > > > > > > > > > off a thread about how often we upgrade the execution engine of each
> > > > > > > > > > Runner. By upgrade, I mean major/minor versions which typically break
> > > > > > > > > > the binary compatibility of Beam pipelines.
> > > > > > > > > >
> > > > > > > > > > For the Flink Runner, we try to track the latest stable version. Some
> > > > > > > > > > users reported that this can be problematic, as it requires them to
> > > > > > > > > > potentially upgrade their Flink cluster with a new version of Beam.
> > > > > > > > > >
> > > > > > > > > > From a developer's perspective, it makes sense to migrate as early as
> > > > > > > > > > possible to the newest version of the execution engine, e.g. to leverage
> > > > > > > > > > the newest features. From a user's perspective, you don't care about the
> > > > > > > > > > latest features if your use case still works with Beam.
> > > > > > > > > >
> > > > > > > > > > We have to please both parties. So I'd suggest upgrading the execution
> > > > > > > > > > engine whenever necessary (e.g. critical new features, end of life of the
> > > > > > > > > > current version). On the other hand, the upcoming Beam LTS releases will
> > > > > > > > > > contain a longer-supported version.
> > > > > > > > > >
> > > > > > > > > > Maybe we don't need to discuss much about this, but I wanted to hear what
> > > > > > > > > > the community has to say about it. Particularly, I'd be interested in
> > > > > > > > > > how the other Runner authors intend to do it.
> > > > > > > > > > As far as I understand, with the portability being stable, we could
> > > > > > > > > > theoretically upgrade the SDK without upgrading the runtime components.
> > > > > > > > > > That would allow us to defer the upgrade for a longer time.
> > > > > > > > > >
> > > > > > > > > > Best,
> > > > > > > > > > Max
