Note that for the 2.0 release, we tracked this list of changes in JIRA under "backward-incompatible" labels.
https://issues.apache.org/jira/browse/BEAM-2427?jql=project%20%3D%20BEAM%20AND%20labels%20in%20(backwards-incompatible%2C%20backwards-compatibility%2C%20backward-incompatible)

We could do the same leading up to an eventual sprint towards producing 3.0.

On Wed, Mar 21, 2018 at 11:45 AM Robert Bradshaw <rober...@google.com> wrote:

> I, too, think it's way too early to move master to 3.0.0, especially if
> this involves reworking everything from the API to runners, possibly from
> scratch. Right now, I think the most important and urgent work is the many
> efforts underway to fully realize the portability story. I'm also concerned
> about fragmenting development effort (and the confusion for end users) if
> we have a separate 2.x line.
>
> That being said, there's much we could clean up and fix in Beam, and I
> think it's fine to have a discussion to see what we would want to do at any
> time. There have been a couple of discussions on this list, but it may be
> worth summarizing these in a doc (rather than having them only be scattered
> on the list). For each concrete item, it'd be worth clearly documenting
> what the problem is, proposal(s) for fixing it, and why this can't be done
> incrementally/in a backwards-compatible way (if it can, I'd rather get
> improvements in sooner and as part of a single mainline). Such a doc could
> also be illuminating as to why things are the way they are even when specific
> changes are not pursued (e.g. are things so due to historical accident or
> actual, if not obvious, constraints).
>
> - Robert
>
> On Wed, Mar 21, 2018 at 10:57 AM Romain Manni-Bucau <rmannibu...@gmail.com>
> wrote:
>
>> On 21 March 2018 at 18:25, "Lukasz Cwik" <lc...@google.com> wrote:
>>
>> I think it's premature to start a new major version at this point in time.
>> * Apache Beam 2.x series is less than a year old.
>> * Many features that users want can be built on top of the existing APIs.
>>
>>
>> OK, how do you see:
>>
>> 1. Stripping the classpath down to Beam-only jars up to the SDK core
>> (please no Jackson, no Snappy, no Avro, no Joda, to cite only the bothersome
>> deps) and avoiding fat jars... really fat ones
>> 2. Cleaning up the promoted API and hiding the stable but deprecated/old-fashioned
>> ones (SDF as the main extension point and hidden sources, for instance)
>> 3. Starting to define a clean lifecycle for any managed bean (linked to 2,
>> which gets rid of issues with sources and sinks, + serialization hooks to
>> ensure SDF can be replaced for serialization/environment purposes without
>> rewriting the bytecode)
>>
>> (These are just the issues I hit today, to say where they are coming from.)
>>
>> 3 can be worked around by supporting a coder on DoFn and sources/sinks
>> (assuming runners support it), but 1 and 2 require a design rethink and
>> some erasing to be doable.
>>
>>
>> I think the issue is that you maybe see Beam as part of a runtime, whereas it
>> is part of a library which is embedded by nature, so it needs to be light,
>> extremely extensible, and not define any non-mandatory concept like views.
>> Beam doesn't respect any of these rules today and fails at its goal of being a
>> portable and non-intrusive API for this reason IMHO. This is why I think it
>> is crucial to realize it now and restart correctly.
>>
>>
>> On Wed, Mar 21, 2018 at 1:31 AM Jean-Baptiste Onofré <j...@nanthrax.net>
>> wrote:
>>
>>> Hi,
>>>
>>> Starting from scratch is an option, but don't you think it's a huge
>>> effort?
>>> Anyway, we will reuse part of the existing codebase.
>>>
>>> Let's see what the team is thinking.
>>>
>>> Regards
>>> JB
>>>
>>> On 03/21/2018 09:26 AM, Romain Manni-Bucau wrote:
>>> >
>>> > 2018-03-21 9:13 GMT+01:00 Jean-Baptiste Onofré <j...@nanthrax.net>:
>>> >
>>> > Hi Romain,
>>> >
>>> > We didn't define a date yet.
>>> >
>>> > However, I think it makes a lot of sense to think about that.
>>> >
>>> > What about creating a beam-2.x branch and moving the master version to
>>> > 3.0.0-SNAPSHOT?
>>> >
>>> >
>>> > Do we want to "move" master, or start fresh to avoid paying again for the
>>> > legacy that prevents us from moving forward correctly ATM?
>>> > I really wonder if starting from scratch and only bringing in stable and
>>> > well-defined APIs wouldn't be saner after 10 months working with Beam.
>>> >
>>> > Most of the actual logic will be easily importable when needed (I'm
>>> > thinking of the runners), but just bumping the major version will keep the
>>> > same pitfalls in the codebase, which are very basic design issues IMHO.
>>> >
>>> >
>>> > I almost agree with your point, even if I would suggest you use a more
>>> > positive tone: being sharp never encourages the community or contributions
>>> > and doesn't motivate people. You can say things, but in a friendly form.
>>> >
>>> >
>>> > Read it as "I'm tired of always working around the API" ;). Generally I
>>> > send a mail after some hard fight and disappointment, so mea culpa for this
>>> > one.
>>> >
>>> >
>>> > I would add:
>>> >
>>> > - Schema or PCollection: it's already started, but I think we could make
>>> > some improvements (potentially introducing some API changes)
>>> > - Hints/Annotations on PCollection: it's something we discussed during Beam
>>> > Summit with Tyler and others. The idea is to mimic the Message Headers in
>>> > Apache Camel. It would allow us to have more dynamic IOs and transforms,
>>> > and give some additional statements to the runners.
>>> >
>>> >
>>> > +1
>>> >
>>> >
>>> > I'm proposing to start a vote to create the 2.x branch and move master to
>>> > Beam 3.0.0-SNAPSHOT as a joint effort.
>>> >
>>> > Regards
>>> > JB
>>> >
>>> > On 03/21/2018 08:36 AM, Romain Manni-Bucau wrote:
>>> > > Hi guys,
>>> > >
>>> > > it got mentioned but without any concrete dates: when will work on Beam 3
>>> > > be started?
>>> > >
>>> > > I'm very interested in:
>>> > >
>>> > > 1. reworking the whole DAG API to ensure it is instrumentable (today the
>>> > > DAG uses tons of static utilities and internals, which makes it not
>>> > > industrializable at all as soon as you are just on top of Beam)
>>> > > 2. reworking the API definition into its own module, not coupled to any
>>> > > implementation details (API/provider design) and 100% based on SDF
>>> > > 3. reworking the overall serialization (coders + transform serialization,
>>> > > which is hardcoded today and not portable or industrializable at all)
>>> > > 4. making runners properly decorable and not just forked each time you
>>> > > need to modify some behavior for a particular case
>>> > > (+ indeed all the issues we hit and saw on the list)
>>> > >
>>> > > Romain Manni-Bucau
>>> > > @rmannibucau <https://twitter.com/rmannibucau> | Blog
>>> > > <https://rmannibucau.metawerx.net/> | Old Blog
>>> > > <http://rmannibucau.wordpress.com> | Github
>>> > > <https://github.com/rmannibucau> | LinkedIn
>>> > > <https://www.linkedin.com/in/rmannibucau> | Book
>>> > > <https://www.packtpub.com/application-development/java-ee-8-high-performance>
>>> >
>>> > --
>>> > Jean-Baptiste Onofré
>>> > jbono...@apache.org
>>> > http://blog.nanthrax.net
>>> > Talend - http://www.talend.com
>>>
>>> --
>>> Jean-Baptiste Onofré
>>> jbono...@apache.org
>>> http://blog.nanthrax.net
>>> Talend - http://www.talend.com