I, too, think it's way too early to move master to 3.0.0, especially if this
involves reworking everything from the API to runners, possibly from
scratch. Right now, I think the most important and urgent work is the many
efforts underway to fully realize the portability story. I'm also concerned
about fragmenting development effort (and the confusion for end users) if
we have a separate 2.x line.

That being said, there's much we could clean up and fix in Beam, and I think
it's fine to have a discussion to see what we would want to do at any time.
There have been a couple of discussions on this list, but it may be worth
summarizing these in a doc (rather than having them only be scattered on
the list). For each concrete item, it'd be worth clearly documenting what
the problem is, proposal(s) for fixing it, and why this can't be done
incrementally/in a backwards compatible way (if it can, I'd rather get
improvements in sooner and as part of a single mainline). Such a doc could
also be illuminating as to why things are the way they are even when specific
changes are not pursued (e.g. are things so due to historical accident or
actual, if not obvious, constraints).

- Robert



On Wed, Mar 21, 2018 at 10:57 AM Romain Manni-Bucau <rmannibu...@gmail.com>
wrote:

>
>
> On 21 March 2018 at 18:25, "Lukasz Cwik" <lc...@google.com> wrote:
>
> I think it's premature to start a new major version at this point in time.
> * Apache Beam 2.x series is less than a year old.
> * Many features that users want can be built on top of the existing APIs.
>
>
> Oki, how do you see:
>
> 1. Stripping the classpath down to Beam-only jars up to the SDK core
> (please no jackson, no snappy, no avro, no joda, to cite only the
> bothersome deps) and avoiding fat jars... really fat ones
> 2. Cleaning up the promoted API and hiding the stable but
> deprecated/old-fashioned ones (SDF as the main extension point and hidden
> sources, for instance)
> 3. Starting to define a clean lifecycle for any managed bean (linked to 2,
> which gets rid of issues with sources and sinks, plus serialization hooks to
> ensure SDF can be replaced for serialization/environment purposes without
> rewriting the bytecode)
>
> (These are just the issues I hit today, to say where they are coming from)
>
> 3 can be worked around by supporting a coder on DoFn and sources/sinks
> (assuming runners support it), but 1 and 2 require a design rethink and
> some erasing to be doable.
>
>
> I think the issue is that you may see Beam as part of a runtime, whereas it
> is part of a library, which is embedded by nature, so it needs to be light,
> extremely extensible, and must not define any non-mandatory concept, like
> views. Beam doesn't respect any of these rules today and fails at its goal
> of being a portable and non-intrusive API for this reason, IMHO. This is why
> I think it is crucial to realize it now and restart correctly.
>
>
>
>
> On Wed, Mar 21, 2018 at 1:31 AM Jean-Baptiste Onofré <j...@nanthrax.net>
> wrote:
>
>> Hi,
>>
>> Starting from scratch is an option, but don't you think it's a huge
>> effort?
>> Anyway, we will reuse part of the existing codebase.
>>
>> Let's see what the team is thinking.
>>
>> Regards
>> JB
>>
>> On 03/21/2018 09:26 AM, Romain Manni-Bucau wrote:
>> >
>> >
>> > 2018-03-21 9:13 GMT+01:00 Jean-Baptiste Onofré <j...@nanthrax.net
>> > <mailto:j...@nanthrax.net>>:
>> >
>> >     Hi Romain,
>> >
>> >     We didn't define a date yet.
>> >
>> >     However, I think it makes a lot of sense to think about that.
>> >
>> >     What about creating a beam-2.x branch and moving the master version to
>> >     3.0.0-SNAPSHOT?
>> >
>> >
>> > Do we want to "move" master or start fresh, to avoid paying again for the
>> > legacy that prevents us from moving forward correctly ATM?
>> > I really wonder if starting from scratch and only bringing over stable and
>> > well-defined APIs wouldn't be saner after 10 months working with Beam.
>> >
>> > Most of the actual logic will be easily importable when needed (I'm
>> > thinking of the runners), but just bumping the major version will keep the
>> > same pitfalls in the codebase, which are very basic design issues IMHO.
>> >
>> >
>> >
>> >     I mostly agree with your point, though I would suggest you use a more
>> >     positive tone: being sharp never encourages the community or
>> >     contributions and doesn't motivate people. You can say things, but in
>> >     a friendly form.
>> >
>> >
>> > Read it as what it is: "I'm tired of always working around the API" ;).
>> > Generally I send a mail after some hard fight and disappointment, so mea
>> > culpa for this one.
>> >
>> >
>> >
>> >     I would add:
>> >
>> >     - Schema or PCollection: it's already started but I think we could
>> >     make some improvements (potentially introducing some API changes)
>> >     - Hints/Annotations on PCollection: it's something we discussed
>> >     during Beam Summit with Tyler and others. The idea is to mimic the
>> >     Message Headers in Apache Camel. It would allow us to have more
>> >     dynamic IOs and transforms, and give some additional statements to
>> >     the runners.
>> >
>> >
>> > +1
>> >
>> >
>> >
>> >     I'm proposing to start a vote to create the 2.x branch and move
>> >     master to Beam 3.0.0-SNAPSHOT as a joint effort.
>> >
>> >     Regards
>> >     JB
>> >
>> >     On 03/21/2018 08:36 AM, Romain Manni-Bucau wrote:
>> >     > Hi guys,
>> >     >
>> >     > it got mentioned but without any concrete dates: when will Beam 3
>> >     > work be started?
>> >     >
>> >     > I'm very interested in:
>> >     >
>> >     > 1. reworking the whole DAG API to ensure it is instrumentable
>> >     > (today the DAG uses tons of static utilities and internals, which
>> >     > makes it not industrializable at all as soon as you are just on top
>> >     > of Beam)
>> >     > 2. reworking the API definition in its own module, not coupled to
>> >     > any implementation details (api/provider design) and 100% based on
>> >     > the SDF
>> >     > 3. reworking the overall serialization (coders + transform
>> >     > serialization, which is hardcoded today and not portable or
>> >     > industrializable at all)
>> >     > 4. making runners properly decorable and not just forked each time
>> >     > you need to modify some behavior for a particular case
>> >     > (+ indeed all the issues we hit and saw on the list)
>> >     >
>> >     > Romain Manni-Bucau
>> >     > @rmannibucau <https://twitter.com/rmannibucau> |
>> >     > Blog <https://rmannibucau.metawerx.net/> |
>> >     > Old Blog <http://rmannibucau.wordpress.com> |
>> >     > Github <https://github.com/rmannibucau> |
>> >     > LinkedIn <https://www.linkedin.com/in/rmannibucau> |
>> >     > Book <https://www.packtpub.com/application-development/java-ee-8-high-performance>
>> >
>> >     --
>> >     Jean-Baptiste Onofré
>> >     jbono...@apache.org <mailto:jbono...@apache.org>
>> >     http://blog.nanthrax.net
>> >     Talend - http://www.talend.com
>> >
>> >
>>
>> --
>> Jean-Baptiste Onofré
>> jbono...@apache.org
>> http://blog.nanthrax.net
>> Talend - http://www.talend.com
>>
>
>
