Seems fine to me. At some point we might want to do an audit of existing
Jira issues, because I suspect there are issues that should be targeted to
3.0 but are not yet tagged.

On Sun, Feb 4, 2018 at 11:41 AM, Jean-Baptiste Onofré <j...@nanthrax.net>
wrote:

> I would prefer to use Jira, with "wish"/"ideas", and adding Beam 3.0.0
> version.
>
> WDYT ?
>
> Regards
> JB
>
> On 02/04/2018 07:55 PM, Reuven Lax wrote:
> > Do we have a good place to track the items for Beam 3.0, or is Jira the
> best
> > place? Romain has a good point - if this gets forgotten when we do Beam
> 3.0,
> > then we're stuck waiting around till Beam 4.0.
> >
> > Reuven
> >
> > On Sun, Feb 4, 2018 at 9:27 AM, Jean-Baptiste Onofré <j...@nanthrax.net
> > <mailto:j...@nanthrax.net>> wrote:
> >
> >     That's a good point. In the roadmap for Beam 3, I think it makes
> sense to add a
> >     point about this.
> >
> >     Regards
> >     JB
> >
> >     On 02/04/2018 06:18 PM, Eugene Kirpichov wrote:
> >     > I think doing a change that would break pipeline update for every
> single user of
> >     > Flink and Dataflow needs to be postponed until a next major
> version. Pipeline
> >     > update is a very frequently used feature, especially by the
> largest users. We've
> >     > had those users get significantly upset even when we accidentally
> broke update
> >     > compatibility for some special cases of individual transforms;
> breaking it
> >     > intentionally and project-wide is too extreme to be justified by
> the benefits of
> >     > the current change.
> >     >
> >     > That said, I think concerns about coder APIs are reasonable, and
> it is
> >     > unfortunate that we effectively can't make changes to them right
> now. It would
> >     > be great if in the next major version we were better prepared for
> evolution of
> >     > coders, e.g. by having coders support a version marker or
> something like that,
> >     > with an API for detecting the version of data on wire and reading
> or writing
> >     > data of an old version. Such a change (introducing versioning)
> would also, of
> >     > course, be incompatible and would need to be postponed until a
> major version -
> >     > but, at least, subsequent changes wouldn't.
> >     >
> >     > ...And as I was typing this email, seems that this is what the
> thread already
> >     > came to!
> >     >
> >     > On Sun, Feb 4, 2018 at 9:16 AM Romain Manni-Bucau <
> rmannibu...@gmail.com <mailto:rmannibu...@gmail.com>
> >     > <mailto:rmannibu...@gmail.com <mailto:rmannibu...@gmail.com>>>
> wrote:
> >     >
> >     >     I like this idea of migration support at coder level. It would
> require to
> >     >     add a metadata in all outputs which would represent the
> version then coders
> >     >     can handle the logic properly depending the version - we can
> assume a coder
> >     >     dev upgrade the version when he breaks the representation I
> hope ;).
> >     >     With this: no runner impact at all :).
> >     >
> >     >
> >     >     Romain Manni-Bucau
> >     >     @rmannibucau <https://twitter.com/rmannibucau
> >     <https://twitter.com/rmannibucau>> |  Blog
> >     >     <https://rmannibucau.metawerx.net/
> >     <https://rmannibucau.metawerx.net/>> | Old Blog
> >     >     <http://rmannibucau.wordpress.com <
> http://rmannibucau.wordpress.com>>
> >     | Github
> >     >     <https://github.com/rmannibucau <https://github.com/
> rmannibucau>> |
> >     LinkedIn
> >     >     <https://www.linkedin.com/in/rmannibucau
> >     <https://www.linkedin.com/in/rmannibucau>> | Book
> >     >
> >      <https://www.packtpub.com/application-development/java-
> ee-8-high-performance <https://www.packtpub.com/
> application-development/java-ee-8-high-performance>>
> >     >
> >     >     2018-02-04 18:09 GMT+01:00 Reuven Lax <re...@google.com
> <mailto:re...@google.com>
> >     >     <mailto:re...@google.com <mailto:re...@google.com>>>:
> >     >
> >     >         It would already break quite a number of users at this
> point.
> >     >
> >     >         I think what we should be doing is moving forward on the
> snapshot/update
> >     >         proposal. That proposal actually provides a way forward
> when coders
> >     >         change (it proposes a way to map an old snapshot to one
> using the new
> >     >         coder, so changes to coders in the future will be much
> easier to make.
> >     >         However much of the implementation for this will likely be
> at the runner
> >     >         level, not the SDK level.
> >     >
> >     >         Reuven
> >     >
> >     >         On Sun, Feb 4, 2018 at 9:04 AM, Romain Manni-Bucau
> >     >         <rmannibu...@gmail.com <mailto:rmannibu...@gmail.com>
> >     <mailto:rmannibu...@gmail.com <mailto:rmannibu...@gmail.com>>>
> wrote:
> >     >
> >     >             I fully understand that, and this is one of the reason
> managing to
> >     >             solve these issues is very important and ASAP. My
> conclusion is that
> >     >             we must break it now to avoid to do it later when
> usage will be way
> >     >             more developped - I would be very happy to be wrong on
> that point -
> >     >             so I started this PR and this thread. We can postpone
> it but it
> >     >             would break later so for probably more users.
> >     >
> >     >
> >     >             Romain Manni-Bucau
> >     >             @rmannibucau <https://twitter.com/rmannibucau
> >     <https://twitter.com/rmannibucau>> |  Blog
> >     >             <https://rmannibucau.metawerx.net/
> >     <https://rmannibucau.metawerx.net/>> | Old Blog
> >     >             <http://rmannibucau.wordpress.com
> >     <http://rmannibucau.wordpress.com>> | Github
> >     >             <https://github.com/rmannibucau
> >     <https://github.com/rmannibucau>> | LinkedIn
> >     >             <https://www.linkedin.com/in/rmannibucau
> >     <https://www.linkedin.com/in/rmannibucau>> | Book
> >     >
> >      <https://www.packtpub.com/application-development/java-
> ee-8-high-performance <https://www.packtpub.com/
> application-development/java-ee-8-high-performance>>
> >     >
> >     >             2018-02-04 17:49 GMT+01:00 Reuven Lax <
> re...@google.com <mailto:re...@google.com>
> >     >             <mailto:re...@google.com <mailto:re...@google.com>>>:
> >     >
> >     >                 Unfortunately several runners (at least Flink and
> Dataflow)
> >     >                 support in-place update of streaming pipelines as
> a key feature,
> >     >                 and changing coder format breaks this. This is a
> very important
> >     >                 feature of both runners, and we should endeavor
> not to break them.
> >     >
> >     >                 In-place snapshot and update is also a top-level
> Beam proposal
> >     >                 that was received positively, though neither of
> those runners
> >     >                 yet implement the proposed interface.
> >     >
> >     >                 Reuven
> >     >
> >     >                 On Sun, Feb 4, 2018 at 8:44 AM, Romain Manni-Bucau
> >     >                 <rmannibu...@gmail.com <mailto:
> rmannibu...@gmail.com>
> >     <mailto:rmannibu...@gmail.com <mailto:rmannibu...@gmail.com>>>
> wrote:
> >     >
> >     >                     Sadly yes, and why the PR is actually WIP. As
> mentionned it
> >     >                     modifies it and requires some updates in other
> languages and
> >     >                     the standard_coders.yml file (I didn't find
> how this file
> >     >                     was generated).
> >     >                     Since coders must be about volatile data I
> don't think it is
> >     >                     a big deal to change it though.
> >     >
> >     >
> >     >                     Romain Manni-Bucau
> >     >                     @rmannibucau <https://twitter.com/rmannibucau
> >     <https://twitter.com/rmannibucau>> |  Blog
> >     >                     <https://rmannibucau.metawerx.net/
> >     <https://rmannibucau.metawerx.net/>> | Old Blog
> >     >                     <http://rmannibucau.wordpress.com
> >     <http://rmannibucau.wordpress.com>> | Github
> >     >                     <https://github.com/rmannibucau
> >     <https://github.com/rmannibucau>> | LinkedIn
> >     >                     <https://www.linkedin.com/in/rmannibucau
> >     <https://www.linkedin.com/in/rmannibucau>> | Book
> >     >
> >      <https://www.packtpub.com/application-development/java-
> ee-8-high-performance <https://www.packtpub.com/
> application-development/java-ee-8-high-performance>>
> >     >
> >     >                     2018-02-04 17:34 GMT+01:00 Reuven Lax <
> re...@google.com <mailto:re...@google.com>
> >     >                     <mailto:re...@google.com <mailto:
> re...@google.com>>>:
> >     >
> >     >                         One question - does this change the actual
> byte encoding
> >     >                         of elements? We've tried hard not to do
> that so far for
> >     >                         reasons of compatibility.
> >     >
> >     >                         Reuven
> >     >
> >     >                         On Sun, Feb 4, 2018 at 6:44 AM, Romain
> Manni-Bucau
> >     >                         <rmannibu...@gmail.com
> >     <mailto:rmannibu...@gmail.com> <mailto:rmannibu...@gmail.com
> >     <mailto:rmannibu...@gmail.com>>>
> >     >                         wrote:
> >     >
> >     >                             Hi guys,
> >     >
> >     >                             I submitted a PR on coders to enhance
> 1. the user
> >     >                             experience 2. the determinism and
> handling of
> >     coders.
> >     >
> >     >                             1. the user experience is linked to
> what i
> >     sent some
> >     >                             days ago: close handling of the
> streams from a
> >     coder
> >     >                             code. Long story short I add a
> SkipCloseCoder
> >     which
> >     >                             can decorate a coder and just wraps
> the stream
> >     >                             (input or output) in flavors skipping
> close()
> >     calls.
> >     >                             This avoids to do it by default (which
> had my
> >     >                             preference if you read the related
> thread but not
> >     >                             the one of everybody) but also makes
> the usage
> >     of a
> >     >                             coder with this issue easy since the
> of() of the
> >     >                             coder just wraps itself in this
> delagating coder.
> >     >
> >     >                             2. this one is more nasty and mainly
> concerns
> >     >                             IterableLikeCoders. These ones use
> this kind of
> >     >                             algorithm (keep in mind they work on a
> list):
> >     >
> >     >                             writeSize()
> >     >                             for all element e {
> >     >                                 elementCoder.write(e)
> >     >                             }
> >     >                             writeMagicNumber() // this one depends
> the size
> >     >
> >     >                             The decoding is symmetric so I bypass
> it here.
> >     >
> >     >                             Indeed all these writes (reads) are
> done on
> >     the same
> >     >                             stream. Therefore it assumes you read
> as much
> >     bytes
> >     >                             than you write...which is a huge
> assumption for a
> >     >                             coder which should by contract assume
> it can read
> >     >                             the stream...as a stream (until -1).
> >     >
> >     >                             The idea of the fix is to change this
> encoding to
> >     >                             this kind of algorithm:
> >     >
> >     >                             writeSize()
> >     >                             for all element e {
> >     >                                 writeElementByteCount(e)
> >     >                                 elementCoder.write(e)
> >     >                             }
> >     >                             writeMagicNumber() // still optionally
> >     >
> >     >                             This way on the decode size you can
> wrap the
> >     stream
> >     >                             by element to enforce the limitation
> of the
> >     byte count.
> >     >
> >     >                             Side note: this indeed enforce a
> limitation due to
> >     >                             java byte limitation but if you check
> coder
> >     code it
> >     >                             is already here at the higher level so
> it is not a
> >     >                             big deal for now.
> >     >
> >     >                             In terms of implementation it uses a
> >     >                             LengthAwareCoder which delegates to
> another coder
> >     >                             the encoding and just adds the byte
> count
> >     before the
> >     >                             actual serialization. Not perfect but
> should
> >     be more
> >     >                             than enough in terms of support and
> perf for
> >     beam if
> >     >                             you think real pipelines (we try to
> avoid
> >     >                             serializations or it is done on some
> well known
> >     >                             points where this algo should be
> >     enough...worse case
> >     >                             it is not a huge overhead, mainly just
> some memory
> >     >                             overhead).
> >     >
> >     >
> >     >                             The PR is available
> >     >                             at https://github.com/apache/
> beam/pull/4594
> >     <https://github.com/apache/beam/pull/4594>. If you
> >     >                             check you will see I put it "WIP". The
> main reason
> >     >                             is that it changes the encoding format
> for
> >     >                             containers (lists, iterable, ...) and
> therefore
> >     >                             breaks python/go/... tests and the
> >     >                             standard_coders.yml definition. Some
> help on that
> >     >                             would be very welcomed.
> >     >
> >     >                             Technical side note if you
> >     >                             wonder: UnownedInputStream doesn't
> even allow to
> >     >                             mark the stream so there is no real
> fast way
> >     to read
> >     >                             the stream as fast as possible with
> standard
> >     >                             buffering strategies and to support
> this automatic
> >     >                             IterableCoder wrapping which is
> implicit. In other
> >     >                             words, if beam wants to support any
> coder,
> >     including
> >     >                             the ones not requiring to write the
> size of the
> >     >                             output - most of the codecs - then we
> need to
> >     change
> >     >                             the way it works to something like
> that which does
> >     >                             it for the user which doesn't know its
> coder got
> >     >                             wrapped.
> >     >
> >     >                             Hope it makes sense, if not, don't
> hesitate to ask
> >     >                             questions.
> >     >
> >     >                             Happy end of week-end.
> >     >
> >     >                             Romain Manni-Bucau
> >     >                             @rmannibucau <https://twitter.com/
> rmannibucau
> >     <https://twitter.com/rmannibucau>> |
> >     >                              Blog <https://rmannibucau.metawerx.
> net/
> >     <https://rmannibucau.metawerx.net/>> | Old Blog
> >     >                             <http://rmannibucau.wordpress.com
> >     <http://rmannibucau.wordpress.com>> | Github
> >     >                             <https://github.com/rmannibucau
> >     <https://github.com/rmannibucau>> | LinkedIn
> >     >                             <https://www.linkedin.com/in/
> rmannibucau
> >     <https://www.linkedin.com/in/rmannibucau>> | Book
> >     >
> >      <https://www.packtpub.com/application-development/java-
> ee-8-high-performance <https://www.packtpub.com/
> application-development/java-ee-8-high-performance>>
> >     >
> >     >
> >     >
> >     >
> >     >
> >     >
> >     >
> >
> >     --
> >     Jean-Baptiste Onofré
> >     jbono...@apache.org <mailto:jbono...@apache.org>
> >     http://blog.nanthrax.net
> >     Talend - http://www.talend.com
> >
> >
>
> --
> Jean-Baptiste Onofré
> jbono...@apache.org
> http://blog.nanthrax.net
> Talend - http://www.talend.com
>

Reply via email to