+1 for Ismaël's suggestion. A sample implementation can be found here:
https://github.com/flavray/avro-rs. Gonna have a detailed look at it :D




*One more radical idea I would like to try is to unify a bit the
implementations, probably having a robust low-level one in one systems
language (C or Rust) and bindings for all the languages that rely on
it, but this is probably more because of my frustration at seeing projects
that take this approach slowly becoming the standard and Apache Avro
relegated (this is already happening on the Python front).*


*--------------------------------------------------------------------------------------------------*
*Anh Le (Andy)*

*Data Engineer that loves big systems, history & entrepreneurship*

*Email*    : anhl...@gmail.com
*Skype*   : anh_steiner
*Blog*      : https://bigsonata.com/


On Tue, Apr 28, 2020 at 3:01 PM Ismaël Mejía <ieme...@gmail.com> wrote:

> Huge +1 to recover the Avro Enhancement Proposals (AEP)
>
> The experimental features Ryan mentioned definitely merit(ed) to be
> part of it, and in particular the procedure to decide when they will
> become ‘stable’ or default, for example for fastread. Also other
> proposals/discussions like the split release or semantic versioning
> should be part of it.
>
> About Avro 2.0.0, I think breaking binary compatibility of the format
> is going to prove to be a hard sell (are named unions valuable enough
> to break backwards compatibility?). If we can extend the binary format
> in a compatible way, there is no reason to have 2.0.0. So I agree that
> there is a delicate balance to strike, because strict stability
> could also leave us ostracized.
>
> What I personally would like is to make Avro as lean and efficient as
> possible and focus mostly on the binary format part and tools, probably
> removing the less used parts (IPC/RPC/trevni), so it is good to see
> that other people are starting to agree on that.
>
> One more radical idea I would like to try is to unify a bit the
> implementations, probably having a robust low-level one in one systems
> language (C or Rust) and bindings for all the languages that rely on
> it, but this is probably more because of my frustration at seeing
> projects that take this approach slowly becoming the standard and
> Apache Avro relegated (this is already happening on the Python front).
>
> In general, the critical issue with Avro is the downstream
> consequences of our actions. Of course we will always have
> incomplete information, but we can investigate and see if changes are
> worth it.
>
> Regards,
> Ismaël
>
> On Mon, Apr 27, 2020 at 6:51 PM Ryan Skraba <r...@skraba.com> wrote:
> >
> > Hello!
> >
> > You bring up some good points -- I'm glad Avro is so widely used, but
> > it does make me nervous to see any changes that might break other
> > projects, or change any behaviour.
> >
> > Currently, we've talked about managing developer expectations with
> > semantic versioning (especially with the necessary Jackson API cleanup
> > that happened in 1.9.x), or versioning artifacts separately.
> >
> > We also have a couple of experimental/feature flags for some behaviour
> > changes:
> > https://cwiki.apache.org/confluence/display/AVRO/Experimental+features+in+Avro
> >
> > And there is already a page for Avro Enhancement Proposals that looks
> > largely out of date:
> > https://cwiki.apache.org/confluence/display/AVRO/Avro+Enhancement+Proposals
> >
> > Moving some of the extras to a separate repo brings many of the same
> > problems as versioning artifacts separately (nobody wants to deal with
> > a compatibility matrix).  I'm definitely not against it, but I'm not
> > sure how it would improve the situation.
> >
> > There's a fine line between being extremely stable and being
> > paralyzed! I would be enthusiastic about any process changes that
> > would help us encourage and adopt new features (and fixes) more
> > quickly.
> >
> > All my best, Ryan
> >
> >
> > On Sun, Apr 26, 2020 at 11:18 AM Driesprong, Fokko <fo...@driesprong.frl> wrote:
> > >
> > > Hi Andy,
> > >
> > > Thanks for reaching out. Sorry for not being so active in the community
> > > lately.
> > >
> > > Since Avro 1.8.2 there has been some activity on the repository again,
> > > fixing stuff like security issues and migrating to later versions of
> > > Java. Avro has been around for 10 years now, and I would like to keep
> > > (some) backward compatibility to make sure that people are still going
> > > to use it for another 10 years :) In the past, the idea was to keep the
> > > format backward compatible, but this did not extend to the Java API. So
> > > we made some changes to the API, such as removing Jackson from the
> > > public API and aggressively migrating from Joda Time to Java JSR-310.
> > > This caused a lot of issues because Avro is deeply nested in a lot of
> > > projects. For example, it is a huge task to update Avro in Hive or
> > > Hadoop. Therefore we believe that backward compatibility is very
> > > important.
> > >
> > > And I agree that we should mainly focus on the Avro spec itself, and not
> > > too much on File I/O and Network etc :) However, if we decide to break an
> > > API, we should do it for a good reason.
> > >
> > > Cheers, Fokko
> > >
> > > On Wed, Apr 22, 2020 at 16:09, Andy Le <anhl...@gmail.com> wrote:
> > >
> > > > Hi guys,
> > > >
> > > > I'm new to this vibrant open source community. My story with Avro can
> > > > be found here [1].
> > > >
> > > > While implementing the feature, I got stuck and had various
> > > > discussions with Doug Cutting, Fokko Driesprong.... You may see them
> > > > here [2].
> > > >
> > > > Here are my (biased) observations about our current Avro 1.9.x:
> > > >
> > > > - Some improvements can't be made due to fear of backward
> > > > incompatibilities. For example: specifications about named Union.
> > > >
> > > > - If `Apache Avro™ is a data serialization system.` then the
> > > > repository `apache/avro` should solely focus on (de)serialization,
> > > > right? Currently our repository contains many
> > > > nice-to-have-but-not-critical things like: File I/O, Network I/O....
> > > >
> > > > IMHO, I think:
> > > >
> > > > - We should publicly gather RFCs for Avro 2.x
> > > >
> > > > - We should move such nice things out of Avro 2.x (maybe to other
> > > > dedicated repositories)
> > > >
> > > > What do you think about my suggestions? Pls kindly let me know.
> > > >
> > > > Thank you & be strong.
> > > >
> > > > [1] My fork: https://github.com/anhldbk/avro-fork#why-this-fork
> > > > [2] My opened issue:
> > > > https://issues.apache.org/jira/browse/AVRO-2808?jql=reporter%3Danhldbk%20AND%20resolution%20is%20EMPTY
> > > >
> > > >
> > > >
>
