+1 for the suggestion of Ismaël. A sample implementation can be found here: https://github.com/flavray/avro-rs. Gonna have a detailed look at it :D
*One more radical idea I would like to try is to unify the implementations a bit, probably having a robust low-level one in one systems language (C or Rust) and bindings for all the languages that rely on*
*it, but this is probably more because of my frustration at seeing projects that take this approach slowly becoming the standard and*
*Apache Avro relegated (this is already happening on the Python front).*

*--------------------------------------------------------------------------------------------------*
*Anh Le (Andy)*
*Data Engineer that loves big systems, history & entrepreneurship*
*Email* : anhl...@gmail.com
*Skype* : anh_steiner
*Blog* : https://bigsonata.com/

On Tue, Apr 28, 2020 at 3:01 PM Ismaël Mejía <ieme...@gmail.com> wrote:

> Huge +1 to recover the Avro Enhancement Proposals (AEP)
>
> The experimental features Ryan mentioned definitely merit(ed) being
> part of it, and in particular the procedure to decide when they will
> become ‘stable’ or default, for example for fastread. Also other
> proposals/discussions like the split release or semantic versioning
> should be part of it.
>
> About Avro 2.0.0, I think breaking binary compatibility of the format
> is going to prove to be a hard sell (are named unions valuable enough
> to break backwards compatibility?). If we can extend the binary format
> in a compatible way there is no reason to have 2.0.0, so I agree that
> there is a delicate balance we should strike, because strict stability
> could also leave us ostracized.
>
> What I personally would like is to make Avro as lean and efficient as
> possible and focus mostly on the binary format part and tools, probably
> removing the less used parts (IPC/RPC/trevni), so it is good to see
> that other people are starting to agree on that.
>
> One more radical idea I would like to try is to unify the
> implementations a bit, probably having a robust low-level one in one
> systems language (C or Rust) and bindings for all the languages that
> rely on it, but this is probably more because of my frustration at
> seeing projects that take this approach slowly becoming the standard
> and Apache Avro relegated (this is already happening on the Python front).
>
> In general the critical issue with Avro is the downstream
> consequences of our actions, and of course we will always have
> incomplete information, but we can investigate and see if changes are
> worth it.
>
> Regards,
> Ismaël
>
> On Mon, Apr 27, 2020 at 6:51 PM Ryan Skraba <r...@skraba.com> wrote:
> >
> > Hello!
> >
> > You bring up some good points -- I'm glad Avro is so widely used, but
> > it does make me nervous to see any changes that might break other
> > projects, or change any behaviour.
> >
> > Currently, we've talked about managing developer expectations with
> > semantic versioning (especially with the necessary Jackson API cleanup
> > that happened in 1.9.x), or versioning artifacts separately.
> >
> > We also have a couple of experimental/feature flags for some behaviour
> > changes:
> > https://cwiki.apache.org/confluence/display/AVRO/Experimental+features+in+Avro
> >
> > And there is already a page for Avro Enhancement Proposals that looks
> > largely out of date:
> > https://cwiki.apache.org/confluence/display/AVRO/Avro+Enhancement+Proposals
> >
> > Moving some of the extras to a separate repo brings many of the same
> > problems as versioning artifacts separately (nobody wants to deal with
> > a compatibility matrix). I'm definitely not against it, but I'm not
> > sure how it would improve the situation.
> >
> > There's a fine line between being extremely stable and being
> > paralyzed! I would be enthusiastic about any process changes that
> > would help us encourage and adopt new features (and fixes) more
> > quickly.
> >
> > All my best, Ryan
> >
> >
> > On Sun, Apr 26, 2020 at 11:18 AM Driesprong, Fokko <fo...@driesprong.frl> wrote:
> > >
> > > Hi Andy,
> > >
> > > Thanks for reaching out. Sorry for not being so active in the community
> > > lately.
> > >
> > > Since Avro 1.8.2 there has been some activity on the repository again,
> > > fixing stuff like security issues and migrating to later versions of Java.
> > > Avro has been around for 10 years now, and I would like to keep (some)
> > > backward compatibility to make sure that people are still going to use it
> > > for another 10 years :) In the past, the idea was to keep the format
> > > backward compatible, but this excludes the Java API. So we made some
> > > changes to the API, such as removing Jackson from the public API and
> > > aggressively migrating from Joda Time to Java JSR-310. This caused a lot of
> > > issues because Avro is deeply nested in a lot of projects. For example, it
> > > is a huge task to update Avro in Hive or Hadoop. Therefore we believe that
> > > backward compatibility is very important.
> > >
> > > And I agree that we should mainly focus on the Avro spec itself, and not
> > > too much on File I/O and Network etc :) However, if we decide to break an
> > > API, we should do it for a good reason.
> > >
> > > Cheers, Fokko
> > >
> > > On Wed, Apr 22, 2020 at 4:09 PM Andy Le <anhl...@gmail.com> wrote:
> > >
> > > > Hi guys,
> > > >
> > > > I'm new to this vibrant open source community. My story with Avro can be
> > > > found here [1]
> > > >
> > > > While implementing the feature, I got stuck and had various discussions
> > > > with Doug Cutting, Fokko Driesprong.... You may see here [2]
> > > >
> > > > Here are my (biased) observations about our current Avro 1.9.x:
> > > >
> > > > - Some improvements can't be made due to fear of backward
> > > > incompatibilities. For example: specifications about named Union.
> > > >
> > > > - If `Apache Avro™ is a data serialization system.` then the repository
> > > > `apache/avro` should solely focus on (de)serialization, right? Currently
> > > > our repository contains many nice-to-have-but-not-critical things like:
> > > > File I/O, Network I/O....
> > > >
> > > > IMHO, I think:
> > > >
> > > > - We should publicly gather RFCs for Avro 2.x
> > > >
> > > > - We should move such nice things out of Avro 2.x (maybe to other
> > > > dedicated repositories)
> > > >
> > > > What do you think about my suggestions? Pls kindly let me know.
> > > >
> > > > Thank you & be strong.
> > > >
> > > > [1] My fork: https://github.com/anhldbk/avro-fork#why-this-fork
> > > > [2] My opened issue:
> > > > https://issues.apache.org/jira/browse/AVRO-2808?jql=reporter%3Danhldbk%20AND%20resolution%20is%20EMPTY