Regarding Google's advice about shading: don't go to a "one version rule" monorepo for advice about solving diamond dependencies in the wild.
It is a useful description of the pitfalls. We (and Flink before us, and likely many more) are already doing something that avoids many of them or makes them less likely. Building a separate vendored library is simpler and more robust than the "shade during build" approach.

Ismaël's point #2 is important: we can't shade or vendor Avro if we intend to use it with user-generated code. Generated code with external dependencies requires coordination with the vendoring, as we do for portability + gRPC. The ProtoCoder uses non-vendored proto for this reason.

The one totally internal use of Avro I am aware of is BigQueryIO. This could perhaps use a vendored Avro. But OTOH it is already in an isolated module so it is less severe.

And as many have pointed out, upgrading a dep across a breaking change is a breaking change. "Stop depending on Avro" is a breaking change as well. So if we are going to do that, moving it out of core is a more valuable breaking change.

But perhaps highlighting Gleb's comment: we can build a separate library/artifact for providing an AvroCoder that uses Avro 1.9.x (and potentially make a separate one for 1.8.x and encourage users to use that). We might be able to make Avro 1.8.x optional for the core SDK, finding a way for a user to pin to 1.9 as long as they don't touch the parts of the SDK that use 1.8.

Kenn

On Thu, Jan 16, 2020 at 1:49 PM Aaron Dixon <atdi...@gmail.com> wrote:

> Looks like there's some strategy to get to the right solution here and
> that it may likely involve breaking compatibility.
>
> One option for myself would be to strip the Beam JAR of AvroCoder and
> combine with the old AvroCoder from Beam 2.16 -- this would allow me to
> upgrade Beam but of course is rather hacky.
>
> On second thought, was the breaking change from Beam 2.16->2.17 really
> necessary? If not, could AvroCoder be restored to a 1.9.x "compatible"
> implementation and kept this way for the Beam 2.1x version lineage?
>
> This seems like a somewhat fair ask given the way that I'm suddenly
> blocked --- however I do realize this is somewhat of a technicality; ie,
> Beam 2.16-'s compatibility with my usage of Avro 1.9.x was incidental.
>
> But, still, if the changes to AvroCoder weren't necessary, restoring back
> would unblock me and anyone else using Avro 1.9.x (surely I'm not the only
> one!?)
>
>
> On Thu, Jan 16, 2020 at 12:22 PM Elliotte Rusty Harold <elh...@ibiblio.org> wrote:
>
>> Avro does not follow semver. They update the major version when the
>> serialization format changes and the minor version when the API
>> changes in a backwards incompatible way. See
>> https://issues.apache.org/jira/browse/AVRO-2687
>>
>> On Thu, Jan 16, 2020 at 12:50 PM Luke Cwik <lc...@google.com> wrote:
>> >
>> > Does avro not follow semantic versioning and upgrading to 1.9 should
>> > have been backwards compatible or does our usage reach into the
>> > internals of avro?
>> >
>> > On Thu, Jan 16, 2020 at 6:16 AM Ismaël Mejía <ieme...@gmail.com> wrote:
>> >>
>> >> I forgot to explain why the most obvious path (just upgrade Avro to
>> >> version 1.9.x) is not a valid long term solution. Other systems Beam
>> >> runs on top of (e.g. Spark!) also leak Avro into their core so in the
>> >> moment Spark moves up to Avro 1.9.x Spark runner users will be in a
>> >> really fragile position where things will work until they don't
>> >> (similar to Aaron's case) so a stronger reason to get Avro out of
>> >> Beam core.
>> >>
>> >>
>> >> On Thu, Jan 16, 2020 at 1:59 PM Elliotte Rusty Harold <elh...@ibiblio.org> wrote:
>> >>>
>> >>> Shading should be a last resort:
>> >>>
>> >>> https://jlbp.dev/JLBP-18.html
>> >>>
>> >>> It tends to cause more problems than it solves. At best it's a stopgap
>> >>> measure when you don't have the resources to fix the real problem. In
>> >>> this case it sounds like the real issue is that AVRO is not stable.
>> >>> There are at least three other solutions in a case like this:
>> >>>
>> >>> 1. Fix Avro at the root.
>> >>> 2. Fork Avro and then fix it.
>> >>> 3. Stop depending on Avro.
>> >>>
>> >>> None of these are trivial which is why shading gets considered.
>> >>> However shading doesn't fix the underlying problems, and ultimately
>> >>> makes a product as unreliable as its least reliable dependency. :-(
>> >>>
>> >>> On Thu, Jan 16, 2020 at 2:01 AM jincheng sun <sunjincheng...@gmail.com> wrote:
>> >>> >
>> >>> > I found that there are several dependencies shaded and planned to
>> >>> > be made into vendored artifacts in [1]. I'm not sure why Avro was
>> >>> > not shaded before. From my point of view, it's a good idea to shade
>> >>> > Avro and make it a vendored artifact if there are no special
>> >>> > reasons blocking us from doing that. Regarding how to create a
>> >>> > vendored artifact, you can refer to [2] for more details.
>> >>> >
>> >>> > Best,
>> >>> > Jincheng
>> >>> >
>> >>> > [1] https://issues.apache.org/jira/browse/BEAM-5819
>> >>> > [2] https://github.com/apache/beam/blob/master/vendor/README.md
>> >>> >
>> >>> >
>> >>> > Tomo Suzuki <suzt...@google.com> wrote on Thu, Jan 16, 2020 at 1:18 PM:
>> >>> >>
>> >>> >> I've been upgrading dependencies around gRPC. This Avro-problem is
>> >>> >> interesting to me.
>> >>> >> I'll study BEAM-8388 more tomorrow.
>> >>> >>
>> >>> >> On Wed, Jan 15, 2020 at 10:51 PM Luke Cwik <lc...@google.com> wrote:
>> >>> >> >
>> >>> >> > +Tomo Suzuki +jincheng sun
>> >>> >> > There have been a few contributors upgrading the dependencies
>> >>> >> > and validating things not breaking by running the majority of
>> >>> >> > the post commit integration tests and also using the linkage
>> >>> >> > checker to show that we aren't worse off with respect to our
>> >>> >> > dependency tree. Reaching out to them to help you is your best
>> >>> >> > bet of getting these upgrades through.
>> >>> >> >
>> >>> >> > On Wed, Jan 15, 2020 at 6:52 PM Aaron Dixon <atdi...@gmail.com> wrote:
>> >>> >> >>
>> >>> >> >> I meant to mention that we must use Avro 1.9.x as we rely on
>> >>> >> >> some schema resolution fixes not present in 1.8.x - so am
>> >>> >> >> indeed blocked.
>> >>> >> >>
>> >>> >> >> On Wed, Jan 15, 2020 at 8:50 PM Aaron Dixon <atdi...@gmail.com> wrote:
>> >>> >> >>>
>> >>> >> >>> It looks like Avro version dependency from Beam has come up in
>> >>> >> >>> the past [1, 2].
>> >>> >> >>>
>> >>> >> >>> I'm currently on Beam 2.16.0, which has been compatible with
>> >>> >> >>> my usage of Avro 1.9.x.
>> >>> >> >>>
>> >>> >> >>> But upgrading to Beam 2.17.0 is not possible for us now that
>> >>> >> >>> 2.17.0 has some dependencies on Avro classes only available in
>> >>> >> >>> 1.8.x.
>> >>> >> >>>
>> >>> >> >>> Wondering if anyone else is similarly blocked and what it would
>> >>> >> >>> take to prioritize Beam upgrading to 1.9.x or better using a
>> >>> >> >>> shaded version so that clients can use their own Avro version
>> >>> >> >>> for their own coding purposes. (Eg, I parse Avro messages from
>> >>> >> >>> a KafkaIO source and need 1.9.x for this but am perfectly
>> >>> >> >>> happy if Beam's Avro coding facilities used a shaded other
>> >>> >> >>> version.)
>> >>> >> >>>
>> >>> >> >>> I've made a comment on BEAM-8388 [1] to this effect. But
>> >>> >> >>> polling community for discussion.
>> >>> >> >>>
>> >>> >> >>> [1] https://issues.apache.org/jira/browse/BEAM-8388
>> >>> >> >>> [2] https://github.com/apache/beam/pull/9779
>> >>> >> >>>
>> >>> >>
>> >>> >> --
>> >>> >> Regards,
>> >>> >> Tomo
>> >>>
>> >>> --
>> >>> Elliotte Rusty Harold
>> >>> elh...@ibiblio.org
>>
>> --
>> Elliotte Rusty Harold
>> elh...@ibiblio.org
>