In addition to that point about schema inference, the correct thing to
do with inference would be to do the type,null ordering and not
null,type.

On Thu, Apr 7, 2022 at 12:51 PM Matt Burgess <[email protected]> wrote:
>
> I agree that moving forward with the upgrade is the right thing to do,
> the example Mike pointed out is invalid according to the 1.8.1 spec
> [1] even though it was mistakenly allowed by Avro itself. Since we
> don't infer default values I think we can correctly require that
> hand-written schemas conform to the spec, especially with respect to
> default values, and our inferred schemas will adhere to the spec.
>
> I also agree we should mention this in the Migration Guidance doc as
> it is a requirement that is now being enforced when it wasn't before.
>
> There might be an issue with Hive 1 but I don't think we're on the
> hook for that, if Hive has issues with nullable types and/or default
> values of null, it should obey the spec the same as we will.
>
> Regards,
> Matt
>
> [1] https://avro.apache.org/docs/1.8.1/spec.html#Unions
>
> On Thu, Apr 7, 2022 at 9:20 AM David Handermann
> <[email protected]> wrote:
> >
> > Mike,
> >
> > Thanks for raising this issue for discussion, and thanks Isha and Edward
> > for the comments. It is helpful to get an understanding of the potential
> > impact.
> >
> > In general, it seems like moving forward with the upgrade is the best idea,
> > given the issues with older versions. For integration with other products
> > outside of NiFi, using the latest version of Avro should provide better
> > compatibility.
> >
> > That being said, it is important to consider the impact on existing
> > deployments. At minimum, this change is something that should be described
> > in the NiFi Migration Guidance associated with the released version. As far
> > as supporting different versions, it is possible to run older versions of
> > the various components, although this is more complicated given the scope
> > of the upgrade. This seems like a case where providing more detailed
> > migration steps would be helpful, to include issues related to cached
> > schemas or inferred schemas used as the basis for customization.
> >
> > Regards,
> > David Handermann
> >
> > On Thu, Apr 7, 2022 at 6:08 AM Edward Armes <[email protected]> wrote:
> >
> > > I've had a quick look in JIRA and it looks like this might have happened 
> > > as
> > > a side effect of AVRO-1544.
> > >
> > > I think it is worth upgrading especially given that it looks like few of
> > > the changes refer to updating against newer bundled dependencies some of
> > > which seem to  have CVE's against them.
> > >
> > > Depending on peoples feelings would it be wroth creating 2 versions one
> > > using Avro 1.8 and one using Avro 1.11.0 and then removing the 1.8 version
> > > in a later release?
> > >
> > > On an additional note in cases where people are writing schemas manually I
> > > suspect they are probably going to be validating against the later 
> > > versions
> > > of Avro using the projects tooling and that may create issues further down
> > > the line.
> > >
> > > Edward
> > >
> > > On Thu, Apr 7, 2022 at 11:52 AM Isha Lamboo <
> > > [email protected]>
> > > wrote:
> > >
> > > > Hi Mike,
> > > >
> > > > The "Infer schema" functionality in NiFi currently generates schemas 
> > > > with
> > > > the order that will be invalid under Avro 1.9+. I noticed because I've
> > > been
> > > > using that to copy-paste schemas that were "almost right" so I could
> > > > manually fix them.
> > > >
> > > > I guess that inferred schemas should be fine if the inferring logic is
> > > > also upgraded by the dependency, but for any cached schemas and my own
> > > > manually saved schemas this will be somewhat painful.
> > > > My use case for manually saving inferred schemas is mostly database
> > > > migration where some inferred "choice" fields are not supported for the
> > > > target database.
> > > >
> > > > Hope this helps,
> > > >
> > > > Isha
> > > >
> > > > -----Oorspronkelijk bericht-----
> > > > Van: Mike Thomsen <[email protected]>
> > > > Verzonden: donderdag 7 april 2022 12:11
> > > > Aan: [email protected]
> > > > Onderwerp: Need some feedback on how upgrading Avro might cause problems
> > > >
> > > > Thread is here for full details:
> > > >
> > > >
> > > https://eur03.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fapache%2Fnifi%2Fpull%2F5900%23pullrequestreview-922490039&amp;data=04%7C01%7Cisha.lamboo%40virtualsciences.nl%7C8c89ae3e621c4a255c3308da187eea09%7C21429da9e4ad45f99a6fcd126a64274b%7C0%7C0%7C637849231546931433%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000&amp;sdata=ztcxOWXBkpZFqEh%2Bu%2BG0du5BLUPyZ3WaMxqpeqn%2FexI%3D&amp;reserved=0
> > > >
> > > > It looks like Avro 1.8's schema parser may have been more lenient (or
> > > > buggy) in enforcing the specification with respect to the ordering of a
> > > > union for a nullable type. 1.9.X and higher are definitely more
> > > opinionated
> > > > and throw exceptions on schemas that used to work on 1.8.X. For example,
> > > > this used to be valid:
> > > >
> > > > {
> > > >   "name": "message",
> > > >   "type": [ "null", "string" ],
> > > >   "default": "Hello, world"
> > > > }
> > > >
> > > > Now Avro **correctly** throws an exception per the specification.
> > > > Under 1.8 it did not, and as you can see here, I had to change numerous
> > > > test schemas in order to make them work under 1.9 to 1.11.0:
> > > >
> > > >
> > > >
> > > https://eur03.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fapache%2Fnifi%2Fpull%2F5900%2Ffiles%23r835954170&amp;data=04%7C01%7Cisha.lamboo%40virtualsciences.nl%7C8c89ae3e621c4a255c3308da187eea09%7C21429da9e4ad45f99a6fcd126a64274b%7C0%7C0%7C637849231546931433%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000&amp;sdata=iBGW6sYZUdxvAADYIB5L2t94RBZH3A5%2BPJMhxuGv8q8%3D&amp;reserved=0
> > > >
> > > > As I said to Matt, I think we're in a "damned if you do, damned if you
> > > > don't" position here.
> > > >
> > > > Thoughts?
> > > >
> > >

Reply via email to