also - could we have this discussion on the JIRA ticket itself? it's much
easier (for me?) to find that way

On Fri, May 13, 2022 at 9:10 AM r <[email protected]> wrote:

> Hi, and sorry for the late response.
>
> aliases are not for "import" reasons. actually the avsc spec (unlike the
> more modern IDL) doesn't have any notion of imports. aliases (in my
> understanding) are about schema evolution and compatibility (for I/O
> operations, i.e. encoding/decoding, NOT for the compatibility of
> generated code).
> also, avro supports things that various implementation languages don't
> (java example: you can have avro schema fields with names that are
> reserved keywords in java, and they will be "mangled" in the generated
> java code).
>
> as for the need for this feature - it would be impossible to evolve a
> schema out of the null namespace without this, as newer versions of the
> schema (with namespace) acting as reader schemas would not be able to
> decode data written using older versions of the schema (without namespace).
>
> as such I think this should be made a feature. without it users who need
> to enable such schema evolution (like me) would have to write code to
> "doctor" the writer schema to match the reader schema ...
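A rough sketch of what such "doctoring" could look like, operating on the parsed avsc JSON rather than on any Avro API. The `doctor` function, the `renames` mapping and its format are hypothetical names made up for this illustration. The approach leans on the fact that Avro's binary encoding does not carry type names, so renaming the writer's types does not change how the bytes are read. It only rewrites inline definitions and references that appear in `renames`.

```python
NAMED_TYPES = {"record", "error", "enum", "fixed"}


def full_name(name, namespace):
    """Full name of a type: a dotted name wins, otherwise prepend the namespace."""
    if "." in name:
        return name.lstrip(".")
    return f"{namespace}.{name}" if namespace else name


def doctor(node, renames, enclosing_ns=None):
    """Return a copy of a parsed avsc node with named types renamed.

    `renames` maps old full names to new full names (hypothetical input format).
    """
    if isinstance(node, list):  # union: rewrite every branch
        return [doctor(branch, renames, enclosing_ns) for branch in node]
    if not isinstance(node, dict):  # primitive name or reference by name
        return renames.get(full_name(node, enclosing_ns), node)
    out = dict(node)
    if node.get("type") in NAMED_TYPES:
        old = full_name(node["name"], node.get("namespace", enclosing_ns))
        enclosing_ns = old.rpartition(".")[0] or None
        new_ns, _, new_name = renames.get(old, old).rpartition(".")
        # Always spell the namespace out so nothing is inherited by accident;
        # the empty string stands for the null namespace.
        out["name"], out["namespace"] = new_name, new_ns
    if "fields" in node:
        out["fields"] = [
            dict(field, type=doctor(field["type"], renames, enclosing_ns))
            for field in node["fields"]
        ]
    for key in ("items", "values"):  # array and map element types
        if key in node:
            out[key] = doctor(node[key], renames, enclosing_ns)
    return out


writer = {
    "type": "record",
    "name": "AncientSchema",
    "fields": [
        {
            "name": "enumField",
            "type": {
                "type": "enum",
                "name": "AncientEnum",
                "symbols": ["THE", "SPEC", "IS", "A", "LIE"],
            },
        }
    ],
}
doctored = doctor(
    writer,
    {
        "AncientSchema": "much.namespace.ModernRecord",
        "AncientEnum": "much.namespace.ModernEnum",
    },
)
print(doctored["namespace"], doctored["name"])
```

Every user who needs this kind of evolution would have to carry and maintain something like the above, which is the cost that working aliases would avoid.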
>
> On Fri, May 13, 2022 at 5:56 AM Oscar Westra van Holthe - Kind <
> [email protected]> wrote:
>
>> Hi,
>>
>> It's true that the spec doesn't describe how aliases would target the null
>> namespace. But on the other hand, I would not expect this to be allowed at
>> all:
>>
>>    - In Java, it's an explicit compiler error to import from the unnamed
>>    package
>>    - In Python, import statements must be able to address whatever you're
>>    importing: importing from something unnamed is not possible
>>    - Other languages I've seen also don't support unnamed namespaces
>>    - Scala is an exception, in that imports are always relative, and that
>>    caused it to support the _root_ package
>>
>> Given how the spec describes full names as "a dot-separated sequence of
>> [simple] names", I'd say addressing the null namespace is not supported.
>>
>> I'm not against such a feature though, but we should explicitly document
>> how aliases (and full names) could contain the null namespace.
>>
>>
>> Kind regards,
>> Oscar
>>
>>
>> On Sat, 7 May 2022 at 16:16, Radai Rosenblatt (Jira) <[email protected]>
>> wrote:
>>
>> >
>> > [ https://issues.apache.org/jira/browse/AVRO-3512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
>> >
>> > Radai Rosenblatt updated AVRO-3512:
>> > -----------------------------------
>> >     Description:
>> > the avro spec allows for the "null namespace" (when no namespace is
>> > specified anywhere). it also has [the following|
>> > https://avro.apache.org/docs/current/spec.html#Aliases] to say about
>> > aliases:
>> > {quote}if a type named "a.b" has aliases of "c" and "x.y", then the
>> > fully qualified names of its aliases are "a.c" and "x.y"
>> > {quote}
>> > which means a "simple" alias ("c" above) inherits any namespace defined
>> > on the declaring type.
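The rule quoted above can be written down as a small function. This is a sketch in plain Python, not Avro's own code. `alias_fullname` is a hypothetical name, and treating a leading dot as "the null namespace" is the interpretation proposed in this ticket, not something the spec states explicitly.

```python
def alias_fullname(alias, declaring_namespace):
    """Resolve an alias relative to the namespace of the type declaring it."""
    if "." in alias:
        # The alias carries its own namespace: everything before the last dot.
        namespace, _, name = alias.rpartition(".")
        # Assumption: an empty namespace part (leading dot) means the null namespace.
        return f"{namespace}.{name}" if namespace else name
    # A simple alias inherits the namespace of the declaring type, if there is one.
    return f"{declaring_namespace}.{alias}" if declaring_namespace else alias


print(alias_fullname("c", "a"))                          # a.c
print(alias_fullname("x.y", "a"))                        # x.y
print(alias_fullname(".AncientEnum", "much.namespace"))  # AncientEnum
```

The first two calls reproduce the spec's own example. The third is the case the spec is silent on.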
>> >
>> >
>> >
>> > now suppose i were to use aliases on a namespaced schema to be able to
>> > read data written using a schema that is in the null namespace (has no
>> > namespace).
>> >
>> > here is my writer schema:
>> > {code:json}
>> > {
>> >   "type": "record",
>> >   "name": "AncientSchema",
>> >   "fields": [
>> >     {
>> >       "name" : "enumField",
>> >       "type" : {
>> >         "type" : "enum",
>> >         "name" : "AncientEnum",
>> >         "symbols" : [ "THE", "SPEC", "IS", "A", "LIE" ]
>> >       }
>> >     }
>> >   ]
>> > }
>> > {code}
>> > and reader schema:
>> > {code:json}
>> > {
>> >   "type": "record",
>> >   "namespace": "much.namespace",
>> >   "name": "ModernRecord",
>> >   "fields": [
>> >     {
>> >       "name" : "enumField",
>> >       "type" : {
>> >         "type" : "enum",
>> >         "name" : "ModernEnum",
>> >         "symbols" : [ "THE", "SPEC", "IS", "A", "LIE" ],
>> >         "aliases": [
>> >            ".AncientEnum"
>> >         ]
>> >       }
>> >     }
>> >   ],
>> >   "aliases": [
>> >     ".AncientSchema"
>> >   ]
>> > }
>> > {code}
>> > notice the dots used in the aliases. as far as i understand the spec
>> > this should be the only legal way to do this. and it does indeed work
>> > .... to a point.
>> >
>> >
>> >
>> > when testing this i found multiple issues with avro's handling of such
>> > aliases, dating back to late avro 1.7.*
>> >
>> >
>> >  # without these aliases, decoding does fail, but it fails over the
>> > nested enum, whereas it should have failed "immediately" on the fullname
>> > mismatch on the top level record schema. in fact, on further testing i
>> > think avro (at least in java) doesn't bother comparing the fullnames of
>> > the top level writer vs reader schemas at all?
>> >  # while the schema with the aliases parse()es fine, Schema.toString()
>> > strips out the dots from the aliases, thereby creating a "monsanto
>> > terminator schema" - once printed and parsed again the aliases would
>> > become "simple aliases" and stop working
>> >  # the spec doesn't explicitly talk about how to use aliases to "target"
>> > the null namespace. if this is an intentional feature I think the spec
>> > should be expanded a little to cover it?
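To make the second item concrete, here is a small model of name matching between a writer and a reader type, again in plain Python and not Avro's own code. `resolve` and `reader_accepts` are hypothetical helpers, and the leading-dot convention is the assumption discussed in this ticket. The enum's namespace is written out explicitly here, whereas in the reader schema above it is inherited from the enclosing record.

```python
def resolve(name, namespace):
    """Full name of a name or alias; a leading dot is assumed to mean the null namespace."""
    if "." in name:
        return name.lstrip(".")
    return f"{namespace}.{name}" if namespace else name


def reader_accepts(writer_fullname, reader):
    """True if the writer's full name equals the reader's full name or one of its aliases."""
    namespace = reader.get("namespace")
    candidates = {resolve(reader["name"], namespace)}
    candidates.update(resolve(alias, namespace) for alias in reader.get("aliases", []))
    return writer_fullname in candidates


reader_enum = {
    "name": "ModernEnum",
    "namespace": "much.namespace",
    "aliases": [".AncientEnum"],
}
# What a print-and-reparse round trip would leave behind if the dots are stripped.
round_tripped = dict(reader_enum, aliases=[a.lstrip(".") for a in reader_enum["aliases"]])

print(reader_accepts("AncientEnum", reader_enum))    # True
print(reader_accepts("AncientEnum", round_tripped))  # False
```

After the round trip the alias resolves to "much.namespace.AncientEnum", so it no longer matches the writer's enum in the null namespace.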
>> >
>> >
>> >
>> > i have code to reproduce all these issues in
>> > [ https://github.com/radai-rosenblatt/avro/blob/aliasing-to-null-namespace/lang/java/avro/src/test/java/org/apache/avro/TestAliasToNullNamespace.java ]
>> > (coded against master)
>> >
>> >
>> >
>> > i also have code to reproduce all the above against multiple older avro
>> > versions in
>> > [ https://github.com/linkedin/avro-util/blob/master/helper/tests/helper-tests-allavro/src/test/java/com/linkedin/avroutil1/compatibility/AvroTypeAliasesTest.java ]
>> >
>> >
>> > > aliases to the null namespace do not work as expected
>> > > -----------------------------------------------------
>> > >
>> > >                 Key: AVRO-3512
>> > >                 URL: https://issues.apache.org/jira/browse/AVRO-3512
>> > >             Project: Apache Avro
>> > >          Issue Type: Bug
>> > >          Components: java, spec
>> > >    Affects Versions: 1.11.0
>> > >            Reporter: Radai Rosenblatt
>> > >            Priority: Major
>> > >
>> >
>> >
>> >
>> > --
>> > This message was sent by Atlassian Jira
>> > (v8.20.7#820007)
>> >
>>
>>
>> --
>>
>> ✉️ Oscar Westra van Holthe - Kind <[email protected]>
>>
>
