Hi, and sorry for the late response.

Aliases are not there for "import" reasons. In fact the avsc spec (unlike the more modern IDL) doesn't have any notion of imports. Aliases (in my understanding) are about schema evolution and compatibility for I/O operations (encoding/decoding), not about the compatibility of generated code. Also, Avro supports things that various implementation languages don't. A Java example: you can have Avro schema fields whose names are reserved keywords in Java, and they will be "mangled" in the generated Java code.
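To make the naming rule concrete, here is a minimal Java sketch of how I read the alias rule in the spec (an alias "c" declared on type "a.b" becomes "a.c", while "x.y" stays as is), plus the leading-dot form used in the ticket. AliasNames and resolve are hypothetical names of mine and not part of the Avro API. Treating a leading dot as "the null namespace" is my reading of the spec, not something it states explicitly.

```java
import java.util.List;

// Hypothetical helper, NOT part of the Avro API: turns an alias as written
// in a schema into the full name it is meant to match.
final class AliasNames {
    private AliasNames() {}

    /**
     * @param alias              the alias exactly as written in the schema
     * @param declaringNamespace namespace of the type declaring the alias
     *                           (null or "" means the null namespace)
     */
    static String resolve(String alias, String declaringNamespace) {
        if (alias.startsWith(".")) {
            // Assumption: a leading dot explicitly targets the null namespace.
            return alias.substring(1);
        }
        if (alias.contains(".")) {
            // Already a full name such as "x.y": use it unchanged.
            return alias;
        }
        if (declaringNamespace == null || declaringNamespace.isEmpty()) {
            // Simple alias declared in the null namespace: nothing to inherit.
            return alias;
        }
        // Simple alias: inherits the namespace of the declaring type.
        return declaringNamespace + "." + alias;
    }

    public static void main(String[] args) {
        // The example from the spec: type "a.b" with aliases "c" and "x.y".
        for (String alias : List.of("c", "x.y")) {
            System.out.println(alias + " -> " + resolve(alias, "a"));
        }
        // The aliases from the ticket's reader schema.
        System.out.println(".AncientSchema -> " + resolve(".AncientSchema", "much.namespace"));
    }
}
```

Under this reading, dropping the leading dot from ".AncientSchema" makes it resolve to "much.namespace.AncientSchema" instead of "AncientSchema", so the alias no longer matches the writer schema's name.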
As for the need for this feature: without it, it would be impossible to evolve a schema out of the null namespace, because newer versions of the schema (with a namespace) acting as reader schemas would not be able to decode data written using older versions of the schema (without a namespace). As such I think this should be made a feature. Without it, users who need such schema evolution (like me) would have to write code to "doctor" the writer schema to match the reader schema ...

On Fri, May 13, 2022 at 5:56 AM Oscar Westra van Holthe - Kind <[email protected]> wrote:

> Hi,
>
> It's true that the spec doesn't describe how aliases would target the null
> namespace. But on the other hand, I would not expect this to be allowed at
> all:
>
> - In Java, it's an explicit compiler error to import from the unnamed
>   package
> - In Python, import statements must be able to address whatever you're
>   importing: importing from something unnamed is not possible
> - Other languages I've seen also don't support unnamed namespaces
> - Scala is an exception, in that imports are always relative, and that
>   caused it to support the _root_ package
>
> Given how the spec describes full names as "a dot-separated sequence of
> [simple] names", I'd say addressing the null namespace is not supported.
>
> I'm not against such a feature though, but we should explicitly document
> how aliases (and full names) could contain the null namespace.
>
>
> Kind regards,
> Oscar
>
>
> On Sat, 7 May 2022 at 16:16, Radai Rosenblatt (Jira) <[email protected]>
> wrote:
>
> >
> > [
> > https://issues.apache.org/jira/browse/AVRO-3512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
> > ]
> >
> > Radai Rosenblatt updated AVRO-3512:
> > -----------------------------------
> > Description:
> > the avro spec allows for the "null namespace" (when no namespace is
> > specified anywhere).
> > it also has [the following|
> > https://avro.apache.org/docs/current/spec.html#Aliases] to say about
> > aliases:
> > {quote}if a type named "a.b" has aliases of "c" and "x.y", then the fully
> > qualified names of its aliases are "a.c" and "x.y"
> > {quote}
> > which means a "simple" alias ("c" above) inherits any namespace defined on
> > the declaring type.
> >
> > now suppose i was to use aliases on a namespaced schema to be able to read
> > data written using a schema that is in the null namespace (has no
> > namespace).
> >
> > here are my writer schema:
> > {code:json}
> > {
> >   "type": "record",
> >   "name": "AncientSchema",
> >   "fields": [
> >     {
> >       "name" : "enumField",
> >       "type" : {
> >         "type" : "enum",
> >         "name" : "AncientEnum",
> >         "symbols" : [ "THE", "SPEC", "IS", "A", "LIE" ]
> >       }
> >     }
> >   ]
> > }
> > {code}
> > and reader schema:
> > {code:json}
> > {
> >   "type": "record",
> >   "namespace": "much.namespace",
> >   "name": "ModernRecord",
> >   "fields": [
> >     {
> >       "name" : "enumField",
> >       "type" : {
> >         "type" : "enum",
> >         "name" : "ModernEnum",
> >         "symbols" : [ "THE", "SPEC", "IS", "A", "LIE" ],
> >         "aliases": [
> >           ".AncientEnum"
> >         ]
> >       }
> >     }
> >   ],
> >   "aliases": [
> >     ".AncientSchema"
> >   ]
> > }
> > {code}
> > notice the dots used in the aliases. as far as i understand the spec this
> > should be the only legal way to do this. and it does indeed work .... to a
> > point.
> >
> > when testing this i found multiple issues with avro's handling of such
> > aliases, dating back to late avro 1.7.*
> >
> > # without these aliases, decoding does fail, but it fails over the nested
> > enum, whereas it should have failed "immediately" on the fullname mismatch
> > on the top level record schema. in fact, on further testing i think avro
> > (at least in java) doesnt bother comparing the fullnames on the top level
> > writer vs reader schemas at all?
> > # while the schema with the aliases parse()es fine, Schema.toString()
> > strips out the dots from the aliases, thereby creating a "monsanto
> > terminator schema" - once printed and parsed again the aliases would become
> > "simple aliases" and stop working
> > # the spec doesnt explicitly talk about how to use aliases to "target"
> > the null namespace. if this is an intentional feature I think the spec
> > should be expanded a little to cover it?
> >
> > i have code to reproduce all these issues in [
> > https://github.com/radai-rosenblatt/avro/blob/aliasing-to-null-namespace/lang/java/avro/src/test/java/org/apache/avro/TestAliasToNullNamespace.java
> > ] (coded against master)
> >
> > i also have code to reproduce all the above against multiple older avro
> > versions in [
> > https://github.com/linkedin/avro-util/blob/master/helper/tests/helper-tests-allavro/src/test/java/com/linkedin/avroutil1/compatibility/AvroTypeAliasesTest.java
> > ]
> >
> > > aliases to the null namespace do not work as expected
> > > -----------------------------------------------------
> > >
> > >                 Key: AVRO-3512
> > >                 URL: https://issues.apache.org/jira/browse/AVRO-3512
> > >             Project: Apache Avro
> > >          Issue Type: Bug
> > >          Components: java, spec
> > >    Affects Versions: 1.11.0
> > >            Reporter: Radai Rosenblatt
> > >            Priority: Major
> >
> > --
> > This message was sent by Atlassian Jira
> > (v8.20.7#820007)
>
> --
> ✉️ Oscar Westra van Holthe - Kind <[email protected]>
