Hi,

It's true that the spec doesn't describe how aliases would target the null namespace. But on the other hand, I would not expect this to be allowed at all:

- In Java, it's an explicit compiler error to import from the unnamed package
- In Python, import statements must be able to address whatever you're importing: importing from something unnamed is not possible
- Other languages I've seen also don't support unnamed namespaces
- Scala is an exception, in that imports are always relative, and that caused it to support the _root_ package

Given how the spec describes full names as "a dot-separated sequence of [simple] names", I'd say addressing the null namespace is not supported.

I'm not against such a feature, but we should explicitly document how aliases (and full names) could contain the null namespace.

Kind regards,
Oscar

On Sat, 7 May 2022 at 16:16, Radai Rosenblatt (Jira) <[email protected]> wrote:

>
> [ https://issues.apache.org/jira/browse/AVRO-3512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
>
> Radai Rosenblatt updated AVRO-3512:
> -----------------------------------
>     Description:
> the avro spec allows for the "null namespace" (when no namespace is
> specified anywhere). it also has
> [the following|https://avro.apache.org/docs/current/spec.html#Aliases]
> to say about aliases:
> {quote}if a type named "a.b" has aliases of "c" and "x.y", then the fully
> qualified names of its aliases are "a.c" and "x.y"
> {quote}
> which means a "simple" alias ("c" above) inherits any namespace defined on
> the declaring type.
>
> now suppose i was to use aliases on a namespaced schema to be able to read
> data written using a schema that is in the null namespace (has no
> namespace).
>
> here are my writer schema:
> {code:json}
> {
>   "type": "record",
>   "name": "AncientSchema",
>   "fields": [
>     {
>       "name" : "enumField",
>       "type" : {
>         "type" : "enum",
>         "name" : "AncientEnum",
>         "symbols" : [ "THE", "SPEC", "IS", "A", "LIE" ]
>       }
>     }
>   ]
> }
> {code}
> and reader schema:
> {code:json}
> {
>   "type": "record",
>   "namespace": "much.namespace",
>   "name": "ModernRecord",
>   "fields": [
>     {
>       "name" : "enumField",
>       "type" : {
>         "type" : "enum",
>         "name" : "ModernEnum",
>         "symbols" : [ "THE", "SPEC", "IS", "A", "LIE" ],
>         "aliases": [
>           ".AncientEnum"
>         ]
>       }
>     }
>   ],
>   "aliases": [
>     ".AncientSchema"
>   ]
> }
> {code}
> notice the dots used in the aliases. as far as i understand the spec this
> should be the only legal way to do this. and it does indeed work .... to a
> point.
>
> when testing this i found multiple issues with avro's handling of such
> aliases, dating back to late avro 1.7.*
>
> # without these aliases, decoding does fail, but it fails over the nested
> enum, whereas it should have failed "immediately" on the fullname mismatch
> on the top level record schema. in fact, on further testing i think avro
> (at least in java) doesnt bother comparing the fullnames on the top level
> writer vs reader schemas at all?
> # while the schema with the aliases parse()es fine, Schema.toString()
> strips out the dots from the aliases, thereby creating a "monsanto
> terminator schema" - once printed and parsed again the aliases would become
> "simple aliases" and stop working
> # the spec doesnt explicitly talk about how to use aliases to "target"
> the null namespace. if this is an intentional feature I think the spec
> should be expanded a little to cover it?
>
> i have code to reproduce all these issues in
> [https://github.com/radai-rosenblatt/avro/blob/aliasing-to-null-namespace/lang/java/avro/src/test/java/org/apache/avro/TestAliasToNullNamespace.java]
> (coded against master)
>
> i also have code to reproduce all the above against multiple older avro
> versions in
> [https://github.com/linkedin/avro-util/blob/master/helper/tests/helper-tests-allavro/src/test/java/com/linkedin/avroutil1/compatibility/AvroTypeAliasesTest.java]
>
>
> > aliases to the null namespace do not work as expected
> > -----------------------------------------------------
> >
> >                 Key: AVRO-3512
> >                 URL: https://issues.apache.org/jira/browse/AVRO-3512
> >             Project: Apache Avro
> >          Issue Type: Bug
> >          Components: java, spec
> >    Affects Versions: 1.11.0
> >            Reporter: Radai Rosenblatt
> >            Priority: Major
>
> --
> This message was sent by Atlassian Jira
> (v8.20.7#820007)

--
✉️ Oscar Westra van Holthe - Kind <[email protected]>
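
P.S. To make the alias rule quoted above concrete, here is a minimal Java sketch of how I read it. This is not Avro code: the class AliasNames and its qualify method are hypothetical names of mine, and treating a leading dot as "the null namespace" is an assumption taken from the report, not something the spec states.

```java
final class AliasNames {

    /**
     * Returns the full name of an alias declared on a type that lives in
     * declaringNamespace (null or "" meaning the null namespace).
     */
    static String qualify(String alias, String declaringNamespace) {
        if (alias.startsWith(".")) {
            // ASSUMPTION (from the report, not the spec): a leading dot
            // means "this name lives in the null namespace".
            return alias.substring(1);
        }
        if (alias.indexOf('.') >= 0) {
            // Already a dotted full name: the declaring namespace is ignored.
            return alias;
        }
        if (declaringNamespace == null || declaringNamespace.isEmpty()) {
            // Simple alias on a type without a namespace.
            return alias;
        }
        // Simple alias: inherits the namespace of the declaring type.
        return declaringNamespace + "." + alias;
    }

    public static void main(String[] args) {
        // The spec's own example: type "a.b" with aliases "c" and "x.y".
        System.out.println(qualify("c", "a"));   // a.c
        System.out.println(qualify("x.y", "a")); // x.y

        // The reader schema from the report, with the leading dot ...
        System.out.println(qualify(".AncientEnum", "much.namespace"));
        // prints AncientEnum

        // ... and the same alias once the dot has been stripped.
        System.out.println(qualify("AncientEnum", "much.namespace"));
        // prints much.namespace.AncientEnum
    }
}
```

The last two calls show why dropping the leading dot on output matters under this reading: the same alias then resolves to a different full name, so it no longer matches the writer's type.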
