[
https://issues.apache.org/jira/browse/AVRO-3512?focusedWorklogId=770683&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-770683
]
ASF GitHub Bot logged work on AVRO-3512:
----------------------------------------
Author: ASF GitHub Bot
Created on: 16/May/22 07:14
Start Date: 16/May/22 07:14
Worklog Time Spent: 10m
Work Description: martin-g commented on PR #1685:
URL: https://github.com/apache/avro/pull/1685#issuecomment-1127311267
I will extract the "Alias is a Name" change into a separate issue/PR. It is a
bigger change that is not really related to this issue/PR.
Issue Time Tracking
-------------------
Worklog Id: (was: 770683)
Time Spent: 50m (was: 40m)
> aliases to the null namespace do not work as expected
> -----------------------------------------------------
>
> Key: AVRO-3512
> URL: https://issues.apache.org/jira/browse/AVRO-3512
> Project: Apache Avro
> Issue Type: Bug
> Components: java, spec
> Affects Versions: 1.11.0
> Reporter: Radai Rosenblatt
> Priority: Major
> Labels: pull-request-available
> Attachments: AVRO-3512.patch
>
> Time Spent: 50m
> Remaining Estimate: 0h
>
> the avro spec allows for the "null namespace" (when no namespace is specified
> anywhere). it also has [the
> following|https://avro.apache.org/docs/current/spec.html#Aliases] to say
> about aliases:
> {quote}if a type named "a.b" has aliases of "c" and "x.y", then the fully
> qualified names of its aliases are "a.c" and "x.y"
> {quote}
> which means a "simple" alias ("c" above) inherits any namespace defined on
> the declaring type.
>
> now suppose i was to use aliases on a namespaced schema to be able to read
> data written using a schema that is in the null namespace (has no namespace).
> here is my writer schema:
> {code:json}
> {
>   "type": "record",
>   "name": "AncientSchema",
>   "fields": [
>     {
>       "name" : "enumField",
>       "type" : {
>         "type" : "enum",
>         "name" : "AncientEnum",
>         "symbols" : [ "THE", "SPEC", "IS", "A", "LIE" ]
>       }
>     }
>   ]
> }
> {code}
> and reader schema:
> {code:json}
> {
>   "type": "record",
>   "namespace": "much.namespace",
>   "name": "ModernRecord",
>   "fields": [
>     {
>       "name" : "enumField",
>       "type" : {
>         "type" : "enum",
>         "name" : "ModernEnum",
>         "symbols" : [ "THE", "SPEC", "IS", "A", "LIE" ],
>         "aliases": [
>           ".AncientEnum"
>         ]
>       }
>     }
>   ],
>   "aliases": [
>     ".AncientSchema"
>   ]
> }
> {code}
> notice the dots used in the aliases. as far as i understand the spec this
> should be the only legal way to do this. and it does indeed work .... to a
> point.
>
> when testing this i found multiple issues with avro's handling of such
> aliases, dating back to late avro 1.7.*
>
> # without these aliases, decoding does fail, but it fails on the nested
> enum, whereas it should have failed "immediately" on the fullname mismatch on
> the top level record schema. in fact, on further testing i think avro (at
> least in java) doesn't bother comparing the fullnames of the top level writer
> vs reader schemas at all?
> # while the schema with the aliases parse()es fine, Schema.toString() strips
> out the dots from the aliases, thereby creating a "monsanto terminator
> schema" - once printed and parsed again the aliases would become "simple
> aliases" and stop working
> # the spec doesn't explicitly talk about how to use aliases to "target" the
> null namespace. if this is an intentional feature I think the spec should be
> expanded a little to cover it?
>
> i have code to reproduce all these issues in
> [https://github.com/radai-rosenblatt/avro/blob/aliasing-to-null-namespace/lang/java/avro/src/test/java/org/apache/avro/TestAliasToNullNamespace.java]
> (coded against master)
>
> i also have code to reproduce all the above against multiple older avro
> versions in
> [https://github.com/linkedin/avro-util/blob/master/helper/tests/helper-tests-allavro/src/test/java/com/linkedin/avroutil1/compatibility/AvroTypeAliasesTest.java]
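
The alias naming rule quoted in the issue, and the round-trip problem described in item 2, can be illustrated with a small standalone sketch. This is not Avro's implementation: the class AliasNames and both methods are hypothetical names made up for illustration, and treating a leading dot as "null namespace" follows the reporter's reading of the spec rather than anything the spec states explicitly. printWithoutLeadingDot only mimics the dot-stripping behaviour the reporter attributes to Schema.toString().

```java
import java.util.Objects;

// Hypothetical helper, not part of Avro: models the alias naming rule from the issue.
class AliasNames {

    // Resolves an alias to a fullname relative to the declaring type's namespace.
    // A null or empty declaringNamespace stands for the null namespace.
    static String resolveAlias(String alias, String declaringNamespace) {
        Objects.requireNonNull(alias, "alias");
        int lastDot = alias.lastIndexOf('.');
        if (lastDot < 0) {
            // Simple alias: inherits the namespace of the declaring type.
            boolean nullNamespace = declaringNamespace == null || declaringNamespace.isEmpty();
            return nullNamespace ? alias : declaringNamespace + "." + alias;
        }
        // Alias containing a dot: treated as a fullname on its own.
        // Assumption: an empty namespace part (leading dot) means the null namespace.
        String namespace = alias.substring(0, lastDot);
        String simpleName = alias.substring(lastDot + 1);
        return namespace.isEmpty() ? simpleName : namespace + "." + simpleName;
    }

    // Stand-in for a serializer that drops the leading dot when printing an alias.
    static String printWithoutLeadingDot(String alias) {
        return alias.startsWith(".") ? alias.substring(1) : alias;
    }

    public static void main(String[] args) {
        String ns = "much.namespace";

        // The example from the spec quote: type "a.b" with aliases "c" and "x.y".
        System.out.println(resolveAlias("c", "a"));
        System.out.println(resolveAlias("x.y", "a"));

        // The reader schema's alias targets the null namespace.
        System.out.println(resolveAlias(".AncientEnum", ns));

        // After a print/parse round trip that strips the dot, the alias
        // resolves inside the reader's namespace and no longer matches the writer.
        String printed = printWithoutLeadingDot(".AncientEnum");
        System.out.println(resolveAlias(printed, ns));
    }
}
```

Under these assumptions ".AncientEnum" resolves to the writer's fullname "AncientEnum", while the same alias printed without its dot resolves to "much.namespace.AncientEnum", which is why a schema that has been printed and parsed again would stop matching the writer schema.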