[ https://issues.apache.org/jira/browse/AVRO-2647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17093220#comment-17093220 ]
Thorsten Hake commented on AVRO-2647:
-------------------------------------
I see your point. The given schema does seem to be ambiguous; sorry for the
confusion.
However, the problem also exists when the schema is not ambiguous. Consider the
following example:
{code:json}
{
  "type": "record",
  "name": "R",
  "fields": [
    {
      "name": "F",
      "type": {
        "type": "array",
        "items": ["null", "string"]
      },
      "default": [null, "A"]
    }
  ]
}
{code}
The current Java implementation rejects this default value as incompatible
with the item type of the array.
Should this be tracked in a separate issue?
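To make the expectation concrete, here is a minimal Python sketch of one possible reading: a default fits a union if it fits at least one branch, checked element by element inside arrays. This is an assumption for illustration, not the Java implementation and not wording from the specification. The helper names `matches` and `check_record_defaults` are hypothetical, and only the `null`, `string`, `enum`, `array` and union cases are handled.

```python
import json


def matches(schema, value):
    """Return True if the JSON default `value` fits `schema` (tiny Avro subset)."""
    if isinstance(schema, list):
        # Union: assumed semantics, any branch may accept the value.
        return any(matches(branch, value) for branch in schema)
    if schema == "null":
        return value is None
    if schema == "string":
        return isinstance(value, str)
    if isinstance(schema, dict):
        if schema["type"] == "array":
            # Every element must fit the item schema on its own.
            return isinstance(value, list) and all(
                matches(schema["items"], item) for item in value
            )
        if schema["type"] == "enum":
            return isinstance(value, str) and value in schema["symbols"]
    # Anything else is outside the scope of this sketch.
    return False


def check_record_defaults(record):
    """Map each field that declares a default to whether the default fits its type."""
    return {
        field["name"]: matches(field["type"], field["default"])
        for field in record["fields"]
        if "default" in field
    }


SCHEMA = json.loads("""
{
  "type": "record",
  "name": "R",
  "fields": [
    {
      "name": "F",
      "type": {"type": "array", "items": ["null", "string"]},
      "default": [null, "A"]
    }
  ]
}
""")

if __name__ == "__main__":
    print(check_record_defaults(SCHEMA))
```

Under this reading the default for field `F` is accepted. The sketch only answers whether some branch fits; it does not decide which branch an ambiguous value should resolve to, which is the open question for the enum example in the issue description.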
> specification does not fully specify semantics for unions in default types
> --------------------------------------------------------------------------
>
> Key: AVRO-2647
> URL: https://issues.apache.org/jira/browse/AVRO-2647
> Project: Apache Avro
> Issue Type: Bug
> Components: doc
> Reporter: Roger
> Priority: Minor
>
> Currently the specification does not make the semantics of union types
> within complex types clear. In particular, the spec talks about union fields,
> but leaves the semantics for unions in other contexts unspecified.
> Here's an example which is undefined according to the current specification:
> {code:json}
> {
>   "type": "record",
>   "name": "R",
>   "fields": [
>     {
>       "name": "F",
>       "type": {
>         "type": "array",
>         "items": [
>           {
>             "type": "enum",
>             "name": "E1",
>             "symbols": ["A", "B"]
>           },
>           {
>             "type": "enum",
>             "name": "E2",
>             "symbols": ["B", "A", "C"]
>           }
>         ]
>       },
>       "default": ["A", "B", "C"]
>     }
>   ]
> }
> {code}
> By experiment, most implementations seem to have chosen the semantics that
> are documented in this PR.
> In Java, the schema above is parsed without error, but when attempting to use
> the default value, it fails with a NullPointerException (trying to find the
> symbol C in E1). (Thanks to Ryan Skraba for this.)
> In [gogen-avro|https://github.com/actgardner/gogen-avro] the schema produces
> invalid generated code, because the generator assumes E1 but emits the symbol
> for "C" anyway.
> FWIW at some point in the future, I believe that it would be nice to align
> the default value specification with the JSON encoding for Avro so there
> aren't two subtly different JSON encodings of an Avro value.