[ 
https://issues.apache.org/jira/browse/AVRO-2647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17093021#comment-17093021
 ] 

Thorsten Hake commented on AVRO-2647:
-------------------------------------

I actually would say that the schema is not ambiguous. The default value is 
correct given the type of the field. I would argue that the implementation (at 
least the Java implementation) does not correctly check the default type.

The following check is executed for each element of the default array (Java 
method Schema#isValidDefault):
{code:java}
for (JsonNode element : defaultValue)
  if (!isValidDefault(schema.getElementType(), element))
    return false;
return true;
{code}
At this point, one should not check whether each element is a valid default, 
but whether it is a valid value for the item type of the array. The array is a 
valid default if all of its elements are valid values of the item type. The 
elements do not have to be valid defaults for the item type; only the array as 
a whole has to be a valid default.
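
To illustrate the distinction, here is a minimal self-contained sketch. It is 
not the Avro code: the class and method names are made up for the example, and 
the item union from the schema in the issue is modelled as plain symbol sets. 
It assumes the rule the specification gives for union-typed fields, namely 
that a default has to match the first branch of the union.

```java
import java.util.List;
import java.util.Set;

// Hypothetical sketch, not part of Avro.
class ArrayDefaultSketch {
    // Stand-in for the item union [E1, E2]; each branch is its set of symbols.
    static final List<Set<String>> ITEMS = List.of(
            Set.of("A", "B"),        // E1
            Set.of("B", "A", "C"));  // E2

    // Default rule for a union (assumed): only the first branch is considered.
    static boolean isValidDefault(List<Set<String>> union, String symbol) {
        return union.get(0).contains(symbol);
    }

    // Value rule for a union: any branch may match.
    static boolean isValidValue(List<Set<String>> union, String symbol) {
        return union.stream().anyMatch(branch -> branch.contains(symbol));
    }

    // Element-wise check in the style quoted above: every element is
    // validated as if it were itself a default.
    static boolean elementsCheckedAsDefaults(List<Set<String>> union, List<String> elements) {
        return elements.stream().allMatch(s -> isValidDefault(union, s));
    }

    // Check argued for in this comment: every element only has to be a
    // valid value of the item type.
    static boolean elementsCheckedAsValues(List<Set<String>> union, List<String> elements) {
        return elements.stream().allMatch(s -> isValidValue(union, s));
    }

    public static void main(String[] args) {
        List<String> arrayDefault = List.of("A", "B", "C");
        System.out.println(elementsCheckedAsDefaults(ITEMS, arrayDefault)); // false
        System.out.println(elementsCheckedAsValues(ITEMS, arrayDefault));   // true
    }
}
```

With the default ["A", "B", "C"] from the issue, the first variant rejects "C" 
because it is only looked up in E1, while the second accepts it through E2.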

The same problem should also exist in the check for map types (I have only 
verified this by reading the code).

> specification does not fully specify semantics for unions in default types
> --------------------------------------------------------------------------
>
>                 Key: AVRO-2647
>                 URL: https://issues.apache.org/jira/browse/AVRO-2647
>             Project: Apache Avro
>          Issue Type: Bug
>          Components: doc
>            Reporter: Roger
>            Priority: Minor
>
> Currently the specification does not make the semantics clear for union 
> types within complex types. In particular, the spec talks about union fields, 
> but leaves the semantics for unions in other contexts unspecified.
> Here's an example which is undefined according to the current specification:
> {code:json}
> {
>     "type": "record",
>     "name": "R",
>     "fields": [
>         {
>             "name": "F",
>             "type": {
>                 "type": "array",
>                 "items": [
>                     {
>                         "type": "enum",
>                         "name": "E1",
>                         "symbols": ["A", "B"]
>                     },
>                     {
>                         "type": "enum",
>                         "name": "E2",
>                         "symbols": ["B", "A", "C"]
>                     }
>                 ]
>             },
>             "default": ["A", "B", "C"]
>         }
>     ]
> }
> {code}
> By experiment, most implementations seem to have chosen the semantics that 
> are documented in this PR.
> In Java, the schema above is parsed without error, but when attempting to use 
> the default value, it fails with a NullPointerException (trying to find the 
> symbol C in E1). (Thanks to Ryan Skraba for this.)
> In [gogen-avro|https://github.com/actgardner/gogen-avro] it generates invalid 
> code because it's assuming E1 but generating the symbol for "C" anyway.
> FWIW at some point in the future, I believe that it would be nice to align 
> the default value specification with the JSON encoding for Avro so there 
> aren't two subtly different JSON encodings of an Avro value.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
