[ 
https://issues.apache.org/jira/browse/AVRO-4093?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Huw Campbell updated AVRO-4093:
-------------------------------
    Description: 
{{In version 1.12, the specification for unions changed, so that a field's 
default value may match any branch of the union instead of only the first one.}}

{{While this sounds fine at first blush, it opens a whole can of worms: 
schemas that appear correct can contain bugs.}}

{{For example, this schema has four fields that I want defaults for. In every 
case I want the second branch of the union, but the first will be picked.}}

 
{code:json}
{ "type": "record"
, "name": "demo1"
, "fields" :
  [ { "name": "enum"
    , "type":
      [ { "type": "enum", "name": "suit", "symbols": [ "diamonds", "hearts", "clubs", "spades" ] }
      , { "type": "enum", "name": "tools", "symbols": [ "shovels", "spades", "hammers", "drills" ] }
      ]
    , "doc": "I want tools to be the default, but if I select spades we get a suit!"
    , "default": "spades"
    }
  , { "name": "string_enum"
    , "type":
      [ { "type": "string" }
      , { "type": "enum", "name": "tools", "symbols": [ "shovels", "spades", "hammers", "drills" ] }
      ]
    , "doc": "I want tools to be the default, but this will be a string"
    , "default": "drill"
    }
  , { "name": "int_long"
    , "type": [ "int", "long"]
    , "doc": "I can't make this a long"
    , "default": 400000000
    }
  , { "name": "records_ambiguity"
    , "type":
      [ { "name": "xs"
        , "type":
          { "type": "record"
          , "name": "xs"
          , "fields" : [ { "name": "z", "type": [ "null", "int" ], "default": null }, { "name": "w", "type": [ "null", "int" ], "default": null } ]
          }
        }
      , { "name": "ys"
        , "type":
          { "type": "record"
          , "name": "ys"
          , "fields" : [ { "name": "a", "type": "int" } , { "name": "c", "type": "string" } ]
          }
        }
      ]
    , "doc": "I want ys, hence the fields a and c, xs has everything with defaults, so it matches and the items are discarded."
    , "default": { "a": 2, "c": "yes" }
    }
  ]
}
{code}
 

{{Other types it can get wrong include "fixed" (fixed types are named, so a 
union may contain two of them, yet a default for one could be parsed as the 
other), "bytes", and so on.}}
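
The ambiguity in the schema above can be reproduced with a small toy model of "the default belongs to the first union branch that accepts it". This is only a sketch, not the Avro library's code: `matches` and `first_matching_branch` are hypothetical helpers, the two records are written as plain record schemas, and the rule that a record ignores unknown keys when every missing field has a default is an assumption taken from the behaviour described for `records_ambiguity`.

```python
# Toy model of first-match default resolution for unions.
# NOT Avro's implementation; all helper names here are hypothetical.

INT_MIN, INT_MAX = -2**31, 2**31 - 1
LONG_MIN, LONG_MAX = -2**63, 2**63 - 1


def matches(schema, value):
    """Return True if the JSON default `value` is acceptable for `schema`."""
    if isinstance(schema, list):  # nested union: any branch may accept
        return any(matches(branch, value) for branch in schema)
    kind = schema if isinstance(schema, str) else schema["type"]
    if kind == "null":
        return value is None
    if kind == "string":
        return isinstance(value, str)
    if kind in ("int", "long"):
        lo, hi = (INT_MIN, INT_MAX) if kind == "int" else (LONG_MIN, LONG_MAX)
        is_integer = isinstance(value, int) and not isinstance(value, bool)
        return is_integer and lo <= value <= hi
    if kind == "enum":
        return value in schema["symbols"]
    if kind == "record":
        # Assumption: a field may be absent if it has its own default,
        # and keys that are not fields of the record are ignored.
        if not isinstance(value, dict):
            return False
        return all(
            matches(f["type"], value[f["name"]]) if f["name"] in value else "default" in f
            for f in schema["fields"]
        )
    raise ValueError(f"unsupported type in this sketch: {kind}")


def first_matching_branch(union, default):
    """Index of the first branch that accepts `default`, or None."""
    for index, branch in enumerate(union):
        if matches(branch, default):
            return index
    return None


suit = {"type": "enum", "name": "suit",
        "symbols": ["diamonds", "hearts", "clubs", "spades"]}
tools = {"type": "enum", "name": "tools",
         "symbols": ["shovels", "spades", "hammers", "drills"]}
xs = {"type": "record", "name": "xs", "fields": [
    {"name": "z", "type": ["null", "int"], "default": None},
    {"name": "w", "type": ["null", "int"], "default": None}]}
ys = {"type": "record", "name": "ys", "fields": [
    {"name": "a", "type": "int"},
    {"name": "c", "type": "string"}]}

enum_branch = first_matching_branch([suit, tools], "spades")
string_branch = first_matching_branch(["string", tools], "drill")
int_branch = first_matching_branch(["int", "long"], 400000000)
record_branch = first_matching_branch([xs, ys], {"a": 2, "c": "yes"})
```

Under these assumptions all four calls return index 0, the first branch, even though the second branch was intended each time. In the record case `ys` also accepts the value, but `xs` is tried first and wins.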

  was:
{{In version 1.12, the specification for unions changed, so that the default 
value must be one of the values instead of the first one.}}

{{While this sounds ok at first blush, it opens a whole kettle of worms with 
regards to schemas which appear correct but contain bugs.}}

{{For example, this schema has three fields, which I want defaults for, in all 
cases, I want the second one, but the first will be picked.}}


{{{ "type": "record"}}
{{, "name": "demo1"}}
{{, "fields" :}}
{{  [ { "name": "enum"}}
{{    , "type":}}
{{      [ \{ "type": "enum", "name": "suit", "symbols": [ "diamonds", "hearts", 
"clubs", "spades" ] }}}
{{      , \{ "type": "enum", "name": "tools", "symbols": [ "shovels", "spades", 
"hammers", "drills" ] }}}
{{      ]}}
{{    , "doc": "I want tools to be the default, but if I select spades we get a 
suit!"}}
{{    , "default": "spades"}}
{{    }}}
{{  , { "name": "string_enum"}}
{{    , "type":}}
{{      [ \{ "type": "string" }}}
{{      , \{ "type": "enum", "name": "tools", "symbols": [ "shovels", "spades", 
"hammers", "drills" ] }}}
{{      ]}}
{{    , "doc": "I want tools to be the default, but this will be a string"}}
{{    , "default": "drill"}}
{{    }}}
{{  , { "name": "int_long"}}
{{    , "type": [ "int", "long"]}}
{{    , "doc": "I can't make this a long"}}
{{    , "default": 400000000}}
{{    }}}
{{  , { "name": "records_ambiguity"}}
{{    , "type":}}
{{      [ { "name": "xs"}}
{{        ,  "type":}}
{{            { "type": "record"}}
{{            , "name": "xs"}}
{{            , "fields" : [ \{ "name": "z", "type": [ "null",  "int" ], 
"default": null }, \{ "name": "w", "type": [ "null",  "int" ], "default": null 
} ]}}
{{            }}}
{{        }}}
{{      , { "name": "ys"}}
{{        , "type":}}
{{          { "type": "record"}}
{{          , "name": "ys"}}
{{          , "fields" : [ \{ "name": "a", "type": "int" } , \{ "name": "c", 
"type": "string" } ]}}
{{          }}}
{{        }}}
{{      ]}}
{{    , "doc": "I want ys, hence the fields a and c, xs has everything with 
defaults, so it matches and the items are discarded."}}
{{    , "default": \{ "a": 2, "c": "yes" }}}
{{    }}}
{{  ]}}
{{}}}

 

{{Other things which it can get wrong are "fixed", fields which have names, so 
two are fine, but they could be parsed as each other, "bytes", and so on.}}


> New union defaulting rules is dangerous.
> ----------------------------------------
>
>                 Key: AVRO-4093
>                 URL: https://issues.apache.org/jira/browse/AVRO-4093
>             Project: Apache Avro
>          Issue Type: Bug
>          Components: spec
>    Affects Versions: 1.12.0
>            Reporter: Huw Campbell
>            Priority: Major



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
