[ 
https://issues.apache.org/jira/browse/AVRO-2226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16703746#comment-16703746
 ] 

ASF subversion and git services commented on AVRO-2226:
-------------------------------------------------------

Commit 3e5cf4a94dd1f71e42a45c812f26a53c7b89e945 in avro's branch 
refs/heads/master from [~keatskelleher]
[ https://gitbox.apache.org/repos/asf?p=avro.git;h=3e5cf4a ]

[AVRO-2226] Fixes UnionSchema specificity

Trouble arises in the Python library when deducing the appropriate schema from 
a list of schemas given a particular datum.

When "null" values are allowed for fields in two separate schemas, there is no 
way to tell which schema should be used, given a list of schemas 
that are set as the type definition for a record.

This PR checks that all fields defined on a given datum are _also_ defined 
in the schema being validated for that datum.

With this bugfix, datums such as `{"foo": "a"}` will no longer "cast" to schemas 
such as `{"name": "bar", "type": ["long", "null"]}`, as they currently do.
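
The check described above can be sketched in plain Python. This is only an
illustration under stated assumptions, not the code from the commit: the schema
dicts, `value_matches`, `permissive_match` and `strict_match` are hypothetical
names, and only the "string" and "null" types from the report are handled.

```python
# Hypothetical stand-ins for record schemas A and B from the report.
# These are plain dicts, not objects from the avro library.
SCHEMA_A = {"name": "A", "fields": [{"name": "foo", "type": ["string", "null"]}]}
SCHEMA_B = {"name": "B", "fields": [{"name": "bar", "type": ["string", "null"]}]}


def value_matches(types, value):
    """Check a value against a union of the two primitive types used here."""
    if value is None:
        return "null" in types
    return "string" in types and isinstance(value, str)


def permissive_match(schema, datum):
    """Check only the schema's own fields; a missing key is read as None."""
    return all(
        value_matches(field["type"], datum.get(field["name"]))
        for field in schema["fields"]
    )


def strict_match(schema, datum):
    """Also require every key on the datum to be a field of the schema."""
    field_names = {field["name"] for field in schema["fields"]}
    return set(datum) <= field_names and permissive_match(schema, datum)


datum = {"foo": "a"}
print(permissive_match(SCHEMA_B, datum))  # nullable "bar" accepts the missing key
print(strict_match(SCHEMA_B, datum))      # "foo" is not a field of B
print(strict_match(SCHEMA_A, datum))      # "foo" is a field of A
```

Because every field in B is nullable, the permissive check accepts a datum
that carries none of B's fields; the extra key-subset test is what rules B out.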


> UnionSchema deduction is too permissive
> ---------------------------------------
>
>                 Key: AVRO-2226
>                 URL: https://issues.apache.org/jira/browse/AVRO-2226
>             Project: Apache Avro
>          Issue Type: Bug
>          Components: python
>    Affects Versions: 1.8.2
>            Reporter: Andrew Kelleher
>            Priority: Major
>         Attachments: AVRO-2226.patch
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> When given a schema of the form
> {code:java}
> {
>   "type" : "record",
>   "name" : "A",
>   "namespace" : "com.example",
>   "fields" : [
>     {
>       "name" : "foo",
>       "type" : ["string", "null"]
>     }
>   ]
> }
> {
>   "type" : "record",
>   "name" : "B",
>   "namespace" : "com.example",
>   "fields" : [
>     {
>       "name" : "bar",
>       "type" : ["string", "null"]
>     }
>   ]
> }
> {
>   "type" : "record",
>   "name" : "AOrB",
>   "namespace" : "com.example",
>   "fields" : [
>     {
>       "name" : "entity",
>       "type" : [
>         "com.example.A",
>         "com.example.B"
>       ]
>     }
>   ]
> }
> {code}
> And a datum of the form
> {code}
> {'entity': {'foo': 'this is an instance of schema A'}}{code}
> Converting to a message, and then back from a message, chooses the incorrect 
> `entity` schema:
> {code}
> {'entity': {'bar': None}}{code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
