[ 
https://issues.apache.org/jira/browse/AVRO-973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14006466#comment-14006466
 ] 

Douglas Kaminsky commented on AVRO-973:
---------------------------------------

-1 Steven Willis

We simulate inheritance by placing our record types in increasing complexity 
order within the union. Arbitrarily switching to the first matching element 
breaks the workarounds we have put in place for this bug, and we would rather 
not trade one bug for another. For primitives, we would rather simply document 
that order is increasing priority (the last matching type in the union wins).
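
To make the ordering concern concrete, here is a rough sketch of last-match 
versus first-match resolution. It is a toy model, not the avro library: 
validate, resolve_last_match, resolve_first_match, and the Base/Derived schemas 
are all hypothetical, and the validator assumes that extra keys in a datum are 
ignored.

```python
# Toy model only: these helpers are hypothetical and are not the avro API.

def validate(schema, datum):
    """Return True if datum has every field the toy record schema names.

    Extra keys in datum are ignored (an assumption of this sketch).
    """
    return isinstance(datum, dict) and all(f in datum for f in schema["fields"])


def resolve_last_match(union, datum):
    """Index of the last branch that validates, or -1 (loop without break)."""
    index = -1
    for i, candidate in enumerate(union):
        if validate(candidate, datum):
            index = i
    return index


def resolve_first_match(union, datum):
    """Index of the first branch that validates, or -1 (loop with break)."""
    for i, candidate in enumerate(union):
        if validate(candidate, datum):
            return i
    return -1


# "Inheritance" simulated by listing records in increasing complexity order.
BASE = {"name": "Base", "fields": ["id"]}
DERIVED = {"name": "Derived", "fields": ["id", "extra"]}
UNION = [BASE, DERIVED]

derived_datum = {"id": 1, "extra": "x"}
base_datum = {"id": 2}
```

Under these assumptions, derived_datum resolves to Derived with last-match but 
to Base with first-match, which is the regression described above.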

For records, the simplest proposed solution is to create a wrapper type that 
takes either (a protocol and a type name) or a record schema as its __init__ 
argument(s) - this was sufficient to solve the problem on our end. It does 
raise some issues, e.g.

* Validation now needs to accept EITHER a dict or this new wrapper type
* Failure to wrap an inner record (at depth > 1 within the record you're 
serializing) silently reproduces the bug
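
A minimal sketch of the wrapper idea follows. It is not our proprietary code 
and not the avro API: TypedRecord, validate, and resolve_union are hypothetical 
names, and for brevity the wrapper takes only a schema name rather than a 
protocol and type name or a full record schema.

```python
# Hypothetical sketch: none of these names come from the avro library.

class TypedRecord:
    """Pairs a record datum with the name of the schema it should be written as."""

    def __init__(self, schema_name, fields):
        self.schema_name = schema_name
        self.fields = dict(fields)


def validate(schema, datum):
    """Accept EITHER a plain dict or a TypedRecord."""
    if isinstance(datum, TypedRecord):
        # A wrapped datum only matches the branch it names.
        if datum.schema_name != schema["name"]:
            return False
        datum = datum.fields
    # Structural check: every schema field must be present in the datum.
    return isinstance(datum, dict) and all(f in datum for f in schema["fields"])


def resolve_union(union, datum):
    """Return the index of the first branch that validates, or -1."""
    for i, candidate in enumerate(union):
        if validate(candidate, datum):
            return i
    return -1


BASE = {"name": "Base", "fields": ["id"]}
DERIVED = {"name": "Derived", "fields": ["id", "extra"]}
UNION = [BASE, DERIVED]

wrapped = TypedRecord("Derived", {"id": 1, "extra": "x"})
unwrapped = {"id": 1, "extra": "x"}
```

In this sketch the wrapped datum resolves to Derived regardless of union order, 
while the unwrapped dict falls back to structural matching and lands on Base, 
which mirrors the second issue listed above.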

Unable to submit patch at this time due to proprietary code concerns (don't 
have time to strip down for submission).

> Union behavior not consistent
> -----------------------------
>
>                 Key: AVRO-973
>                 URL: https://issues.apache.org/jira/browse/AVRO-973
>             Project: Avro
>          Issue Type: Bug
>          Components: python
>    Affects Versions: 1.6.1, 1.6.2
>            Reporter: Gaurav Nanda
>              Labels: patch
>         Attachments: AVRO-973-patch-1.patch, AVRO-973-patch-2.patch, 
> AVRO-973-patch-3.patch, AVRO-973-wrapper.patch, AVRO-973-wrapper.patch, 
> test_unions.py
>
>   Original Estimate: 0.25h
>  Remaining Estimate: 0.25h
>
> Python's union does not respect the order in which types are specified.
> For the following schema: 
> {"type":"map","values":["int","long","float","double","string","boolean"]}, 
> an integer value is written as a double, but the writer should respect the 
> order in which the types have been specified.
> Fixed Code (io.py):
> def write_union(self, writers_schema, datum, encoder):
>    """
>    A union is encoded by first writing a long value indicating
>    the zero-based position within the union of the schema of its value.
>    The value is then encoded per the indicated schema within the union.
>    """
>    # resolve union
>    index_of_schema = -1
>    for i, candidate_schema in enumerate(writers_schema.schemas):
>      if validate(candidate_schema, datum):
>        index_of_schema = i
>        break  # XXX Add break statement here XXX
>    if index_of_schema < 0: raise AvroTypeException(writers_schema, datum)



