[ 
https://issues.apache.org/jira/browse/HIVE-10687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Swarnim Kulkarni updated HIVE-10687:
------------------------------------
    Description: 
Consider the union field:

{noformat}
union {int, string}
{noformat}

and now this field evolves to

{noformat}
union {null, int, string}
{noformat}

Running these through the Avro schema compatibility check [1] shows that they
are compatible, which means the latter can be used to deserialize data written
with the former. However, the AvroDeserializer fails to do so, mainly because
it reads the tags from the reader schema and then reads the corresponding data
from the writer schema. [2]

[1] http://pastebin.cerner.corp/31078
[2] https://github.com/cloudera/hive/blob/cdh5.4.0-release/serde/src/java/org/apache/hadoop/hive/serde2/avro/AvroDeserializer.java#L354
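
To make the tag mismatch concrete, here is a minimal sketch that models each union as an ordered list of branch type names. It is not the Avro or Hive API: the class UnionTagSketch and its methods are hypothetical names used only for illustration, and the "resolve by name" variant is one possible approach, not necessarily the fix that will be applied.

```java
import java.util.Arrays;
import java.util.List;

// Simplified model: a union is an ordered list of branch type names.
// All names here are hypothetical; this is not the real Avro or Hive code.
final class UnionTagSketch {

    // Index (tag) of the branch with the given type name, or -1 if absent.
    static int tagOf(List<String> union, String typeName) {
        return union.indexOf(typeName);
    }

    // Failure mode described above: the tag is computed against the reader
    // union and then used to index into the writer union.
    // Returns null when the tag is out of range for the writer union.
    static String mismatchedBranch(List<String> writer, List<String> reader, String datumType) {
        int readerTag = tagOf(reader, datumType);
        if (readerTag < 0 || readerTag >= writer.size()) {
            return null;
        }
        return writer.get(readerTag);
    }

    // One possible alternative (an assumption): find the branch in the
    // writer union first, then locate the same type by name in the reader.
    static int readerTagByName(List<String> writer, List<String> reader, String datumType) {
        int writerTag = tagOf(writer, datumType);
        if (writerTag < 0) {
            throw new IllegalArgumentException("type not in writer union: " + datumType);
        }
        return tagOf(reader, writer.get(writerTag));
    }

    public static void main(String[] args) {
        List<String> writer = Arrays.asList("int", "string");
        List<String> reader = Arrays.asList("null", "int", "string");

        // An int datum gets tag 1 in the reader union, but branch 1 of the
        // writer union is "string", so the wrong branch is selected.
        System.out.println(mismatchedBranch(writer, reader, "int"));

        // Matching by type name lands on the "int" branch of the reader union.
        System.out.println(readerTagByName(writer, reader, "int"));
    }
}
```

In this model an int datum is paired with the writer's string branch, and a string datum gets a tag that does not exist in the writer union at all, which is consistent with the failure described above.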

  was:
Consider the union field:

union {int, string}

and now this field evolves to

union {null, int, string}.

Running these through the Avro schema compatibility check [1] shows that they
are compatible, which means the latter can be used to deserialize data written
with the former. However, the AvroDeserializer fails to do so, mainly because
it reads the tags from the reader schema and then reads the corresponding data
from the writer schema. [2]

[1] http://pastebin.cerner.corp/31078
[2] 
https://github.com/cloudera/hive/blob/cdh5.4.0-release/serde/src/java/org/apache/hadoop/hive/serde2/avro/AvroDeserializer.java#L354


> AvroDeserializer fails to deserialize evolved union fields
> ----------------------------------------------------------
>
>                 Key: HIVE-10687
>                 URL: https://issues.apache.org/jira/browse/HIVE-10687
>             Project: Hive
>          Issue Type: Bug
>          Components: Serializers/Deserializers
>            Reporter: Swarnim Kulkarni
>            Assignee: Swarnim Kulkarni
>
> Consider the union field:
> {noformat}
> union {int, string}
> {noformat}
> and now this field evolves to
> {noformat}
> union {null, int, string}
> {noformat}
> Running these through the Avro schema compatibility check [1] shows that they
> are compatible, which means the latter can be used to deserialize data written
> with the former. However, the AvroDeserializer fails to do so, mainly because
> it reads the tags from the reader schema and then reads the corresponding data
> from the writer schema. [2]
> [1] http://pastebin.cerner.corp/31078
> [2] 
> https://github.com/cloudera/hive/blob/cdh5.4.0-release/serde/src/java/org/apache/hadoop/hive/serde2/avro/AvroDeserializer.java#L354



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
