[
https://issues.apache.org/jira/browse/HIVE-10687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Swarnim Kulkarni updated HIVE-10687:
------------------------------------
Description:
Consider the union field:
{noformat}
union {int, string}
{noformat}
and now this field evolves to
{noformat}
union {null, int, string}
{noformat}
Running the two schemas through the Avro schema compatibility check [1] shows
that they are compatible, which means that the latter can be used to
deserialize data written with the former. However, the Avro deserializer fails
to do so, mainly because it reads the tag from the reader schema and then
reads the corresponding data using the writer schema. [2]
[1] http://pastebin.cerner.corp/31078
[2] https://github.com/cloudera/hive/blob/cdh5.4.0-release/serde/src/java/org/apache/hadoop/hive/serde2/avro/AvroDeserializer.java#L354
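The mismatch described above can be illustrated with a minimal, self-contained
Java sketch. It is not the Hive or Avro code: the class UnionTagSketch, its
methods, and the modeling of a union as an array of branch type names are all
hypothetical, and it assumes that a tag is simply a branch's position in its
union. Under those assumptions it shows how a tag resolved against the reader
union {null, int, string} selects the wrong branch when it is used to index the
writer union {int, string}.

```java
// Illustrative sketch only: hypothetical names, not Hive's AvroDeserializer or the Avro API.
// Assumption: a union is modeled as an array of branch type names, and a tag is a branch index.
final class UnionTagSketch {
    static final String[] WRITER_UNION = {"int", "string"};          // schema the data was written with
    static final String[] READER_UNION = {"null", "int", "string"};  // evolved schema used for reading

    /** Returns the index (tag) of the branch named datumType in the given union. */
    static int tagFor(String datumType, String[] union) {
        for (int i = 0; i < union.length; i++) {
            if (union[i].equals(datumType)) {
                return i;
            }
        }
        throw new IllegalArgumentException("No branch for type " + datumType);
    }

    /** Problematic lookup: the tag comes from the reader union but indexes the writer union. */
    static String writerBranchUsingReaderTag(String datumType) {
        int tag = tagFor(datumType, READER_UNION);
        return WRITER_UNION[tag];
    }

    /** Alternative lookup: the tag is resolved against the writer union itself. */
    static String writerBranchByType(String datumType) {
        return WRITER_UNION[tagFor(datumType, WRITER_UNION)];
    }

    public static void main(String[] args) {
        System.out.println(writerBranchUsingReaderTag("int")); // prints "string"
        System.out.println(writerBranchByType("int"));         // prints "int"
    }
}
```

In this model an int value gets tag 1 from the reader union, which selects
string in the writer union. A string value gets tag 2, which is out of range
for the two-branch writer union. Resolving the tag against each union
separately avoids both cases.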
was:
Consider the union field:
union {int, string}
and now this field evolves to
union {null, int, string}.
Running the two schemas through the Avro schema compatibility check [1] shows
that they are compatible, which means that the latter can be used to
deserialize data written with the former. However, the Avro deserializer fails
to do so, mainly because it reads the tag from the reader schema and then
reads the corresponding data using the writer schema. [2]
[1] http://pastebin.cerner.corp/31078
[2]
https://github.com/cloudera/hive/blob/cdh5.4.0-release/serde/src/java/org/apache/hadoop/hive/serde2/avro/AvroDeserializer.java#L354
> AvroDeserializer fails to deserialize evolved union fields
> ----------------------------------------------------------
>
> Key: HIVE-10687
> URL: https://issues.apache.org/jira/browse/HIVE-10687
> Project: Hive
> Issue Type: Bug
> Components: Serializers/Deserializers
> Reporter: Swarnim Kulkarni
> Assignee: Swarnim Kulkarni
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)