[ 
https://issues.apache.org/jira/browse/HBASE-16247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15930336#comment-15930336
 ] 

Jerry He commented on HBASE-16247:
----------------------------------

Just want to clarify a little more; hope I have this right.  As soon as the 
hbase-spark data source sees a column declared with an Avro schema, it 
converts the Avro types to Spark SQL types, and ENUM is mapped to StringType. 
After that, the ENUM type no longer appears anywhere in the Catalyst data 
flow, including serialization and deserialization in the hbase-spark data 
source.
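
To illustrate the point, here is a minimal Python sketch. It is not the 
hbase-spark code (which is Scala) and it does not use the Avro or Spark 
libraries; the helper names to_sql_type and to_sql_schema and the sample 
User/Color schema are made up for this example. The primitive mapping follows 
the usual Avro-to-Spark SQL correspondence, with type names kept as plain 
strings. It only shows that once an ENUM field has been converted, nothing in 
the resulting SQL schema distinguishes it from an ordinary string field.

```python
import json

# Avro primitive type name -> Spark SQL type name (names only, as strings).
PRIMITIVES = {
    "string": "StringType",
    "int": "IntegerType",
    "long": "LongType",
    "float": "FloatType",
    "double": "DoubleType",
    "boolean": "BooleanType",
    "bytes": "BinaryType",
}


def to_sql_type(avro_type):
    """Map one Avro field type to a Spark SQL type name (hypothetical helper)."""
    if isinstance(avro_type, dict) and avro_type.get("type") == "enum":
        # The symbol list is dropped here; only "this is a string" survives.
        return "StringType"
    return PRIMITIVES[avro_type]


def to_sql_schema(avro_schema_json):
    """Convert an Avro record schema (JSON text) to {field name: SQL type name}."""
    record = json.loads(avro_schema_json)
    return {f["name"]: to_sql_type(f["type"]) for f in record["fields"]}


# Sample schema, invented for this sketch: one plain string, one enum.
AVRO_SCHEMA = """
{"type": "record", "name": "User", "fields": [
  {"name": "name", "type": "string"},
  {"name": "color", "type": {"type": "enum", "name": "Color",
                             "symbols": ["RED", "GREEN"]}}
]}
"""

sql_schema = to_sql_schema(AVRO_SCHEMA)
# Both fields come out with the same SQL type, so the enum is no longer visible.
print(sql_schema)
```

If that reading is correct, any code that later needs to write an enum value 
back to Avro cannot tell from the SQL schema alone that "color" was an ENUM; 
it would have to consult the original Avro schema again.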

> SparkSQL Avro serialization doesn't handle enums correctly
> ----------------------------------------------------------
>
>                 Key: HBASE-16247
>                 URL: https://issues.apache.org/jira/browse/HBASE-16247
>             Project: HBase
>          Issue Type: Bug
>          Components: spark
>    Affects Versions: 2.0.0
>            Reporter: Sean Busbey
>            Priority: Critical
>             Fix For: 2.0.0
>
>
> Avro's generic api expects GenericEnumSymbol as the runtime type for 
> instances of fields that are of Avro type ENUM. The Avro 1.7 libraries are 
> lax in some cases for handling this, but the 1.8 libraries are strict. We 
> should proactively fix our serialization.
> (the lax serialization in 1.7 fails for some nested use in unions, see 
> AVRO-997 for details)



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
