[ 
https://issues.apache.org/jira/browse/AVRO-997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13471575#comment-13471575
 ] 

Sean Busbey commented on AVRO-997:
----------------------------------

I just ran into this as part of Hive's integration with Avro 1.7.1 
([HIVE-3538|https://issues.apache.org/jira/browse/HIVE-3538]). AFAICT, Hive 
only makes use of the Generic API.

GenericData.validate returns true for a record with an Avro enum field, or a 
union that contains an Avro enum, so long as the result of datum.toString() is 
one of the enum's symbols. Serialization itself works for arbitrary incoming 
values when the field is a plain Avro enum, but fails in the union case if the 
value isn't a GenericEnumSymbol. In addition to Java enums, this seems likely 
to come up for GenericData users when they attempt to use Strings.
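
To make the mismatch concrete, here is a dependency-free toy model of the two checks as I understand them. It is not Avro's code: the class EnumUnionSketch, the Symbol wrapper, and both method names are hypothetical stand-ins, and the logic is only a sketch of the behavior described above.

```java
import java.util.Arrays;
import java.util.List;

// Toy model only: these types are hypothetical stand-ins, not Avro classes.
class EnumUnionSketch {
    /** Stand-in for a generic enum symbol wrapper. */
    static final class Symbol {
        private final String name;
        Symbol(String name) { this.name = name; }
        @Override public String toString() { return name; }
    }

    /** Lenient check: only the datum's toString() is compared to the symbols. */
    static boolean validate(List<String> symbols, Object datum) {
        return datum != null && symbols.contains(datum.toString());
    }

    /** Strict check: the datum must be an instance of the wrapper type. */
    static boolean matchesEnumBranch(Object datum) {
        return datum instanceof Symbol;
    }

    public static void main(String[] args) {
        List<String> gender = Arrays.asList("M", "F");
        // A plain String passes the lenient check...
        System.out.println(validate(gender, "M"));
        // ...but is rejected by the strict, type-based check.
        System.out.println(matchesEnumBranch("M"));
        // The wrapper type passes both.
        System.out.println(validate(gender, new Symbol("M")));
        System.out.println(matchesEnumBranch(new Symbol("M")));
    }
}
```

In this model a String "M" is accepted by the first check and rejected by the second, which mirrors a datum that validates and then fails when written as a union member.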

This seems like a bug in GenericData. I could provide a patch that makes 
GenericDatumWriter.write consistent with GenericData.validate for the union 
case, if there's interest. Alternatively, I could provide one that makes the 
validate and write calls stricter with respect to the plain enum case, which I 
think would help avoid user confusion if enums are only supposed to be 
GenericEnumSymbol.

Thoughts?
                
> Union of enum and null cannot be serialized
> -------------------------------------------
>
>                 Key: AVRO-997
>                 URL: https://issues.apache.org/jira/browse/AVRO-997
>             Project: Avro
>          Issue Type: Bug
>    Affects Versions: 1.5.1
>            Reporter: Aaron Kimball
>
> I have a schema like:
> {code}
> [
> {
>   "type": "enum",
>   "name": "Gender",
>   "symbols": ["M", "F"]
> },
> {
>   "type" : "record",
>   "name" : "Foo",
>   "fields" : [
>     { "type" : ["Gender", "null"], "name" : "gender" },
>     ...
>   ]
> }
> ]
> {code}
> I build a record like {{Foo foo = new Foo(); foo.gender = Gender.M;}}
> When I go to serialize this, I get:
> {code}Not in union [{"type":"enum","name":"Gender","symbols":["M","F"]},"null"]: M
>       at org.apache.avro.generic.GenericData.resolveUnion(GenericData.java:482)
>       at org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:70)
>       at org.apache.avro.generic.GenericDatumWriter.writeRecord(GenericDatumWriter.java:104)
>       at org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:65)
>       at org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:57)
> {code}

