[
https://issues.apache.org/jira/browse/AVRO-997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13471575#comment-13471575
]
Sean Busbey commented on AVRO-997:
----------------------------------
I just ran into this as part of Hive's integration with Avro 1.7.1
([HIVE-3538|https://issues.apache.org/jira/browse/HIVE-3538]). AFAICT, Hive
only makes use of the Generic API.
GenericData.validate returns true for a record whose field is an Avro enum, or
a union that contains an Avro enum, so long as the result of datum.toString is
in the enum's set of symbols. Actually serializing works fine for arbitrary
incoming values when the field is a plain Avro enum, but fails in the union
case if the value isn't a GenericEnumSymbol. In addition to Java enums, this
seems likely to come up for GenericData users when they attempt to use Strings.
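To make the mismatch concrete, here is a minimal, self-contained Java sketch of
the two checks as I understand them. It does not use Avro at all: the Symbol and
EnumSchema classes and every method name below are hypothetical stand-ins, and
IllegalArgumentException stands in for whatever Avro actually throws with the
"Not in union" message.

```java
import java.util.Arrays;
import java.util.List;

public class EnumUnionSketch {
    /** Hypothetical stand-in for a generic enum symbol wrapper (not Avro's class). */
    static final class Symbol {
        private final String name;
        Symbol(String name) { this.name = name; }
        @Override public String toString() { return name; }
    }

    /** Hypothetical stand-in for an enum schema: only its symbol list. */
    static final class EnumSchema {
        final List<String> symbols;
        EnumSchema(String... symbols) { this.symbols = Arrays.asList(symbols); }
    }

    /** Lenient check: any datum whose toString() is a known symbol passes. */
    static boolean validateEnum(EnumSchema schema, Object datum) {
        return datum != null && schema.symbols.contains(datum.toString());
    }

    /** Validation of the union [enum, null]: true if either branch accepts. */
    static boolean validateUnion(EnumSchema enumBranch, Object datum) {
        return datum == null || validateEnum(enumBranch, datum);
    }

    /**
     * Strict branch resolution for the union [enum, null]: only the wrapper
     * type selects the enum branch (index 0); null selects index 1.
     */
    static int resolveUnionStrict(EnumSchema enumBranch, Object datum) {
        if (datum == null) return 1;
        if (datum instanceof Symbol) return 0;
        throw new IllegalArgumentException("Not in union: " + datum);
    }

    /** One possible fix: resolve the branch with the same rule as validation. */
    static int resolveUnionLenient(EnumSchema enumBranch, Object datum) {
        if (datum == null) return 1;
        if (validateEnum(enumBranch, datum)) return 0;
        throw new IllegalArgumentException("Not in union: " + datum);
    }

    public static void main(String[] args) {
        EnumSchema gender = new EnumSchema("M", "F");
        Object datum = "M"; // a plain String, as a GenericData user might pass

        System.out.println("validates: " + validateUnion(gender, datum));
        try {
            resolveUnionStrict(gender, datum);
        } catch (IllegalArgumentException e) {
            System.out.println("strict write fails: " + e.getMessage());
        }
        System.out.println("lenient branch: " + resolveUnionLenient(gender, datum));
    }
}
```

In this toy model a String "M" passes validation but is rejected by the strict
resolution, while wrapping it in Symbol is accepted by both. The lenient variant
corresponds to the first patch option; tightening validateEnum to require Symbol
would correspond to the second.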
This seems like a bug in GenericData. I could provide a patch that makes
GenericDatumWriter.write consistent with GenericData.validate for the union
case, if there's interest. Alternatively, I could provide one that makes the
validate and write calls stricter for the plain enum case, which I think would
help avoid user confusion if enum values are only supposed to be
GenericEnumSymbol.
Thoughts?
> Union of enum and null cannot be serialized
> -------------------------------------------
>
> Key: AVRO-997
> URL: https://issues.apache.org/jira/browse/AVRO-997
> Project: Avro
> Issue Type: Bug
> Affects Versions: 1.5.1
> Reporter: Aaron Kimball
>
> I have a schema like:
> {code}
> [
>   {
>     "type": "enum",
>     "name": "Gender",
>     "symbols": ["M", "F"]
>   },
>   {
>     "type" : "record",
>     "name" : "Foo",
>     "fields" : [
>       { "type" : ["Gender", "null"], "name" : "gender" },
>       ...
>     ]
>   }
> ]
> {code}
> I build a record like {{Foo foo = new Foo(); foo.gender = Gender.M;}}
> When I go to serialize this, I get:
> {code}Not in union [{"type":"enum","name":"Gender","symbols":["M","F"]},"null"]: M
>     at org.apache.avro.generic.GenericData.resolveUnion(GenericData.java:482)
>     at org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:70)
>     at org.apache.avro.generic.GenericDatumWriter.writeRecord(GenericDatumWriter.java:104)
>     at org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:65)
>     at org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:57)
> {code}