Igor Bernstein created PARQUET-557:
--------------------------------------
Summary: Enums are incorrectly handled by parquet-avro when using
GenericRecords
Key: PARQUET-557
URL: https://issues.apache.org/jira/browse/PARQUET-557
Project: Parquet
Issue Type: Bug
Reporter: Igor Bernstein
Priority: Minor
It appears that enums are handled incorrectly when reading Parquet files as
generic records.
Looking at the code:
https://github.com/apache/parquet-mr/blob/master/parquet-avro/src/main/java/org/apache/parquet/avro/AvroIndexedRecordConverter.java#L236-L238
FieldEnumConverter falls back to a string representation when it can't find the
corresponding enum class. This is a problem when reading Parquet files
generically, without the specific record classes on the classpath, because the
resulting records no longer match the schema: the schema declares an enum, but
the field holds a string. I believe a more correct approach would be to wrap
the enum values in GenericData.EnumSymbol:
https://github.com/apache/avro/blob/master/lang/java/avro/src/main/java/org/apache/avro/generic/GenericData.java#L397
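A minimal sketch of the difference, assuming Avro is on the classpath. The
Suit schema, the class name and the two helper methods are hypothetical and
only stand in for what the converter produces; this is not the actual
FieldEnumConverter code.

```java
import java.util.Arrays;

import org.apache.avro.Schema;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericEnumSymbol;

public class EnumSymbolSketch {
    // Hypothetical enum schema, used only for illustration.
    static final Schema SUIT = Schema.createEnum(
        "Suit", null, "example", Arrays.asList("HEARTS", "SPADES"));

    // Stand-in for the current fallback: the raw symbol is kept as a String.
    static Object asString(String symbol) {
        return symbol;
    }

    // Stand-in for the proposed behavior: wrap the symbol together with its
    // enum schema so the generic value still identifies itself as an enum.
    static Object asEnumSymbol(Schema schema, String symbol) {
        return new GenericData.EnumSymbol(schema, symbol);
    }

    public static void main(String[] args) {
        Object fallback = asString("SPADES");
        Object wrapped = asEnumSymbol(SUIT, "SPADES");

        // A plain String carries no enum schema and is not a generic enum value.
        System.out.println(fallback instanceof GenericEnumSymbol);
        // The wrapped value is a generic enum value and keeps its symbol text.
        System.out.println(wrapped instanceof GenericEnumSymbol);
        System.out.println(wrapped.toString());
    }
}
```

With the wrapper, code that consumes generic records can tell an enum field
apart from a string field without needing the specific enum class.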