Igor Bernstein created PARQUET-557:
--------------------------------------

             Summary: Enums are incorrectly handled by parquet-avro when using GenericRecords
                 Key: PARQUET-557
                 URL: https://issues.apache.org/jira/browse/PARQUET-557
             Project: Parquet
          Issue Type: Bug
            Reporter: Igor Bernstein
            Priority: Minor


It appears that enums are handled incorrectly when reading Parquet files as 
generic records.

Looking at the code:
https://github.com/apache/parquet-mr/blob/master/parquet-avro/src/main/java/org/apache/parquet/avro/AvroIndexedRecordConverter.java#L236-L238

FieldEnumConverter falls back to a string representation when it can't find the 
corresponding enum class.  This is problematic when reading Parquet files 
generically, without the specific record classes on the classpath, because the 
resulting records no longer match the schema: a field declared as an enum ends 
up holding a plain string. I believe a more correct approach would be to wrap 
the enum values in GenericData.EnumSymbol:
https://github.com/apache/avro/blob/master/lang/java/avro/src/main/java/org/apache/avro/generic/GenericData.java#L397
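
To make the difference between the two fallbacks concrete, here is a minimal, 
dependency-free sketch. It is not parquet-avro's converter code: the nested 
EnumSymbol class is a hypothetical stand-in for Avro's GenericData.EnumSymbol, 
and fallbackAsString, fallbackAsEnumSymbol and matchesEnumSchema are 
illustrative helpers invented for this example. The schema check is a 
simplification that assumes an enum field only accepts enum-typed values.

```java
import java.util.Arrays;
import java.util.List;

public class EnumFallbackSketch {

    /** Hypothetical stand-in for Avro's GenericData.EnumSymbol: a symbol tied to its enum schema. */
    public static final class EnumSymbol {
        private final List<String> schemaSymbols; // symbols declared by the enum schema
        private final String symbol;

        public EnumSymbol(List<String> schemaSymbols, String symbol) {
            this.schemaSymbols = schemaSymbols;
            this.symbol = symbol;
        }

        @Override
        public String toString() {
            return symbol;
        }
    }

    /** Fallback described in the issue: the raw symbol is returned as a plain string. */
    public static Object fallbackAsString(String raw) {
        return raw;
    }

    /** Proposed fallback: wrap the raw symbol so it is still recognisable as an enum value. */
    public static Object fallbackAsEnumSymbol(List<String> schemaSymbols, String raw) {
        return new EnumSymbol(schemaSymbols, raw);
    }

    /** Simplified schema check (an assumption): only enum-typed values with a declared symbol pass. */
    public static boolean matchesEnumSchema(List<String> schemaSymbols, Object datum) {
        boolean enumTyped = datum instanceof Enum<?> || datum instanceof EnumSymbol;
        return enumTyped && schemaSymbols.contains(datum.toString());
    }

    public static void main(String[] args) {
        List<String> suits = Arrays.asList("HEARTS", "SPADES");
        // The string fallback loses the enum type, so the value no longer fits the enum field.
        System.out.println(matchesEnumSchema(suits, fallbackAsString("HEARTS")));
        // The wrapped symbol keeps the enum type and still fits the enum field.
        System.out.println(matchesEnumSchema(suits, fallbackAsEnumSymbol(suits, "HEARTS")));
    }
}
```

Running main prints false for the string fallback and true for the wrapped 
symbol, which is the mismatch described above in miniature.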



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
