Claire McGinty created PARQUET-2468:
---------------------------------------
Summary: ParquetMetadata.toPrettyJSON throws exception on file read when LOG.isDebugEnabled()
Key: PARQUET-2468
URL: https://issues.apache.org/jira/browse/PARQUET-2468
Project: Parquet
Issue Type: Bug
Reporter: Claire McGinty
Observed on latest 1.14.x commit, c241170d9bc2cd8415b04e06ecea40ed3d80f64d.
When debug logging is enabled, tests that instantiate a ParquetFileReader fail
with:
{code:java}
java.lang.RuntimeException: com.fasterxml.jackson.databind.exc.InvalidDefinitionException: No serializer found for class org.apache.parquet.schema.LogicalTypeAnnotation$StringLogicalTypeAnnotation and no properties discovered to create BeanSerializer (to avoid exception, disable SerializationFeature.FAIL_ON_EMPTY_BEANS) (through reference chain: org.apache.parquet.hadoop.metadata.ParquetMetadata["fileMetaData"]->org.apache.parquet.hadoop.metadata.FileMetaData["schema"]->org.apache.parquet.schema.MessageType["fields"]->java.util.ArrayList[24]->org.apache.parquet.schema.PrimitiveType["logicalTypeAnnotation"])
    at org.apache.parquet.hadoop.metadata.ParquetMetadata.toJSON(ParquetMetadata.java:68)
    at org.apache.parquet.hadoop.metadata.ParquetMetadata.toPrettyJSON(ParquetMetadata.java:48)
    at org.apache.parquet.format.converter.ParquetMetadataConverter.readParquetMetadata(ParquetMetadataConverter.java:1592)
    at org.apache.parquet.hadoop.ParquetFileReader.readFooter(ParquetFileReader.java:629)
    at org.apache.parquet.hadoop.ParquetFileReader.<init>(ParquetFileReader.java:902)
    at org.apache.parquet.hadoop.ParquetFileReader.open(ParquetFileReader.java:698)
    at org.apache.parquet.hadoop.ColumnIndexValidator.checkContractViolations(ColumnIndexValidator.java:556)
    at org.apache.parquet.statistics.TestColumnIndexes.testColumnIndexes(TestColumnIndexes.java:348)
Caused by: com.fasterxml.jackson.databind.exc.InvalidDefinitionException: No serializer found for class org.apache.parquet.schema.LogicalTypeAnnotation$StringLogicalTypeAnnotation and no properties discovered to create BeanSerializer (to avoid exception, disable SerializationFeature.FAIL_ON_EMPTY_BEANS) (through reference chain: org.apache.parquet.hadoop.metadata.ParquetMetadata["fileMetaData"]->org.apache.parquet.hadoop.metadata.FileMetaData["schema"]->org.apache.parquet.schema.MessageType["fields"]->java.util.ArrayList[24]->org.apache.parquet.schema.PrimitiveType["logicalTypeAnnotation"])
    at com.fasterxml.jackson.databind.exc.InvalidDefinitionException.from(InvalidDefinitionException.java:77)
    at com.fasterxml.jackson.databind.SerializerProvider.reportBadDefinition(SerializerProvider.java:1330)
    at com.fasterxml.jackson.databind.DatabindContext.reportBadDefinition(DatabindContext.java:414)
    at com.fasterxml.jackson.databind.ser.impl.UnknownSerializer.failForEmpty(UnknownSerializer.java:53)
    at com.fasterxml.jackson.databind.ser.impl.UnknownSerializer.serialize(UnknownSerializer.java:30)
    at com.fasterxml.jackson.databind.ser.BeanPropertyWriter.serializeAsField(BeanPropertyWriter.java:732)
    at com.fasterxml.jackson.databind.ser.std.BeanSerializerBase.serializeFields(BeanSerializerBase.java:770)
    at com.fasterxml.jackson.databind.ser.BeanSerializer.serialize(BeanSerializer.java:183)
{code}
(Note: this seems to happen when the schema *doesn't* contain a logical
type, which makes me suspect that some Jackson configuration for handling null
values is needed.)
To repro, enable debug logging or just comment out `if (LOG.isDebugEnabled())`
in ParquetMetadataConverter, as I did here:
https://github.com/apache/parquet-mr/compare/master...clairemcginty:parquet-mr:repro-avro-metadata-print-bug?expand=1
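
For readers who want to see why the failure only shows up with debug logging on, here is a minimal, self-contained sketch of the guarded-logging pattern. It is not the parquet-mr code: `DebugGuardSketch`, `failingPrettyJson`, `readFooter`, `readFooterDefensively`, and `setDebug` are hypothetical names, `java.util.logging` stands in for the project's real logging API, and the simulated exception stands in for the Jackson error in the trace above. The defensive variant is only one possible mitigation, not a claim about how the bug should be fixed.

```java
import java.util.function.Supplier;
import java.util.logging.Level;
import java.util.logging.Logger;

public class DebugGuardSketch {
    private static final Logger LOG = Logger.getLogger(DebugGuardSketch.class.getName());

    // Hypothetical stand-in for ParquetMetadata.toPrettyJSON: always fails,
    // simulating the serializer error from the stack trace.
    public static String failingPrettyJson() {
        throw new RuntimeException("No serializer found (simulated)");
    }

    // Hypothetical helper that switches the logger between debug (FINE) and INFO.
    public static void setDebug(boolean enabled) {
        LOG.setLevel(enabled ? Level.FINE : Level.INFO);
    }

    // Guarded pattern: the JSON rendering runs only when debug logging is on,
    // so an exception in it surfaces only in debug-enabled runs.
    public static String readFooter(Supplier<String> prettyJson) {
        if (LOG.isLoggable(Level.FINE)) {
            LOG.fine(prettyJson.get());
        }
        return "footer";
    }

    // Defensive variant (an assumption, not the project's fix): a failure in
    // the diagnostic rendering is logged instead of aborting the read.
    public static String readFooterDefensively(Supplier<String> prettyJson) {
        if (LOG.isLoggable(Level.FINE)) {
            try {
                LOG.fine(prettyJson.get());
            } catch (RuntimeException e) {
                LOG.log(Level.FINE, "Could not render metadata as JSON", e);
            }
        }
        return "footer";
    }

    public static void main(String[] args) {
        setDebug(false);
        System.out.println(readFooter(DebugGuardSketch::failingPrettyJson));

        setDebug(true);
        try {
            readFooter(DebugGuardSketch::failingPrettyJson);
        } catch (RuntimeException e) {
            System.out.println("failed: " + e.getMessage());
        }
        System.out.println(readFooterDefensively(DebugGuardSketch::failingPrettyJson));
    }
}
```

With debug off, the failing renderer is never invoked and the read succeeds; with debug on, the same call throws, which matches the behaviour described in this report.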