[
https://issues.apache.org/jira/browse/PARQUET-1253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16425759#comment-16425759
]
ASF GitHub Bot commented on PARQUET-1253:
-----------------------------------------
nandorKollar commented on a change in pull request #463: PARQUET-1253: Support
for new logical type representation
URL: https://github.com/apache/parquet-mr/pull/463#discussion_r179192675
##########
File path:
parquet-hadoop/src/main/java/org/apache/parquet/hadoop/metadata/ParquetMetadata.java
##########
@@ -41,6 +40,10 @@
private static final ObjectMapper objectMapper = new ObjectMapper();
+ static {
+ objectMapper.configure(SerializationConfig.Feature.FAIL_ON_EMPTY_BEANS,
false);
Review comment:
Without this configuration change on the objectMapper, three tests in
parquet-cascading3 fail with the following error:
testReadPattern(org.apache.parquet.cascading.TestParquetTupleScheme):
[namecp] could not build flow from assembly: [java.io.IOException: Could not
read footer: java.lang.RuntimeException:
org.codehaus.jackson.map.JsonMappingException: No serializer found for class
org.apache.parquet.schema.LogicalTypeAnnotation$StringLogicalTypeAnnotation and
no properties discovered to create BeanSerializer (to avoid exception, disable
SerializationConfig.Feature.FAIL_ON_EMPTY_BEANS) ) (through reference chain:
org.apache.parquet.hadoop.metadata.ParquetMetadata["fileMetaData"]->org.apache.parquet.hadoop.metadata.FileMetaData["schema"]->org.apache.parquet.schema.MessageType["fields"]->java.util.ArrayList[0]->org.apache.parquet.schema.PrimitiveType["logicalTypeAnnotation"])]
The reason is that logical types are no longer represented as an enum but as
classes, and Jackson can't serialize empty classes because the
FAIL_ON_EMPTY_BEANS feature is enabled by default.
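
To make the failure mode concrete, here is a minimal stdlib-only sketch of the "empty bean" check. It is a hypothetical model, not Jackson's implementation: the class EmptyBeanSketch, its nested types, and the simplified property discovery (public instance fields plus public getX() methods) are assumptions made for illustration. It only shows why an enum constant can always be written out while a class with no fields or getters cannot, unless the check is switched off.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

// Hypothetical model of an "empty bean" check; this is NOT Jackson's code.
final class EmptyBeanSketch {

    // Old style: an enum constant always has a name that can be written out.
    enum OldLogicalType { UTF8 }

    // New style: a class with no fields and no getters (nothing to serialize).
    static final class StringAnnotation { }

    // Simplified discovery: public instance fields plus public no-arg getX() methods.
    static int discoverableProperties(Class<?> type) {
        int count = 0;
        for (Field f : type.getFields()) {
            if (!Modifier.isStatic(f.getModifiers())) {
                count++;
            }
        }
        for (Method m : type.getMethods()) {
            boolean getter = m.getName().startsWith("get")
                    && m.getName().length() > 3
                    && m.getParameterCount() == 0
                    && m.getReturnType() != void.class
                    && !Modifier.isStatic(m.getModifiers())
                    && m.getDeclaringClass() != Object.class; // skip getClass()
            if (getter) {
                count++;
            }
        }
        return count;
    }

    // Enums serialize as their name; empty classes fail unless the flag is off.
    static String serialize(Object value, boolean failOnEmptyBeans) {
        if (value instanceof Enum) {
            return "\"" + ((Enum<?>) value).name() + "\"";
        }
        if (discoverableProperties(value.getClass()) == 0) {
            if (failOnEmptyBeans) {
                throw new IllegalStateException(
                        "No properties discovered for " + value.getClass().getName());
            }
            return "{}";
        }
        throw new UnsupportedOperationException("non-empty beans are out of scope here");
    }

    public static void main(String[] args) {
        System.out.println(serialize(OldLogicalType.UTF8, true));
        System.out.println(serialize(new StringAnnotation(), false));
        try {
            serialize(new StringAnnotation(), true);
        } catch (IllegalStateException e) {
            System.out.println("failed: " + e.getMessage());
        }
    }
}
```

In the pull request itself the check is disabled through the objectMapper.configure(...) call shown in the diff above; the sketch only mirrors the effect of that flag.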
> Support for new logical type representation
> -------------------------------------------
>
> Key: PARQUET-1253
> URL: https://issues.apache.org/jira/browse/PARQUET-1253
> Project: Parquet
> Issue Type: Improvement
> Components: parquet-mr
> Reporter: Nandor Kollar
> Assignee: Nandor Kollar
> Priority: Major
>
> Latest parquet-format
> [introduced|https://github.com/apache/parquet-format/commit/863875e0be3237c6aa4ed71733d54c91a51deabe#diff-0f9d1b5347959e15259da7ba8f4b6252]
> a new representation for logical types. This is not yet supported
> in parquet-mr, so there is no way to use parameterized, UTC-normalized
> timestamp data types. When reading and writing Parquet files, parquet-mr
> should use the new 'logicalType' field in SchemaElement, in addition to
> 'converted_type', to determine the current logical type annotation. To maintain
> backward compatibility, the semantics of converted_type shouldn't change.