nandorKollar commented on issue #23721: [SPARK-26797][SQL][WIP] Start using the new logical types API of Parquet 1.11.0 instead of the deprecated one
URL: https://github.com/apache/spark/pull/23721#issuecomment-461043480

@squito parquet-mr 1.11.0 writes both the old and the new logical types (converted_type and logicalType) to the Thrift schema, so old readers (which know only about converted_type) can still read the annotation as long as the logicalType has a corresponding converted_type. Parquet-mr handles this conversion internally.

Every legacy converted_type has a corresponding logicalType, but since converted_type is deprecated, newly introduced logicalTypes might not have a corresponding converted_type (for example, timestamp with nanosecond precision has none). In that case old readers will see only the physical type.

As for reading old files, where the new logical types are not present in the schema: only converted_type is taken into account, and parquet-mr converts it to the logical type representation internally. The conversion rules between converted_type and logicalType are documented in [parquet-format](https://github.com/apache/parquet-format/blob/master/LogicalTypes.md).

Does this answer your question?
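
To make the fallback behaviour described above concrete, here is a minimal Python sketch. It is only a model of the rules, not parquet-mr code: `SchemaElement`, `resolve_new_reader`, `resolve_old_reader` and the partial `CONVERTED_TO_LOGICAL` table are hypothetical names introduced for illustration, and the type labels are plain strings rather than real Thrift structures.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical, partial mapping from legacy converted_type to the new
# logicalType, written as plain string labels for illustration only.
CONVERTED_TO_LOGICAL = {
    "UTF8": "STRING",
    "TIMESTAMP_MILLIS": "TIMESTAMP(MILLIS, isAdjustedToUTC=true)",
    "TIMESTAMP_MICROS": "TIMESTAMP(MICROS, isAdjustedToUTC=true)",
}


@dataclass
class SchemaElement:
    """Simplified stand-in for one column entry in the file schema."""
    physical_type: str
    converted_type: Optional[str] = None  # legacy annotation
    logical_type: Optional[str] = None    # new annotation


def resolve_new_reader(el: SchemaElement) -> str:
    """A reader that understands both annotations."""
    if el.logical_type is not None:
        return el.logical_type
    if el.converted_type is not None:
        # Old file: only converted_type is present, so convert it.
        return CONVERTED_TO_LOGICAL.get(el.converted_type, el.converted_type)
    return el.physical_type


def resolve_old_reader(el: SchemaElement) -> str:
    """A reader that knows only converted_type."""
    if el.converted_type is not None:
        return el.converted_type
    # No legacy annotation was written: only the physical type is visible.
    return el.physical_type


# Nanosecond timestamps have no converted_type, so only logicalType is set.
nanos = SchemaElement("INT64", None, "TIMESTAMP(NANOS, isAdjustedToUTC=true)")
# A file from an older writer carries only the legacy annotation.
legacy = SchemaElement("BYTE_ARRAY", "UTF8")
```

With these definitions, `resolve_old_reader(nanos)` yields the bare physical type `INT64`, while `resolve_new_reader(legacy)` maps the legacy `UTF8` annotation to `STRING`.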
