nandorKollar commented on issue #23721: [SPARK-26797][SQL][WIP] Start using the 
new logical types API of Parquet 1.11.0 instead of the deprecated one
URL: https://github.com/apache/spark/pull/23721#issuecomment-461043480
 
 
   @squito parquet-mr 1.11.0 writes both the old and the new logical type 
annotations (converted_type and logicalType) in the Thrift schema, so old 
readers (which know only about converted_type) can still read the annotation 
as long as the logicalType has a corresponding converted_type. parquet-mr 
handles this conversion internally. Every legacy converted_type has a 
corresponding logicalType, but since converted_type is deprecated, newly 
introduced logicalTypes might not have a corresponding converted_type (for 
example, timestamp with nanosecond precision has none). In that case old 
readers will just see the physical type.
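
   To make the write path concrete, here is a toy Python model of the behaviour described above. It is only a sketch: it is not parquet-mr's API, the names `SchemaElement`, `write_element`, `old_reader_view` and `LEGACY_CONVERTED_TYPE` are hypothetical, and the mapping is a small illustrative subset of the rules in parquet-format.

```python
from typing import NamedTuple, Optional

# Illustrative subset of the logicalType -> converted_type mapping.
# Assumption: logical types are modelled as plain strings for brevity.
LEGACY_CONVERTED_TYPE = {
    "STRING": "UTF8",
    "DATE": "DATE",
    "TIMESTAMP(MILLIS, isAdjustedToUTC=true)": "TIMESTAMP_MILLIS",
    "TIMESTAMP(MICROS, isAdjustedToUTC=true)": "TIMESTAMP_MICROS",
    # No entry for nanosecond timestamps: no converted_type exists for them.
}


class SchemaElement(NamedTuple):
    """Hypothetical stand-in for one column entry in the file schema."""
    name: str
    physical_type: str
    logical_type: Optional[str]
    converted_type: Optional[str]


def write_element(name: str, physical_type: str, logical_type: str) -> SchemaElement:
    """Model of a new writer: always set logicalType, and also set
    converted_type whenever a legacy equivalent exists."""
    converted = LEGACY_CONVERTED_TYPE.get(logical_type)
    return SchemaElement(name, physical_type, logical_type, converted)


def old_reader_view(element: SchemaElement) -> str:
    """Model of an old reader: it ignores logicalType entirely and falls
    back to the physical type when converted_type is absent."""
    return element.converted_type or element.physical_type


millis = write_element("ts", "INT64", "TIMESTAMP(MILLIS, isAdjustedToUTC=true)")
nanos = write_element("ts_ns", "INT64", "TIMESTAMP(NANOS, isAdjustedToUTC=true)")
print(old_reader_view(millis))
print(old_reader_view(nanos))
```

   In this model the millisecond column is still recognised by the old reader through its converted_type, while the nanosecond column is seen as a plain INT64.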
   
   As for reading old files where the new logical types are not present in the 
schema, only converted_type is taken into account, and parquet-mr takes care of 
the conversion to the logical type representation internally. The conversion 
rules between converted_type and logicalType are documented in 
[parquet-format](https://github.com/apache/parquet-format/blob/master/LogicalTypes.md).
 Does this answer your question?
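
   For the read path, the fallback can be sketched the same way. Again this is a toy Python model under stated assumptions, not parquet-mr code: `resolve_logical_type` and `CONVERTED_TO_LOGICAL` are hypothetical names, and the table lists only a few of the conversions from parquet-format.

```python
from typing import Optional

# Illustrative subset of the converted_type -> logicalType rules.
# Assumption: logical types are modelled as plain strings for brevity.
CONVERTED_TO_LOGICAL = {
    "UTF8": "STRING",
    "DATE": "DATE",
    "TIMESTAMP_MILLIS": "TIMESTAMP(MILLIS, isAdjustedToUTC=true)",
    "TIMESTAMP_MICROS": "TIMESTAMP(MICROS, isAdjustedToUTC=true)",
}


def resolve_logical_type(logical_type: Optional[str],
                         converted_type: Optional[str]) -> Optional[str]:
    """Model of a new reader: prefer logicalType when the file has it,
    otherwise derive it from the legacy converted_type."""
    if logical_type is not None:
        return logical_type
    if converted_type is not None:
        # Old file: only converted_type is present in the schema.
        return CONVERTED_TO_LOGICAL.get(converted_type)
    # No annotation at all: the column is just its physical type.
    return None


# An old file that only carries converted_type.
print(resolve_logical_type(None, "TIMESTAMP_MICROS"))
```

   The caller always works with the logical type representation, whether the file was written by an old or a new writer.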

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services
