tustvold commented on issue #1932:
URL: https://github.com/apache/arrow-rs/issues/1932#issuecomment-1164881745

   Could you expand a bit on what the expected behaviour is? I honestly cannot 
find any comprehensive document on how this is supposed to be handled. It's one 
of the many data model mismatches between arrow and parquet where it isn't 
clearly defined what is "correct" - #1666. 
   
   Ultimately Parquet does not have a native mechanism to encode timezone 
information in its schema, instead opting for something slightly different - 
https://github.com/apache/parquet-format/blob/master/LogicalTypes.md#timestamp. 
The arrow schema is embedded in the parquet file, but as documented in #1663 
this cannot be treated as authoritative.
   
   What I can say is the following:
   
   * The timezone is being stored in the embedded schema
   * As of parquet 15.0.0, in particular 
https://github.com/apache/arrow-rs/pull/1682, parquet-rs roundtrips timezones 
correctly
   * pqrs is on parquet 12.0.0 where timezones did not roundtrip correctly
   * pyarrow appears to ignore the timezone stored within the arrow schema
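
   As a rough illustration of the mismatch described above, the sketch below 
models an arrow timestamp (unit plus optional timezone string) and a parquet 
TIMESTAMP logical type (unit plus an `isAdjustedToUTC` flag). All type and 
function names are hypothetical and are not the arrow-rs or parquet-rs API. The 
mapping itself is an assumption: the writer sets the flag whenever a timezone 
is present, and the reader ignores the embedded arrow schema.

```rust
// Hypothetical, simplified model; NOT the real arrow-rs / parquet-rs types.
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
enum TimeUnit {
    Millis,
    Micros,
    Nanos,
}

// Arrow-style timestamp: carries an optional timezone string.
#[derive(Debug, Clone, PartialEq)]
struct ArrowTimestamp {
    unit: TimeUnit,
    timezone: Option<String>,
}

// Parquet-style TIMESTAMP logical type: only a unit and a UTC flag.
#[derive(Debug, Clone, PartialEq)]
struct ParquetTimestamp {
    unit: TimeUnit,
    is_adjusted_to_utc: bool,
}

// Assumed mapping: any timezone collapses to `is_adjusted_to_utc = true`.
fn to_parquet(ts: &ArrowTimestamp) -> ParquetTimestamp {
    ParquetTimestamp {
        unit: ts.unit.clone(),
        is_adjusted_to_utc: ts.timezone.is_some(),
    }
}

// Without the embedded arrow schema, the best a reader can recover is "UTC".
fn from_parquet(ts: &ParquetTimestamp) -> ArrowTimestamp {
    ArrowTimestamp {
        unit: ts.unit.clone(),
        timezone: if ts.is_adjusted_to_utc {
            Some("UTC".to_string())
        } else {
            None
        },
    }
}

fn main() {
    let original = ArrowTimestamp {
        unit: TimeUnit::Micros,
        timezone: Some("Europe/London".to_string()),
    };
    let restored = from_parquet(&to_parquet(&original));

    // The unit survives the round trip, the original timezone does not.
    assert_eq!(restored.unit, original.unit);
    assert_ne!(restored.timezone, original.timezone);
    println!("{:?} -> {:?}", original, restored);
}
```

   Under these assumptions the original timezone can only be recovered from 
the embedded arrow schema, which is why readers that ignore that schema report 
a different timezone.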
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
