[GitHub] [spark] MaxGekk opened a new pull request #34973: [WIP][SPARK-37705][SQL] Write the session time zone in Parquet file metadata

GitBox Tue, 21 Dec 2021 05:16:40 -0800


MaxGekk opened a new pull request #34973:
URL: https://github.com/apache/spark/pull/34973



   ### What changes were proposed in this pull request?
   In the PR, I propose to add a new metadata key, `org.apache.spark.timeZone`, 
which holds the session time zone and which Spark writes to Parquet metadata 
when it rebases timestamps or dates.
   
   ### Why are the changes needed?
   Before the changes, Spark assumes that the writer used the default JVM time 
zone when rebasing dates/timestamps. If a reader and the writer have 
different JVM time zone settings, the reader cannot load such columns 
correctly. After the changes, the reader has full information about the 
writer's settings.
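
As a rough illustration of the reader-side idea (not the code in this PR), the sketch below picks the time zone to use for rebasing from a file's key-value metadata. The key name comes from the description above. The helper `resolve_rebase_zone`, the plain-dict representation of the metadata, and the fallback to the reader's JVM default for files without the key are assumptions made for the example.

```python
from typing import Mapping

# Key name taken from the PR description.
SPARK_TIMEZONE_KEY = "org.apache.spark.timeZone"


def resolve_rebase_zone(file_metadata: Mapping[str, str], jvm_default_zone: str) -> str:
    """Hypothetical helper: choose the time zone ID to use when rebasing.

    If the writer recorded its time zone in the file metadata, use it.
    Otherwise fall back to the reader's JVM default (assumed here to
    mirror the behavior described for files written before the change).
    """
    zone = file_metadata.get(SPARK_TIMEZONE_KEY)
    return zone if zone else jvm_default_zone


# A file written with the new key, and one written without it.
new_file = {SPARK_TIMEZONE_KEY: "America/Los_Angeles"}
old_file = {}

print(resolve_rebase_zone(new_file, "Europe/Moscow"))
print(resolve_rebase_zone(old_file, "Europe/Moscow"))
```

With the key present, the reader no longer depends on its own JVM setting matching the writer's; without it, the reader can only guess.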
   
   ### Does this PR introduce _any_ user-facing change?
   No in the default case, but behavior can differ when the JVM time zone is 
different from the session time zone.
   
   ### How was this patch tested?
   By running new tests:
   ```
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


