beliefer opened a new pull request #34769:
URL: https://github.com/apache/spark/pull/34769


   ### What changes were proposed in this pull request?
   This PR fixes the issue reported in
   https://github.com/apache/spark/pull/33588#issuecomment-978719988
   
   The root cause is that ORC writes and reads timestamps in the local
   time zone by default, and the local time zone may change between the
   write and the read.
   If the ORC writer writes a timestamp in one local time zone (e.g.
   America/Los_Angeles) and the ORC reader reads it in another local
   time zone (e.g. Europe/Amsterdam), the value read back differs from the
   value that was written.
   
   If the ORC writer writes timestamps in the UTC time zone and the ORC
   reader also reads them in the UTC time zone, the value read back is
   correct.
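   
   The two scenarios above can be sketched with a simplified model. This is
   not how ORC encodes timestamps internally; it only illustrates why a
   zone mismatch shifts the value. The helper names `write_wall_clock` and
   `read_wall_clock` are hypothetical, and fixed UTC offsets stand in for
   America/Los_Angeles and Europe/Amsterdam (winter offsets, DST ignored).

```python
from datetime import datetime, timedelta, timezone

# Assumption: fixed offsets stand in for the real zones (winter time, no DST).
LOS_ANGELES = timezone(timedelta(hours=-8))  # stand-in for America/Los_Angeles
AMSTERDAM = timezone(timedelta(hours=1))     # stand-in for Europe/Amsterdam
UTC = timezone.utc


def write_wall_clock(wall_clock: datetime, zone: timezone) -> int:
    """Hypothetical writer: interpret a zone-less wall-clock time in `zone`
    and store it as seconds since the Unix epoch."""
    return int(wall_clock.replace(tzinfo=zone).timestamp())


def read_wall_clock(epoch_seconds: int, zone: timezone) -> datetime:
    """Hypothetical reader: turn stored epoch seconds back into a zone-less
    wall-clock time as seen in `zone`."""
    return datetime.fromtimestamp(epoch_seconds, tz=zone).replace(tzinfo=None)


original = datetime(2021, 12, 1, 10, 0, 0)

# Writer and reader use different local zones: the value shifts.
stored_local = write_wall_clock(original, LOS_ANGELES)
mismatched = read_wall_clock(stored_local, AMSTERDAM)

# Writer and reader both use UTC: the value survives the round trip.
stored_utc = write_wall_clock(original, UTC)
round_trip_utc = read_wall_clock(stored_utc, UTC)

print("written:        ", original)
print("read (mismatch):", mismatched)
print("read (UTC/UTC): ", round_trip_utc)
```

   In this model the mismatched read is off by the nine-hour difference
   between the two offsets, while the UTC round trip returns the original
   wall-clock value.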
   
   This PR makes ORC write and read timestamps in the UTC time zone by
   calling `useUTCTimestamp(true)` on both readers and writers.
   
   The related ORC source:
   
https://github.com/apache/orc/blob/3f1e57cf1cebe58027c1bd48c09eef4e9717a9e3/java/core/src/java/org/apache/orc/impl/WriterImpl.java#L525
   
   
https://github.com/apache/orc/blob/1f68ac0c7f2ae804b374500dcf1b4d7abe30ffeb/java/core/src/java/org/apache/orc/impl/TreeReaderFactory.java#L1184
   
   Another problem arises when Spark 3.3 or newer reads ORC files written by
   Spark 3.2 or earlier. Because older Spark versions wrote timestamps in the
   local time zone, these files must not be read with the UTC time zone;
   otherwise the timestamp values come back incorrect.
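   
   One way to picture this compatibility rule is a small decision helper.
   This is only a sketch under stated assumptions, not the code in this PR:
   the function names, the `FIRST_UTC_WRITER` cutoff, and the choice to treat
   a missing writer version as a legacy file are all hypothetical.

```python
from typing import Optional, Tuple

# Assumption: Spark 3.3 is the first release that writes ORC timestamps in UTC.
FIRST_UTC_WRITER: Tuple[int, int] = (3, 3)


def parse_major_minor(version: str) -> Tuple[int, int]:
    """Extract (major, minor) from a version string such as '3.2.0'."""
    major, minor = version.split(".")[:2]
    return int(major), int(minor)


def reader_uses_utc(writer_version: Optional[str]) -> bool:
    """Hypothetical rule: read with UTC only if the file was written by a
    Spark version that also wrote with UTC."""
    if writer_version is None:
        # Assumption: an unknown writer is treated like an older Spark
        # release, which wrote timestamps in the local time zone.
        return False
    return parse_major_minor(writer_version) >= FIRST_UTC_WRITER


for version in ("3.2.0", "3.3.0", None):
    print(version, "->", "UTC" if reader_uses_utc(version) else "local time zone")
```

   Comparing the parsed `(major, minor)` tuple rather than the raw string
   keeps a version such as `3.10.1` ordered after `3.3.0`.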
   
   ### Why are the changes needed?
   To fix the ORC timestamp bug described above.
   
   
   ### Does this PR introduce _any_ user-facing change?
   No. ORC timestamp NTZ support is a new feature that has not been released yet.
   
   
   ### How was this patch tested?
   New tests.
   




