bersprockets commented on pull request #34712:
URL: https://github.com/apache/spark/pull/34712#issuecomment-980471687
Unfortunately, I don't think you can employ useUTCTimestamp=true to fix
TIMESTAMP_NTZ without breaking cross-version compatibility for TIMESTAMP. I
could be wrong.
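For context, here is a minimal sketch of what flipping that option on looks like against the ORC core Java API directly. This is only an illustration, not the code path this PR changes, and the output path is made up:
```scala
import java.sql.Timestamp
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.Path
import org.apache.hadoop.hive.ql.exec.vector.TimestampColumnVector
import org.apache.orc.{OrcFile, TypeDescription}

// Write a single timestamp with ORC's useUTCTimestamp writer option enabled,
// so values are serialized relative to UTC rather than the JVM's local zone.
val conf   = new Configuration()
val schema = TypeDescription.fromString("struct<ts:timestamp>")
val writer = OrcFile.createWriter(
  new Path("/tmp/testdata/ts_utc_demo"),   // made-up path for this sketch
  OrcFile.writerOptions(conf)
    .setSchema(schema)
    .useUTCTimestamp(true))                // the option discussed above

val batch = schema.createRowBatch()
batch.cols(0).asInstanceOf[TimestampColumnVector]
  .set(0, Timestamp.valueOf("2021-06-01 00:00:00"))
batch.size = 1
writer.addRowBatch(batch)
writer.close()
```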
The check for isOldOrcFile works fine, but only in one direction (new Spark
reading files written by old Spark). It does not work in the other direction.
For example:
Write using Spark with this PR (running in a non-UTC timezone):
```
sql("select timestamp '2021-06-01 00:00:00'
ts").write.mode("overwrite").format("orc").save("/tmp/testdata/ts_orc_spark_use_utc")
```
Read using Spark 3.2.0 (running in the same timezone as above):
```
scala> sql("select * from
`orc`.`/tmp/testdata/ts_orc_spark_use_utc`").show(false)
+-------------------+
|ts |
+-------------------+
|2021-05-31 22:00:00|
+-------------------+
scala>
```
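The two-hour shift matches the writer's UTC offset. Here is a self-contained way to see the arithmetic, assuming the non-UTC session zone was something like Europe/Berlin (UTC+2 in June); the actual zone isn't stated above, so treat this as a model of the observed behavior rather than the exact serialization path:
```scala
import java.time._

val zone    = ZoneId.of("Europe/Berlin")              // assumed zone, UTC+2 in June
val literal = LocalDateTime.parse("2021-06-01T00:00") // timestamp '2021-06-01 00:00:00'
val instant = literal.atZone(zone).toInstant          // 2021-05-31T22:00:00Z

// A writer with this PR stores the UTC wall clock of the instant...
val stored = LocalDateTime.ofInstant(instant, ZoneOffset.UTC)

// ...but the 3.2.0 reader treats the stored wall clock as session-local time
// and displays it as-is, so the shift shows up in the query output.
println(stored) // 2021-05-31T22:00 -- the "wrong" value shown above
```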
That's why my [POC
change](https://github.com/apache/spark/compare/master...bersprockets:orc_ntz_issue_play)
did some wacky-looking stuff.
Even if it didn't break compatibility, it would be a behavior change between
minor versions. I would think such a behavior change would need a config to
toggle it.
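Something along these lines, where the config key is purely hypothetical (not something this PR or Spark defines), just to show the shape of the toggle:
```scala
// Hypothetical sketch: "spark.sql.orc.useUTCTimestampForNTZ" is an invented
// key. The idea is that the new encoding would be opt-in instead of an
// unconditional behavior change. Assumes a `spark` session is in scope.
val useUtcForNtz = spark.conf
  .getOption("spark.sql.orc.useUTCTimestampForNTZ")
  .exists(_.toBoolean)

if (useUtcForNtz) {
  // new behavior: write TIMESTAMP_NTZ with useUTCTimestamp=true
} else {
  // legacy behavior: keep the 3.2.0-compatible encoding
}
```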