cloud-fan commented on pull request #34741: URL: https://github.com/apache/spark/pull/34741#issuecomment-1001467362
@gengliangwang please read the discussions in the previous PRs. We tried your proposals and none of them worked:

1. ORC stores the timezone per "column chunk", not per file, and we can't read this timezone info in the row-based ORC reader to shift the timestamp values.
2. `useUTCTimestamp` is a global conf. If we set it, we break TIMESTAMP_LTZ.

I don't think there is a better option. Phase 2 is not a breaking change if no one is using a Spark version < 3.3. That may take years, but it's still possible. Until then we are still in good shape; the only problem is other systems reading ORC files written by Spark.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
