cloud-fan commented on a change in pull request #34741: URL: https://github.com/apache/spark/pull/34741#discussion_r758910943
##########
File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/orc/OrcUtils.scala
##########
@@ -531,13 +533,16 @@ object OrcUtils extends Logging {
   }
 
   def fromOrcNTZ(ts: Timestamp): Long = {
-    DateTimeUtils.millisToMicros(ts.getTime) +
+    val utcMicros = DateTimeUtils.millisToMicros(ts.getTime) +
       (ts.getNanos / NANOS_PER_MICROS) % MICROS_PER_MILLIS
+    val micros = DateTimeUtils.fromUTCTime(utcMicros, TimeZone.getDefault.getID)
+    micros
   }
 
   def toOrcNTZ(micros: Long): OrcTimestamp = {
-    val seconds = Math.floorDiv(micros, MICROS_PER_SECOND)
-    val nanos = (micros - seconds * MICROS_PER_SECOND) * NANOS_PER_MICROS
+    val utcMicros = DateTimeUtils.toUTCTime(micros, TimeZone.getDefault.getID)

Review comment:
I'm trying to understand this issue better. From the ORC source code, it seems that:
1. The ORC writer shifts the timestamp value w.r.t. the JVM local timezone and records that timezone in the file footer.
2. The ORC reader shifts the timestamp value w.r.t. both the JVM local timezone and the recorded writer timezone.

So it seems we only need to change the ORC reader to shift the timestamp value by the writer timezone?

-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

---------------------------------------------------------------------
For additional commands, e-mail: reviews-h...@spark.apache.org
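For illustration only, the kind of timezone shift discussed in the review comment above can be sketched with plain `java.time`. The helpers `shiftToUtc` and `shiftFromUtc` are hypothetical stand-ins, assumed to behave roughly like SQL `to_utc_timestamp` / `from_utc_timestamp`; they are not copied from Spark's `DateTimeUtils` or from the ORC source, and the sketch does not claim to reproduce what the ORC writer or reader actually does. It only shows that shifting with one zone on write and a different zone on read leaves a residual equal to the difference of the two zone offsets, which is why a timezone recorded at write time can matter on the read side.

```scala
import java.time.{Instant, LocalDateTime, ZoneId, ZoneOffset}
import java.time.temporal.ChronoUnit

object TimestampShiftSketch {
  // Convert between microseconds since the epoch and java.time.Instant.
  def toInstant(micros: Long): Instant =
    Instant.EPOCH.plus(micros, ChronoUnit.MICROS)

  def toMicros(instant: Instant): Long =
    ChronoUnit.MICROS.between(Instant.EPOCH, instant)

  // Hypothetical "to UTC" shift (an assumption, not Spark's implementation):
  // read `micros` as a UTC wall clock and return the instant that shows the
  // same wall clock in `zone`.
  def shiftToUtc(micros: Long, zone: ZoneId): Long = {
    val wallClock = LocalDateTime.ofInstant(toInstant(micros), ZoneOffset.UTC)
    toMicros(wallClock.atZone(zone).toInstant)
  }

  // Hypothetical "from UTC" shift (the inverse of the above): take the wall
  // clock that `micros` shows in `zone` and re-encode it as a UTC wall clock.
  def shiftFromUtc(micros: Long, zone: ZoneId): Long = {
    val wallClock = LocalDateTime.ofInstant(toInstant(micros), zone)
    toMicros(wallClock.toInstant(ZoneOffset.UTC))
  }

  def main(args: Array[String]): Unit = {
    val writerZone = ZoneId.of("America/Los_Angeles")
    val readerZone = ZoneId.of("Asia/Shanghai")
    val micros = toMicros(Instant.parse("2021-01-01T12:00:00Z"))
    val microsPerHour = 3600L * 1000000L

    // Same zone on both sides: the two shifts cancel out.
    val sameZone = shiftFromUtc(shiftToUtc(micros, writerZone), writerZone)
    println(s"same-zone drift (hours): ${(sameZone - micros) / microsPerHour}")

    // Different zones: the residual is the difference of the zone offsets.
    val crossZone = shiftFromUtc(shiftToUtc(micros, writerZone), readerZone)
    println(s"cross-zone drift (hours): ${(crossZone - micros) / microsPerHour}")
  }
}
```

With the zones and date chosen here (UTC-8 and UTC+8 in January 2021), the same-zone drift is 0 hours and the cross-zone drift is 16 hours.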