gengliangwang commented on code in PR #45571:
URL: https://github.com/apache/spark/pull/45571#discussion_r1530949174
##########
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetRowConverter.scala:
##########
@@ -436,6 +436,17 @@ private[parquet] class ParquetRowConverter(
}
}
+      // INT96 timestamp doesn't have a logical type, here we check the physical type instead.
+      case TimestampNTZType if parquetType.asPrimitiveType().getPrimitiveTypeName == INT96 =>
+        new ParquetPrimitiveConverter(updater) {
+          // Converts nanosecond timestamps stored as INT96.
+          // TimestampNTZ type does not require rebasing due to its lack of time zone context.
Review Comment:
   * LTZ doesn't actually store time zone info in Parquet files. Also, Spark
   uses the long value directly when reading NTZ as LTZ. I am trying to keep it
   simple and symmetric.
   * If we shifted by the session time zone here, we would probably need to do
   the same when reading NTZ as LTZ, which would be a breaking change. Also, the
   result of NTZ columns would then depend on the session time zone conf
   `spark.sql.session.timeZone`.
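
   For context, a minimal sketch of how an INT96 Parquet timestamp can be decoded
   into microseconds since the epoch without any time-zone rebasing, which is the
   behavior the comment argues for. The object and helper names here are
   hypothetical illustrations, not Spark's actual `ParquetRowConverter` code; only
   the INT96 layout (8-byte little-endian nanos-of-day followed by a 4-byte
   little-endian Julian day) and the epoch Julian day 2440588 are standard:

   ```scala
   import java.nio.{ByteBuffer, ByteOrder}

   // Hypothetical helper for illustration; not Spark's implementation.
   object Int96Sketch {
     val JulianDayOfEpoch = 2440588L   // Julian day number of 1970-01-01
     val MicrosPerDay     = 86400000000L
     val NanosPerMicro    = 1000L

     // INT96 stores nanos-of-day (8 bytes, little-endian) followed by
     // the Julian day (4 bytes, little-endian). No time-zone shift is
     // applied: the raw value maps directly to a wall-clock instant.
     def int96ToMicros(bytes: Array[Byte]): Long = {
       val buf = ByteBuffer.wrap(bytes).order(ByteOrder.LITTLE_ENDIAN)
       val nanosOfDay = buf.getLong
       val julianDay  = buf.getInt
       (julianDay - JulianDayOfEpoch) * MicrosPerDay + nanosOfDay / NanosPerMicro
     }

     def main(args: Array[String]): Unit = {
       // 1970-01-01T00:00:01 as INT96: Julian day 2440588, 1e9 nanos of day.
       val buf = ByteBuffer.allocate(12).order(ByteOrder.LITTLE_ENDIAN)
       buf.putLong(1000000000L).putInt(2440588)
       println(Int96Sketch.int96ToMicros(buf.array()))  // 1000000
     }
   }
   ```

   Because no session time zone enters the formula, the same bytes always decode
   to the same NTZ value regardless of `spark.sql.session.timeZone`.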
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]