uros-b commented on code in PR #56407:
URL: https://github.com/apache/spark/pull/56407#discussion_r3419606175
##########
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetWriteSupport.scala:
##########
@@ -268,6 +281,18 @@ class ParquetWriteSupport extends WriteSupport[InternalRow] with Logging {
// MICROS time unit.
(row: SpecializedGetters, ordinal: Int) =>
recordConsumer.addLong(row.getLong(ordinal))
+ // TIMESTAMP(NANOS) values are always proleptic Gregorian and are exempt from datetime
+ // rebasing; see the TIMESTAMP(NANOS) converters in `ParquetRowConverter` for details.
Review Comment:
A bit of a nit: both converter cases are guarded on the Parquet annotation
being TIMESTAMP(NANOS). If a user supplies an explicit read schema with a nanos
type over a column whose Parquet annotation is not NANOS, both guards fail and
the match falls through to the generic handling.
Let's confirm that this fall-through produces a clear error rather than a
confusing one. Schema clipping should normally prevent the situation, but a
quick check (or an explicit unguarded `case _: TimestampLTZNanosType =>` that
throws a descriptive error) would make the contract explicit.
Please check how other similar types handle this, and let's decide whether we
need to handle it here.
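
To make the suggestion concrete, here is a minimal, self-contained sketch of the guarded-plus-fallback pattern. It does not use the real `ParquetRowConverter` code: `CatalystType`, `ParquetAnnotation`, `ConverterSketch.chooseConverter`, and the returned strings are hypothetical stand-ins, and `TimestampLTZNanosType` is modeled as a plain case object only for illustration. The actual error class and message would need to follow whatever the existing converters use.

```scala
// Hypothetical stand-ins for the Catalyst type and the Parquet logical
// type annotation. These are NOT Spark's or Parquet's real classes.
sealed trait CatalystType
case object TimestampLTZNanosType extends CatalystType
case object LongType extends CatalystType

sealed trait ParquetAnnotation
case object TimestampNanos extends ParquetAnnotation
case object TimestampMicros extends ParquetAnnotation
case object NoAnnotation extends ParquetAnnotation

object ConverterSketch {
  // Returns a label instead of a real converter to keep the sketch small.
  def chooseConverter(
      catalystType: CatalystType,
      annotation: ParquetAnnotation,
      column: String): String = catalystType match {
    // Guarded case: only matches when the file really stores TIMESTAMP(NANOS).
    case TimestampLTZNanosType if annotation == TimestampNanos =>
      "nanos-converter"
    // Unguarded fallback: the requested type is nanos but the annotation is
    // not, so fail with a descriptive message instead of falling through.
    case TimestampLTZNanosType =>
      throw new IllegalArgumentException(
        s"Column '$column': cannot read Parquet annotation $annotation as " +
          s"$catalystType; expected TIMESTAMP(NANOS)")
    // Generic handling for every other type.
    case other =>
      s"generic-converter($other)"
  }
}
```

With this shape, a mismatched read schema fails at converter selection with the column name and both types in the message, rather than surfacing later from the generic path.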