MaxGekk commented on code in PR #56849:
URL: https://github.com/apache/spark/pull/56849#discussion_r3506008185
##########
sql/connect/common/src/main/scala/org/apache/spark/sql/connect/client/arrow/ArrowVectorReader.scala:
##########
@@ -84,6 +84,15 @@ object ArrowVectorReader {
throw new RuntimeException(
s"Reading '$targetDataType' values from a ${vector.getClass} instance
is not supported.")
}
+ // Nanosecond-precision timestamp types (TIMESTAMP_LTZ(p) /
TIMESTAMP_NTZ(p), p in [7,9]) are
+ // not yet supported over Spark Connect: there is no Arrow vector type for
sub-microsecond
+ // timestamps and no reader implementation here. UpCastRule.canUpCast now
returns true for the
+ // micro -> nanos widening direction (SPARK-57303), so the generic guard
above no longer
+ // catches this case. Fail fast with a clear message until Connect nanos
support is added.
+ if (targetDataType.isInstanceOf[AnyTimestampNanoType]) {
Review Comment:
This guard is premised on nanos timestamps being unsupported, but `master`
already maps `TimestampNTZNanosType`/`TimestampLTZNanosType` to/from Arrow
`NANOSECOND` timestamps in `ArrowUtils`, and [SPARK-57160] added the Connect
proto messages for them. It also sits in `applyDefault`, which
`ConnectTypeOps.forDataType(...)` is meant to bypass (the Types Framework flag
was removed in [SPARK-57372]). Prefer adding a `ConnectTypeOps` for these types
with `createArrowVectorReader` — like `TimeTypeConnectOps` does for `TIME` —
rather than a blanket `throw` here that will shadow / go dead as nanos read
support lands.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]