jubins opened a new pull request, #56849: URL: https://github.com/apache/spark/pull/56849
### What is the purpose of the change

Fixes SPARK-57738: restores the fast-fail guard for nanosecond-precision timestamp types in `ArrowVectorReader`, which was silently broken by SPARK-57303.

SPARK-57303 updated `UpCastRule.canUpCast` to return `true` for lossless widening within the timestamp family (e.g. `TimestampType -> TimestampLTZNanosType(p)`). As a side effect, the existing unsupported-type guard in `ArrowVectorReader.applyDefault` no longer rejects nanosecond timestamp targets; the SPARK-57303 commit message explicitly flagged this as a known follow-up item.

Without this fix, a request to read a `TIMESTAMP_LTZ(p)` or `TIMESTAMP_NTZ(p)` column (`p` in `[7, 9]`) over Spark Connect silently passes the guard and then crashes with a confusing `"Unsupported Vector Type"` error from the catch-all branch of the `vector match`. With this fix it fails fast with a clear `"not yet supported"` message.

### Brief change log

- `sql/connect/common/src/main/scala/org/apache/spark/sql/connect/client/arrow/ArrowVectorReader.scala`: added `AnyTimestampNanoType` to the import and inserted an explicit rejection guard between the `canUpCast` check and the `vector match` block

### Verifying this change

No existing unit tests cover `ArrowVectorReader` directly.
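
For reviewers, the intended ordering of the checks can be illustrated with a small, self-contained Scala model. This is only a sketch under stated assumptions, not the code in this PR: every type, the simplified `canUpCast`, the `read` function, and the `guardNanos` flag are hypothetical stand-ins written for illustration, and the model does not try to reproduce the actual SPARK-57303 changes to `UpCastRule`. It only assumes a `canUpCast` that accepts nanosecond targets, which is the situation described above.

```scala
import scala.util.Try

// Illustrative model only: none of these definitions are Spark's or Arrow's real classes.
object NanosGuardSketch {
  // Hypothetical stand-ins for the data types named in this PR.
  sealed trait DataType
  case object LongType extends DataType
  case object TimestampType extends DataType
  sealed trait AnyTimestampNanoType extends DataType { def precision: Int }
  final case class TimestampLTZNanosType(precision: Int) extends AnyTimestampNanoType
  final case class TimestampNTZNanosType(precision: Int) extends AnyTimestampNanoType

  // Hypothetical stand-ins for Arrow vectors; the nanosecond vector has no reader.
  sealed trait ArrowVector { def dataType: DataType }
  case object BigIntVector extends ArrowVector { val dataType: DataType = LongType }
  case object TimeStampMicroVector extends ArrowVector { val dataType: DataType = TimestampType }
  final case class TimeStampNanoVector(dataType: AnyTimestampNanoType) extends ArrowVector

  // Assumed, simplified up-cast rule: identity and widening to nanos are both accepted.
  def canUpCast(from: DataType, to: DataType): Boolean = (from, to) match {
    case (a, b) if a == b => true
    case (TimestampType, _: AnyTimestampNanoType) => true
    case _ => false
  }

  // Returns the name of the reader that would be chosen, or throws.
  // guardNanos = false models the behaviour without the fix, true models it with the fix.
  def read(vector: ArrowVector, target: DataType, guardNanos: Boolean): String = {
    // Step 1: the existing up-cast check.
    if (!canUpCast(vector.dataType, target)) {
      throw new RuntimeException(s"Reading $target from ${vector.dataType} is not supported")
    }
    // Step 2: the explicit rejection guard, placed before the vector match.
    if (guardNanos) {
      target match {
        case t: AnyTimestampNanoType =>
          throw new UnsupportedOperationException(s"$t is not yet supported")
        case _ => ()
      }
    }
    // Step 3: the vector match with its catch-all branch.
    vector match {
      case BigIntVector => "BigIntVectorReader"
      case TimeStampMicroVector => "TimeStampMicroVectorReader"
      case _ => throw new RuntimeException("Unsupported Vector Type")
    }
  }

  def main(args: Array[String]): Unit = {
    val nanos = TimestampLTZNanosType(9)
    for (guard <- Seq(false, true)) {
      val outcome = Try(read(TimeStampNanoVector(nanos), nanos, guardNanos = guard))
      println(s"guardNanos=$guard -> $outcome")
    }
  }
}
```

In this model, a nanosecond target with `guardNanos = false` falls through to the catch-all and fails with the generic message, while `guardNanos = true` fails at step 2 with the clearer one. Supported types such as `TimestampType` take the same path in both cases.
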
The fix is a defensive guard on an unsupported code path (nanosecond-precision timestamps are not yet reachable over Connect in any supported workflow), so the primary verification is:

- Manual inspection: the guard fires before the `vector match`, so no nanosecond type can reach the `"Unsupported Vector Type"` catch-all
- The fix will be superseded and removed when Connect nanos support is implemented (the comment in the code points to this)

### Does this pull request potentially affect one of the following parts

- Dependencies (does it add or upgrade a dependency): no
- The public API, i.e., is any changed class annotated with `@Public`/`@Evolving`: no, `ArrowVectorReader` is `private[connect]`
- The serializers: no
- The runtime per-record code paths (performance sensitive): no, the guard only fires for an unsupported type that cannot currently be produced
- Anything that affects deployment or recovery: no
- The S3 file system connector: no

### Documentation

Does this pull request introduce a new feature? No. This is a bug fix restoring a guard that was inadvertently disabled by SPARK-57303.

### Was generative AI tooling used to co-author this PR?

- [x] Yes. Claude Code was used as a pair-programming assistant. All code was written, understood, and verified by the author.

Generated-by: Claude Opus 4.8
