uros-b commented on code in PR #56739:
URL: https://github.com/apache/spark/pull/56739#discussion_r3467906042
##########
sql/api/src/main/scala/org/apache/spark/sql/util/ArrowUtils.scala:
##########
@@ -125,6 +130,24 @@ private[sql] object ArrowUtils {
}
}
+ /**
+ * Builds an Arrow field for a nanosecond timestamp type, stashing the column precision in the
+ * field metadata (alongside the user metadata) so it can be recovered in `fromArrowField`.
+ */
+ private def toTimestampNanosArrowField(
Review Comment:
FYI: a concurrently open PR (https://github.com/apache/spark/pull/56334)
adds `def isSupportedByArrow(dt)`, which enumerates the temporal types as
`DateType | TimestampType | TimestampNTZType | _: TimeType` and omits
`TimestampNTZNanosType`/`TimestampLTZNanosType`, so both fall through to
`case _ => false`. That method gates the Arrow cache
(`schema.forall(attr => ArrowUtils.isSupportedByArrow(attr.dataType))`).
Whichever order the two PRs merge in, `isSupportedByArrow` will return false
for nanosecond-timestamp columns, which silently excludes them from Arrow
caching even though this PR wires up full Arrow support: a silent coverage gap.
This PR (or a stated follow-up on
[#56334](https://github.com/apache/spark/pull/56334)) should add the two nanos
cases to `isSupportedByArrow`. This is not a defect in this PR's code, but it
is a real coordination item the author should resolve before or at merge.
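
For illustration only, here is a minimal, self-contained sketch of the kind of
change meant above. It does not use the real Spark classes: the `DataType`
hierarchy is a stand-in, the `precision` constructor parameters and
`OtherType` are assumptions, and `ArrowSupportSketch`/`canUseArrowCache` are
hypothetical names. Only the temporal branch quoted from #56334 is modeled;
the actual method presumably handles other types as well.

```scala
// Stand-in types so the sketch compiles on its own. The real ones live in
// org.apache.spark.sql.types; their constructor shapes are assumed here.
sealed trait DataType
case object DateType extends DataType
case object TimestampType extends DataType
case object TimestampNTZType extends DataType
final case class TimeType(precision: Int) extends DataType
final case class TimestampNTZNanosType(precision: Int) extends DataType
final case class TimestampLTZNanosType(precision: Int) extends DataType
case object OtherType extends DataType // hypothetical unsupported type

object ArrowSupportSketch {
  // Temporal branch as quoted from #56334, plus the two nanos cases.
  def isSupportedByArrow(dt: DataType): Boolean = dt match {
    case DateType | TimestampType | TimestampNTZType | _: TimeType => true
    // Proposed addition: without this case both types hit `case _ => false`.
    case _: TimestampNTZNanosType | _: TimestampLTZNanosType => true
    case _ => false
  }

  // Mirrors the gate described above: every column type must be supported.
  def canUseArrowCache(schema: Seq[DataType]): Boolean =
    schema.forall(isSupportedByArrow)
}
```

With the extra case in place, a schema such as
`Seq(DateType, TimestampNTZNanosType(9))` passes the `forall` gate in this
sketch; without it, the nanos column alone makes the gate return false.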
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]