uros-b commented on code in PR #56739:
URL: https://github.com/apache/spark/pull/56739#discussion_r3467906042


##########
sql/api/src/main/scala/org/apache/spark/sql/util/ArrowUtils.scala:
##########
@@ -125,6 +130,24 @@ private[sql] object ArrowUtils {
     }
   }
 
+  /**
+   * Builds an Arrow field for a nanosecond timestamp type, stashing the column precision in the
+   * field metadata (alongside the user metadata) so it can be recovered in `fromArrowField`.
+   */
+  private def toTimestampNanosArrowField(

Review Comment:
   FYI: a concurrently open PR (https://github.com/apache/spark/pull/56334)
   adds `def isSupportedByArrow(dt)`, which enumerates the temporal types as
   `DateType | TimestampType | TimestampNTZType | _: TimeType` and omits
   `TimestampNTZNanosType` / `TimestampLTZNanosType`, so both fall through to
   `case _ => false`. That method gates the Arrow cache:
   `schema.forall(attr => ArrowUtils.isSupportedByArrow(attr.dataType))`.
   
   Whichever order the two PRs merge in, `isSupportedByArrow` will return false
   for nanosecond-timestamp columns, so the `forall` gate will silently exclude
   any schema containing them from Arrow caching even though this PR wires up
   full Arrow support. That is a silent coverage gap.
   
   This PR (or a stated follow-up on
   [#56334](https://github.com/apache/spark/pull/56334)) should add the two
   nanos cases to `isSupportedByArrow`. This is not a defect in this PR's code,
   but it is a real coordination item the author should resolve before or at
   merge.
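   
   To make the gap concrete, here is a minimal, self-contained sketch. It does
   not use Spark's real classes: the `DataType` stand-ins, the `precision`
   fields, the object name `ArrowSupportSketch`, and the helper `cacheable` are
   all hypothetical, and the real `isSupportedByArrow` in #56334 covers more than
   the temporal branch shown here. Only the match arms mirror what is quoted
   above.

```scala
// Stand-in types for illustration only. These are NOT Spark's classes, and
// the shape of the nanos types (a single precision field) is an assumption.
sealed trait DataType
case object DateType extends DataType
case object TimestampType extends DataType
case object TimestampNTZType extends DataType
final case class TimeType(precision: Int) extends DataType
final case class TimestampNTZNanosType(precision: Int) extends DataType
final case class TimestampLTZNanosType(precision: Int) extends DataType

object ArrowSupportSketch {
  // Temporal branch as described for #56334: the nanos types fall to false.
  def isSupportedBefore(dt: DataType): Boolean = dt match {
    case DateType | TimestampType | TimestampNTZType | _: TimeType => true
    case _ => false
  }

  // Same branch with the two nanos cases added, as suggested in this comment.
  def isSupportedAfter(dt: DataType): Boolean = dt match {
    case DateType | TimestampType | TimestampNTZType | _: TimeType => true
    case _: TimestampNTZNanosType | _: TimestampLTZNanosType => true
    case _ => false
  }

  // Hypothetical helper mirroring the quoted gate: every column must pass.
  def cacheable(schema: Seq[DataType], supported: DataType => Boolean): Boolean =
    schema.forall(supported)
}
```

   With a schema such as `Seq(TimestampType, TimestampNTZNanosType(9))` (the
   precision value is illustrative), the first predicate rejects the whole
   schema because of the single nanos column, while the second accepts it.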



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

