MaxGekk commented on code in PR #56392:
URL: https://github.com/apache/spark/pull/56392#discussion_r3383760202


##########
sql/api/src/main/scala/org/apache/spark/sql/Row.scala:
##########
@@ -632,8 +632,14 @@ trait Row extends Serializable {
       if (value == null) {
         JNull
       } else {
+        // A public Row holds external values (e.g. java.time.LocalTime for 
TimeType), so render
+        // through the framework's formatExternal rather than format, which 
expects the internal
+        // representation and would otherwise fail the value cast. A framework 
type either returns a
+        // rendered string or raises its own error (e.g. the nanosecond 
timestamp types raise the
+        // unsupported-rendering error); types outside the framework fall back 
to legacy rendering.
         TypeApiOps(dataType)
-          .map(ops => JString(ops.format(value)))
+          .flatMap(_.formatExternal(value))

Review Comment:
   Good catch - fixed in b5f8de1. The framework-off (production-default) path 
now renders the external `LocalTime` via a new `case (lt: LocalTime, _: 
TimeType) => JString(timeFormatter.format(lt))` in `toJsonDefault`, mirroring 
`HiveResult`'s legacy fallback, and I added a framework-off `RowJsonSuite` test 
covering plain, fractional, and null TIME values.
   
   On the `Long` case: I kept it for now since it's the internal representation 
that non-`Row` callers might still pass, so dropping it felt out of scope here. 
Happy to remove it (and the now-dead-under-framework path) in a follow-up if 
you'd prefer.



##########
sql/api/src/main/scala/org/apache/spark/sql/types/ops/TimestampNanosTypeApiOps.scala:
##########
@@ -57,6 +57,19 @@ abstract class TimestampNanosTypeApiOps extends TypeApiOps 
with DataTypeErrorsBa
   override def format(v: Any): String =
     throw DataTypeErrors.cannotConvertNanosTimestampToStringError(dataType)
 
+  // Row JSON (Row.json / Row.prettyJson) holds the external Row value 
(java.time.Instant for LTZ,
+  // java.time.LocalDateTime for NTZ), but rendering nanosecond timestamps 
from this zone-less path
+  // is not supported yet (same reason as format above), so raise the 
user-facing
+  // UNSUPPORTED_FEATURE.TIMESTAMP_NANOS_TO_STRING error here rather than 
silently truncating.
+  override def formatExternal(value: Any): Option[String] =
+    throw DataTypeErrors.cannotConvertNanosTimestampToStringError(dataType)
+
+  // The Hive result path (HiveResult.toHiveString) is zone-aware and renders 
nanosecond timestamps
+  // through its own default formatter, so return None here to fall through to 
it rather than raise
+  // the unsupported-rendering error that the zone-less single-arg 
formatExternal above throws. This
+  // is a temporary override until nanos external rendering is unified across 
the two paths.
+  override def formatExternal(value: Any, nested: Boolean): Option[String] = 
None

Review Comment:
   Done in 8b9df74. I updated the `TypeApiOps` trait scaladoc: renamed the 
section to "External-Value Formatting", documented both consumers (single-arg = 
Row JSON via `Row.json`/`Row.prettyJson`; two-arg = Hive output via 
`HiveResult.toHiveString`), and spelled out the `Some`/`None`/throw contract - 
including that returning `None` silently routes the type into the legacy 
rendering path, and that an implementation may throw to signal an unsupported 
zone-less rendering.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to