MaxGekk commented on code in PR #56392:
URL: https://github.com/apache/spark/pull/56392#discussion_r3383760202
##########
sql/api/src/main/scala/org/apache/spark/sql/Row.scala:
##########
@@ -632,8 +632,14 @@ trait Row extends Serializable {
if (value == null) {
JNull
} else {
+ // A public Row holds external values (e.g. java.time.LocalTime for
TimeType), so render
+ // through the framework's formatExternal rather than format, which
expects the internal
+ // representation and would otherwise fail the value cast. A framework
type either returns a
+ // rendered string or raises its own error (e.g. the nanosecond
timestamp types raise the
+ // unsupported-rendering error); types outside the framework fall back
to legacy rendering.
TypeApiOps(dataType)
- .map(ops => JString(ops.format(value)))
+ .flatMap(_.formatExternal(value))
Review Comment:
Good catch - fixed in b5f8de1. The framework-off (production-default) path
now renders the external `LocalTime` via a new `case (lt: LocalTime, _:
TimeType) => JString(timeFormatter.format(lt))` in `toJsonDefault`, mirroring
`HiveResult`'s legacy fallback, and I added a framework-off `RowJsonSuite` test
covering plain, fractional, and null TIME values.
On the `Long` case: I kept it for now since it's the internal representation
that non-`Row` callers might still pass, so dropping it felt out of scope here.
Happy to remove it (and the now-dead-under-framework path) in a follow-up if
you'd prefer.
##########
sql/api/src/main/scala/org/apache/spark/sql/types/ops/TimestampNanosTypeApiOps.scala:
##########
@@ -57,6 +57,19 @@ abstract class TimestampNanosTypeApiOps extends TypeApiOps
with DataTypeErrorsBa
override def format(v: Any): String =
throw DataTypeErrors.cannotConvertNanosTimestampToStringError(dataType)
+ // Row JSON (Row.json / Row.prettyJson) holds the external Row value
(java.time.Instant for LTZ,
+ // java.time.LocalDateTime for NTZ), but rendering nanosecond timestamps
from this zone-less path
+ // is not supported yet (same reason as format above), so raise the
user-facing
+ // UNSUPPORTED_FEATURE.TIMESTAMP_NANOS_TO_STRING error here rather than
silently truncating.
+ override def formatExternal(value: Any): Option[String] =
+ throw DataTypeErrors.cannotConvertNanosTimestampToStringError(dataType)
+
+ // The Hive result path (HiveResult.toHiveString) is zone-aware and renders
nanosecond timestamps
+ // through its own default formatter, so return None here to fall through to
it rather than raise
+ // the unsupported-rendering error that the zone-less single-arg
formatExternal above throws. This
+ // is a temporary override until nanos external rendering is unified across
the two paths.
+ override def formatExternal(value: Any, nested: Boolean): Option[String] =
None
Review Comment:
Done in 8b9df74. I updated the `TypeApiOps` trait scaladoc: renamed the
section to "External-Value Formatting", documented both consumers (single-arg =
Row JSON via `Row.json`/`Row.prettyJson`; two-arg = Hive output via
`HiveResult.toHiveString`), and spelled out the `Some`/`None`/throw contract -
including that returning `None` silently routes the type into the legacy
rendering path, and that an implementation may throw to signal an unsupported
zone-less rendering.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]