MaxGekk opened a new pull request, #56392:
URL: https://github.com/apache/spark/pull/56392
### What changes were proposed in this pull request?
This PR fixes `Row.json` / `Row.prettyJson` for `TIME` (`TimeType`) columns
by rendering the row's external value through the Types Framework's
`formatExternal` instead of the internal-value `format`.
- `Row.jsonValue`'s `toJson` now renders a framework type's value via
`TypeApiOps(dt).formatExternal(value)`. A framework type without an external
formatter falls back to `format(value)` (so the nanosecond timestamp types keep
their existing behavior), and non-framework types fall back to the legacy
`toJsonDefault`.
- `TimeTypeApiOps.formatExternal` now formats the external
`java.time.LocalTime` directly through the time formatter. It previously
converted to nanos-of-day first; that was an identity round-trip
(`localTimeToNanos` then `nanosToLocalTime`), so the rendered output is
unchanged.
Root cause: a public `Row` holds *external* values (e.g.
`java.time.LocalTime` for `TimeType`), but `toJson` called the internal-value
`format`, which does `value.asInstanceOf[Long]` and threw `ClassCastException`.
### Why are the changes needed?
With the Types Framework enabled, serializing a row that has a `TIME` column
to JSON fails:
```scala
import java.time.LocalTime
import org.apache.spark.sql.catalyst.expressions.GenericRowWithSchema
import org.apache.spark.sql.types._
val row = new GenericRowWithSchema(
Array(LocalTime.of(12, 13, 14)),
new StructType().add("a", TimeType()))
row.json
```
throws:
```
java.lang.ClassCastException: class java.time.LocalTime cannot be cast to
class java.lang.Long
```
because the JSON path passed the external `LocalTime` to the internal-value
`format`. The framework already exposes `formatExternal` (the external-value
entry point, as `HiveResult` uses); Row JSON should use it.
This also closes the coverage gap noted in #56355: `Row.jsonValue` was not
exercised for framework types.
### Does this PR introduce _any_ user-facing change?
Yes. With the Types Framework enabled, `Row.json` / `Row.prettyJson` on a
`TIME` column previously threw `ClassCastException`; it now renders the value,
e.g.:
```
{"a":"12:13:14"}
```
The nanosecond timestamp types are unaffected: they have no external
formatter yet, so Row JSON falls back to `format` and keeps raising
`UNSUPPORTED_FEATURE.TIMESTAMP_NANOS_TO_STRING`.
### How was this patch tested?
Added `RowJsonSuite` tests (tagged `SPARK-57338`):
- `TIME` column rendering: plain, fractional (microsecond resolution,
trailing zeros trimmed), and midnight.
- Rendering is independent of the `TimeType` precision.
- A guard that nanosecond timestamp columns still raise
`UNSUPPORTED_FEATURE.TIMESTAMP_NANOS_TO_STRING` in Row JSON.
I also confirmed the new `TIME` test fails on the pre-fix code with the
expected `ClassCastException`, and `./dev/scalastyle` passes.
### Was this patch authored or co-authored using generative AI tooling?
Generated-by: Cursor (Claude Opus 4.8)
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]