MaxGekk opened a new pull request, #56392:
URL: https://github.com/apache/spark/pull/56392

   ### What changes were proposed in this pull request?
   
   This PR fixes `Row.json` / `Row.prettyJson` for `TIME` (`TimeType`) columns 
by rendering the row's external value through the Types Framework's 
`formatExternal` instead of the internal-value `format`.
   
   - `Row.jsonValue`'s `toJson` now renders a framework type's value via 
`TypeApiOps(dt).formatExternal(value)`. A framework type without an external 
formatter falls back to `format(value)` (so the nanosecond timestamp types keep 
their existing behavior), and non-framework types fall back to the legacy 
`toJsonDefault`.
   - `TimeTypeApiOps.formatExternal` now formats the external 
`java.time.LocalTime` directly through the time formatter. It previously 
converted to nanos-of-day first; that was an identity round-trip 
(`localTimeToNanos` then `nanosToLocalTime`), so the rendered output is 
unchanged.
   
   Root cause: a public `Row` holds *external* values (e.g. 
`java.time.LocalTime` for `TimeType`), but `toJson` called the internal-value 
`format`, which does `value.asInstanceOf[Long]` and threw `ClassCastException`.
   
   ### Why are the changes needed?
   
   With the Types Framework enabled, serializing a row that has a `TIME` column 
to JSON fails:
   
   ```scala
   import java.time.LocalTime
   import org.apache.spark.sql.catalyst.expressions.GenericRowWithSchema
   import org.apache.spark.sql.types._
   
   val row = new GenericRowWithSchema(
     Array(LocalTime.of(12, 13, 14)),
     new StructType().add("a", TimeType()))
   
   row.json
   ```
   
   throws:
   
   ```
   java.lang.ClassCastException: class java.time.LocalTime cannot be cast to 
class java.lang.Long
   ```
   
   because the JSON path passed the external `LocalTime` to the internal-value 
`format`. The framework already exposes `formatExternal` (the external-value 
entry point, as `HiveResult` uses); Row JSON should use it.
   
   This also closes the coverage gap noted in #56355: `Row.jsonValue` was not 
exercised for framework types.
   
   ### Does this PR introduce _any_ user-facing change?
   
   Yes. With the Types Framework enabled, `Row.json` / `Row.prettyJson` on a 
`TIME` column previously threw `ClassCastException`; it now renders the value, 
e.g.:
   
   ```
   {"a":"12:13:14"}
   ```
   
   The nanosecond timestamp types are unaffected: they have no external 
formatter yet, so Row JSON falls back to `format` and keeps raising 
`UNSUPPORTED_FEATURE.TIMESTAMP_NANOS_TO_STRING`.
   
   ### How was this patch tested?
   
   Added `RowJsonSuite` tests (tagged `SPARK-57338`):
   - `TIME` column rendering: plain, fractional (microsecond resolution, 
trailing zeros trimmed), and midnight.
   - Rendering is independent of the `TimeType` precision.
   - A guard that nanosecond timestamp columns still raise 
`UNSUPPORTED_FEATURE.TIMESTAMP_NANOS_TO_STRING` in Row JSON.
   
   I also confirmed the new `TIME` test fails on the pre-fix code with the 
expected `ClassCastException`, and `./dev/scalastyle` passes.
   
   ### Was this patch authored or co-authored using generative AI tooling?
   
   Generated-by: Cursor (Claude Opus 4.8)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to