MaxGekk opened a new pull request, #56513:
URL: https://github.com/apache/spark/pull/56513

   ### What changes were proposed in this pull request?
   Unify nanosecond timestamp (`TIMESTAMP_NTZ(p)` / `TIMESTAMP_LTZ(p)`) 
rendering in `HiveResult` onto the Types Framework, removing the inline 
duplicate renderer.
   
   - `TimestampNanosTypeApiOps`: remove the `formatExternal(value, nested) = 
None` override so the Hive path shares each subclass's single-arg 
`formatExternal` renderer (the same one Row JSON uses). `nested` does not 
affect timestamp formatting.
   - `HiveResult.toHiveStringDefault`: remove the inline 
`TimestampLTZNanosType` / `TimestampNTZNanosType` cases. The legacy path no 
longer handles nanos types, so a nanos value that somehow reaches it (possible 
only with the framework off, which the gating forbids) is treated as 
unsupported rather than silently rendered.
   - `TypeApiOps`: update the two-arg `formatExternal` scaladoc to reflect that 
Hive now shares the single-arg renderer.
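
As a rough illustration of the delegation described in the bullets above, the sketch below uses simplified stand-ins rather than the real Spark traits. The trait name `ExternalFormatOpsSketch`, both object names, the `String` / `Option[String]` return types, and the `ts:` rendering are assumptions made for the example; only the method name `formatExternal`, the `nested` parameter, and the `None` opt-out come from this description.

```scala
// Hypothetical, simplified stand-in for an ops trait; not the actual TypeApiOps.
trait ExternalFormatOpsSketch {
  // Single-arg renderer (in this sketch, the one a Row JSON path would call).
  def formatExternal(value: Any): String

  // Two-arg variant for a Hive-style caller. By default it reuses the
  // single-arg renderer, so both paths produce the same text.
  def formatExternal(value: Any, nested: Boolean): Option[String] =
    Some(formatExternal(value))
}

// After the change (sketch): no two-arg override, so the default delegation applies.
object SharedRendererSketch extends ExternalFormatOpsSketch {
  override def formatExternal(value: Any): String = s"ts:$value"
}

// Before the change (sketch): returning None forces the caller to fall back
// to its own inline rendering, which is the duplication this PR removes.
object OptOutRendererSketch extends ExternalFormatOpsSketch {
  override def formatExternal(value: Any): String = s"ts:$value"
  override def formatExternal(value: Any, nested: Boolean): Option[String] = None
}
```

In this sketch, `SharedRendererSketch.formatExternal(1L, nested = true)` yields the same text as the single-arg call regardless of `nested`, while `OptOutRendererSketch` returns `None` and leaves rendering to the caller.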
   
   ### Why are the changes needed?
   The nanosecond timestamp types are a Types Framework feature, implemented 
solely through it. External-value rendering for the framework is centralized in 
`TypeApiOps.formatExternal`, which already backs Row JSON (`Row.json` / 
`Row.prettyJson`).
   
   The nanos ops previously overrode the two-arg `formatExternal(value, 
nested)` to return `None`, so `HiveResult` rendered nanos through inline 
pattern-matching in `toHiveStringDefault`. That duplicated the formatter logic 
and was documented in code as a temporary split "until nanos external rendering 
is unified across the zone-less (Row JSON) and zone-aware (Hive) paths".
   
   The types are gated by `timestampNanosTypesEnabled = 
timestampNanosTypes.enabled && types.framework.enabled`, so a nanos column 
cannot exist while the framework is off; the inline cases are therefore dead 
code in that mode and redundant when the framework is on.
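
The gating argument can be checked with a small truth-table sketch. The case class `NanosGatingSketch` and its field names are hypothetical; only the conjunction itself is taken from the expression quoted above.

```scala
// Hypothetical config holder mirroring the gating expression in the description.
final case class NanosGatingSketch(
    timestampNanosTypesConf: Boolean, // stands in for timestampNanosTypes.enabled
    typesFrameworkConf: Boolean) {    // stands in for types.framework.enabled
  def timestampNanosTypesEnabled: Boolean =
    timestampNanosTypesConf && typesFrameworkConf
}

// Enumerate all four combinations of the two flags.
val allCombinations: Seq[NanosGatingSketch] =
  for {
    nanos <- Seq(false, true)
    framework <- Seq(false, true)
  } yield NanosGatingSketch(nanos, framework)

// Combinations in which nanos types are usable while the framework is off.
val nanosWithoutFramework: Seq[NanosGatingSketch] =
  allCombinations.filter(c => c.timestampNanosTypesEnabled && !c.typesFrameworkConf)
```

Under this model `nanosWithoutFramework` is empty, which is why the inline cases can never be reached with the framework off.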
   
   ### Does this PR introduce _any_ user-facing change?
   No. Nanos Hive output is identical: zone-aware LTZ, zone-independent NTZ, 
precision flooring, and trailing-zero trimming are all unchanged. Both the old 
inline path and the framework path ultimately call the same 
`TimestampFormatter` methods. The pre-flooring in the old inline path was a 
no-op, because values reaching Hive come from `executeCollectPublic()` and are 
already stored floored to the column precision.
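
The no-op claim rests on flooring being idempotent: flooring a value that is already floored to the same precision returns it unchanged. The sketch below illustrates this under assumptions that are not taken from Spark: the value is modeled as a `Long` count of nanoseconds since the epoch, and `floorToPrecision` and `trimTrailingZeros` are hypothetical helpers, not `TimestampFormatter` methods.

```scala
object NanosFlooringSketch {
  // Floor an epoch-nanosecond value to `precision` fractional-second digits.
  // Math.floorMod keeps the result rounding toward negative infinity, so
  // pre-1970 (negative) values are handled the same way as positive ones.
  def floorToPrecision(epochNanos: Long, precision: Int): Long = {
    require(precision >= 0 && precision <= 9, "precision must be in [0, 9]")
    val unit = Seq.fill(9 - precision)(10L).product // 10^(9 - precision)
    epochNanos - Math.floorMod(epochNanos, unit)
  }

  // Drop trailing zeros from a fractional-second digit string.
  def trimTrailingZeros(fraction: String): String =
    fraction.reverse.dropWhile(_ == '0').reverse
}

import NanosFlooringSketch._

// A value stored at precision 7 is already a multiple of 100 ns.
val stored = floorToPrecision(1234567891L, 7)  // 1234567800
// Flooring it again at the same precision changes nothing.
val reFloored = floorToPrecision(stored, 7)    // 1234567800
// A pre-1970 value floors toward negative infinity.
val preEpoch = floorToPrecision(-1L, 7)        // -100
```

Because `reFloored == stored`, a second flooring step before formatting cannot change the rendered output in this model.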
   
   ### How was this patch tested?
   Existing `HiveResultSuite` SPARK-57257 tests cover precision 7/8/9, pre-1970 
epochs, nested arrays/maps/structs, NULLs, and session-zone vs zone-independent 
rendering, and now exercise the framework path. Also ran 
`TimestampNanosTypeOpsSuite` and `RowJsonSuite`. All pass.
   
   ### Was this patch authored or co-authored using generative AI tooling?
   Generated-by: Cursor (Claude Opus 4.8)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

