[PR] [SPARK-57256][SQL] Cast nanosecond-precision timestamps to string [spark]

via GitHub Thu, 04 Jun 2026 01:24:02 -0700


MaxGekk opened a new pull request, #56317:
URL: https://github.com/apache/spark/pull/56317


   ### What changes were proposed in this pull request?
   Implement casting of the nanosecond-precision timestamp types 
`TIMESTAMP_NTZ(p)` (`TimestampNTZNanosType`) and `TIMESTAMP_LTZ(p)` 
(`TimestampLTZNanosType`), `p` in [7, 9], to `STRING`.
   
   Casting is implemented in `ToStringBase` (mixed into `Cast`), so this change 
also fixes `ToPrettyString` (and therefore `Dataset.show()`) for these types 
via the shared base.
   
   The change wires the 
[SPARK-57162](https://issues.apache.org/jira/browse/SPARK-57162) formatter 
methods into the existing cast-to-string paths (interpreted and codegen):
   - `TimestampLTZNanosType(p)` -> `TimestampFormatter.formatNanos(v, p)` 
(renders in the session time zone).
   - `TimestampNTZNanosType(p)` -> 
`TimestampFormatter.formatWithoutTimeZoneNanos(v, p)` (zone-independent, UTC 
wall-clock grid).
   
   The fractional-second precision `p` is taken from the source type; sub-`p` 
digits are floored and trailing zeros are trimmed, consistent with the 
microsecond cast path (both use `FractionTimestampFormatter`).
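
The flooring and trimming rule above can be sketched in a few lines. This is a minimal illustration, not Spark's implementation: `format_fraction` is a hypothetical helper, and it assumes that `nanosWithinMicro` lies in [0, 999] and that a zero fraction is omitted entirely, as in the microsecond cast path.

```python
def format_fraction(epoch_micros: int, nanos_within_micro: int, p: int) -> str:
    """Return the fractional-second suffix ('' or '.ddd') at precision p.

    Hypothetical helper for illustration only; not a Spark API.
    """
    if not 7 <= p <= 9:
        raise ValueError("p must be in [7, 9]")
    if not 0 <= nanos_within_micro <= 999:
        raise ValueError("nanos_within_micro must be in [0, 999]")
    # Python's % is a floor-mod, so pre-epoch values also map to 0..999_999.
    micros_of_second = epoch_micros % 1_000_000
    nanos_of_second = micros_of_second * 1000 + nanos_within_micro
    # Floor away the digits below precision p.
    floored = nanos_of_second // 10 ** (9 - p)
    # Zero-pad to p digits, then trim trailing zeros.
    digits = f"{floored:0{p}d}".rstrip("0")
    return f".{digits}" if digits else ""
```

For example, `format_fraction(123_456, 789, 7)` keeps seven digits and gives `.1234567`, while `format_fraction(100_000, 0, 9)` trims to `.1`.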
   
   `Cast.needsTimeZone` is extended so that `TimestampLTZNanosType -> 
StringType` resolves the session time zone (mirroring `TimestampType -> 
StringType`); the NTZ variant does not need a time zone.
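
The difference between the two variants can be illustrated without Spark. The sketch below is an assumption-laden model, not the actual `Cast` code: `cast_ltz` and `cast_ntz` are hypothetical names, a fixed UTC offset stands in for the session time zone, and only the whole-second part is rendered.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

def wall_clock(epoch_micros: int, zone: timezone) -> str:
    """Render the instant as a wall-clock string in the given zone."""
    instant = EPOCH + timedelta(microseconds=epoch_micros)
    return instant.astimezone(zone).strftime("%Y-%m-%d %H:%M:%S")

def cast_ltz(epoch_micros: int, session_zone: timezone) -> str:
    # LTZ: the stored instant is shifted into the session time zone,
    # which is why this cast needs a time zone.
    return wall_clock(epoch_micros, session_zone)

def cast_ntz(epoch_micros: int) -> str:
    # NTZ: the stored value is read on the UTC grid and never shifted,
    # so no session time zone is required.
    return wall_clock(epoch_micros, timezone.utc)

session = timezone(timedelta(hours=-8))      # stand-in for a non-UTC session zone
value = 1_577_836_800_000_000                # 2020-01-01 00:00:00 UTC in microseconds
print(cast_ltz(value, session))              # 2019-12-31 16:00:00
print(cast_ntz(value))                       # 2020-01-01 00:00:00
```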
   
   ### Why are the changes needed?
   Today `Cast` permits these casts at analysis time (the generic `(_, 
StringType)` rule), but at runtime the nanosecond types have no dedicated case 
in `ToStringBase` and fall through to the default `String.valueOf(...)` branch, 
producing the internal form `TimestampNanosVal(epochMicros, nanosWithinMicro)` 
instead of a proper SQL timestamp string. Producing a correct textual 
representation is a prerequisite for nanosecond support in expressions, 
SHOW/pretty output, and downstream text-based sinks.
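
To see why the raw internal form is not useful as output, it helps to look at how a nanosecond count maps onto the two fields. The sketch below assumes, based only on the field names, that the value equals `epochMicros * 1000 + nanosWithinMicro` with `nanosWithinMicro` in [0, 999]; `split_nanos` is a hypothetical helper, not part of Spark.

```python
def split_nanos(epoch_nanos: int) -> tuple[int, int]:
    """Split nanoseconds since the epoch into (epoch_micros, nanos_within_micro).

    Assumed layout for illustration; floor division keeps the second
    field in 0..999 even for pre-epoch values.
    """
    return divmod(epoch_nanos, 1000)

print(split_nanos(1_000_000_789))   # (1000000, 789)
print(split_nanos(-1))              # (-1, 999)
```

Printing such a pair directly is what the fall-through `String.valueOf(...)` branch effectively does, which is why a dedicated formatter case is needed.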
   
   ### Does this PR introduce _any_ user-facing change?
   User-facing only when `spark.sql.timestampNanosTypes.enabled=true`; these 
types are not available otherwise. Casting to string never fails, so ANSI and 
non-ANSI modes behave identically.
   
   With `spark.sql.timestampNanosTypes.enabled=true`:
   ```sql
   SELECT CAST(ts AS STRING);
   -- TIMESTAMP_NTZ(9) value 2020-01-01 00:00:00.123456789
   --   before: TimestampNanosVal(1577836800123456, 789)
   --   after:  2020-01-01 00:00:00.123456789
   ```
   
   ### How was this patch tested?
   New cases in `CastSuiteBase` (run under both ANSI on/off; `checkEvaluation` 
exercises the interpreted and codegen paths): precision 7/8/9, trailing-zero 
trimming, `nanosWithinMicro` 0 and 999, LTZ time-zone shift under a non-UTC 
session zone vs. NTZ remaining unshifted, pre-epoch and year-9999 boundaries, 
and null input.
   
   ### Was this patch authored or co-authored using generative AI tooling?
   Generated-by: Cursor


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

