MaxGekk opened a new pull request, #56317: URL: https://github.com/apache/spark/pull/56317
### What changes were proposed in this pull request?

Implement casting of the nanosecond-precision timestamp types `TIMESTAMP_NTZ(p)` (`TimestampNTZNanosType`) and `TIMESTAMP_LTZ(p)` (`TimestampLTZNanosType`), `p` in [7, 9], to `STRING`. Casting is implemented in `ToStringBase` (mixed into `Cast`), so this change also fixes `ToPrettyString` (and therefore `Dataset.show()`) for these types via the shared base.

The change wires the [SPARK-57162](https://issues.apache.org/jira/browse/SPARK-57162) formatter methods into the existing cast-to-string paths (interpreted and codegen):

- `TimestampLTZNanosType(p)` -> `TimestampFormatter.formatNanos(v, p)` (renders in the session time zone).
- `TimestampNTZNanosType(p)` -> `TimestampFormatter.formatWithoutTimeZoneNanos(v, p)` (zone-independent, UTC wall-clock grid).

The fractional-second precision `p` is taken from the source type; sub-`p` digits are floored and trailing zeros are trimmed, consistent with the microsecond cast path (both use `FractionTimestampFormatter`).

`Cast.needsTimeZone` is extended so that `TimestampLTZNanosType -> StringType` resolves the session time zone (mirroring `TimestampType -> StringType`); the NTZ variant does not need a time zone.

### Why are the changes needed?

Today `Cast` permits these casts at analysis time (the generic `(_, StringType)` rule), but at runtime the nanosecond types have no dedicated case in `ToStringBase` and fall through to the default `String.valueOf(...)` branch, producing the internal form `TimestampNanosVal(epochMicros, nanosWithinMicro)` instead of a proper SQL timestamp string. Producing a correct textual representation is a prerequisite for nanosecond support in expressions, SHOW/pretty output, and downstream text-based sinks.

### Does this PR introduce _any_ user-facing change?

User-facing only when `spark.sql.timestampNanosTypes.enabled=true`; these types are not available otherwise. Casting to string never fails, so ANSI and non-ANSI modes behave identically.

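As a rough illustration of the floor-and-trim rule and the LTZ/NTZ zone handling described above, here is a minimal standalone Python sketch. It is not Spark's `TimestampFormatter`: the helper name `render_nanos`, its argument order, and the fixed UTC-8 offset standing in for a session time zone are all assumptions made for the example. It also assumes that an all-zero fraction drops the decimal point, as the microsecond cast does.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def render_nanos(epoch_micros, nanos_within_micro, precision, zone=timezone.utc):
    """Hypothetical helper: render (epochMicros, nanosWithinMicro) as text.

    Pass zone=timezone.utc for the NTZ case (UTC wall-clock grid) and the
    session zone for the LTZ case.
    """
    if not 7 <= precision <= 9:
        raise ValueError("precision must be in [7, 9]")
    if not 0 <= nanos_within_micro <= 999:
        raise ValueError("nanos_within_micro must be in [0, 999]")

    # divmod floors toward negative infinity, so pre-epoch values still
    # yield a non-negative microsecond-of-second component.
    seconds, micros_of_second = divmod(epoch_micros, 1_000_000)
    nanos_of_second = micros_of_second * 1_000 + nanos_within_micro

    # Floor (do not round) the digits below the declared precision.
    unit = 10 ** (9 - precision)
    floored = nanos_of_second - nanos_of_second % unit

    # Shift the whole-second instant into the target zone, then print the
    # wall-clock fields without any zone suffix.
    wall = (EPOCH + timedelta(seconds=seconds)).astimezone(zone)
    text = wall.replace(tzinfo=None).isoformat(sep=" ", timespec="seconds")

    # Trim trailing zeros; omit the fraction entirely if nothing is left.
    digits = f"{floored:09d}".rstrip("0")
    return f"{text}.{digits}" if digits else text


# Assumed stand-in for a non-UTC session time zone.
session_zone = timezone(timedelta(hours=-8))

print(render_nanos(1_577_836_800_123_456, 789, 9))                     # NTZ-style
print(render_nanos(1_577_836_800_123_456, 789, 9, zone=session_zone))  # LTZ-style
print(render_nanos(1_577_836_800_123_456, 789, 7))                     # floored to 7 digits
```

The same stored value renders as `2020-01-01 00:00:00.123456789` on the UTC grid and as `2019-12-31 16:00:00.123456789` under the assumed UTC-8 session zone, which is the LTZ-shifts, NTZ-does-not distinction the PR describes.
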
With `spark.sql.timestampNanosTypes.enabled=true`:

```sql
SELECT CAST(ts AS STRING);  -- TIMESTAMP_NTZ(9) value 2020-01-01 00:00:00.123456789
-- before: TimestampNanosVal(1577836800123456, 789)
-- after:  2020-01-01 00:00:00.123456789
```

### How was this patch tested?

New cases in `CastSuiteBase` (run under both ANSI on/off; `checkEvaluation` exercises the interpreted and codegen paths):

- precision 7/8/9,
- trailing-zero trimming,
- `nanosWithinMicro` 0 and 999,
- LTZ time-zone shift under a non-UTC session zone vs. NTZ remaining unshifted,
- pre-epoch and year-9999 boundaries,
- null input.

### Was this patch authored or co-authored using generative AI tooling?

Generated-by: Cursor
