MaxGekk opened a new pull request, #56354: URL: https://github.com/apache/spark/pull/56354
### What changes were proposed in this pull request?

This PR adds explicit casts between nanosecond-precision timestamp types and their microsecond-precision counterparts, in both the interpreted and codegen paths:

- `TIMESTAMP_NTZ` <-> `TIMESTAMP_NTZ(p)`
- `TIMESTAMP_LTZ` <-> `TIMESTAMP_LTZ(p)`

(`p` in `[7, 9]`)

Both directions stay within a single zone family, so they are pure representation conversions with no timezone involvement:

- **Widening** (micros -> nanos): `TimestampNanosVal.fromParts(micros, 0)`. Lossless and independent of the target precision `p` (the sub-microsecond part is always 0).
- **Narrowing** (nanos -> micros): takes `epochMicros`, dropping the sub-microsecond digits. Truncation toward the past (floor), consistent with how microsecond timestamps are already produced. Silent in both ANSI and non-ANSI modes, matching Spark's existing silent fractional-second truncation for timestamp casts.

Implementation:

- Registered the four pairs in `Cast.canCast` / `Cast.canAnsiCast`.
- Added interpreted cases in `castToTimestamp` / `castToTimestampNTZ` (narrowing) and `castToTimestampLTZNanos` / `castToTimestampNTZNanos` (widening), and mirrored them in the corresponding codegen helpers.
- No new `Cast.needsTimeZone` entries are required.

The preview flag `spark.sql.timestampNanosTypes.enabled` continues to gate the nanosecond-typed side.

Out of scope: precision-to-precision casts within the nanosecond family (`TIMESTAMP_NTZ(p1)` -> `TIMESTAMP_NTZ(p2)`), cross-family casts (`TIMESTAMP_LTZ(p)` <-> `TIMESTAMP_NTZ(p)`), and implicit/up-cast/store-assignment coercion. These casts remain explicit-only, consistent with the existing string<->nanos casts.

This is a subtask of [SPARK-56822](https://issues.apache.org/jira/browse/SPARK-56822) (SPIP: Timestamps with nanosecond precision).

### Why are the changes needed?
Nanosecond-precision timestamp types currently support parsing from strings (SPARK-57211) and rendering to strings (SPARK-57256), but there is no cast between a nanosecond-precision type and its microsecond-precision counterpart. As a result, values cannot move between `TIMESTAMP_NTZ(p)` and `TIMESTAMP_NTZ`, or between `TIMESTAMP_LTZ(p)` and `TIMESTAMP_LTZ`. This PR fills that gap.

### Does this PR introduce _any_ user-facing change?

Yes, but only behind the preview flag `spark.sql.timestampNanosTypes.enabled`. When enabled, users can now explicitly cast between the four new pairs, e.g.:

```sql
SELECT cast(cast('2020-01-01 00:00:00.123456789' as timestamp_ntz(9)) as timestamp_ntz);
-- 2020-01-01 00:00:00.123456
```

The nanosecond-precision types are an unreleased, preview feature, so this is not a change compared to any released Spark version.

### How was this patch tested?

- Added unit tests in `CastSuiteBase` covering widening, narrowing/truncation (with a non-zero sub-microsecond part to prove flooring), round-trip, and null inputs for both NTZ and LTZ families. These run under `CastWithAnsiOnSuite` / `CastWithAnsiOffSuite` (ANSI on/off) and exercise both the interpreted and codegen paths.
- Extended the existing `null cast` test to cover the four new pairs.
- Added end-to-end golden coverage in `cast.sql` and regenerated the four `cast.sql.out` golden files.
- Verified style with scalastyle.

### Was this patch authored or co-authored using generative AI tooling?

Generated-by: Cursor 2.0 (Claude Opus 4.8)

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
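
Illustrative note on the widening and narrowing rules described under "What changes were proposed in this pull request?": the sketch below models them in plain Python. It is not Spark code. The `NanosTimestamp` tuple and the helpers `from_epoch_nanos`, `widen`, and `narrow` are hypothetical names, and the layout (epoch microseconds plus a 0..999 sub-microsecond remainder) is an assumption inferred from the `fromParts(micros, 0)` and `epochMicros` wording in the description, not a statement about how `TimestampNanosVal` is actually stored.

```python
from typing import NamedTuple

NANOS_PER_MICRO = 1_000


class NanosTimestamp(NamedTuple):
    """Hypothetical stand-in for a nanosecond-precision timestamp value."""
    epoch_micros: int        # whole microseconds since the epoch (floored)
    nanos_within_micro: int  # assumed remainder in the range 0..999


def from_epoch_nanos(epoch_nanos: int) -> NanosTimestamp:
    # divmod floors toward negative infinity for a positive divisor,
    # so the remainder is always in 0..999, also before the epoch.
    micros, sub = divmod(epoch_nanos, NANOS_PER_MICRO)
    return NanosTimestamp(micros, sub)


def widen(epoch_micros: int) -> NanosTimestamp:
    # micros -> nanos: lossless, the sub-microsecond part is always 0.
    return NanosTimestamp(epoch_micros, 0)


def narrow(ts: NanosTimestamp) -> int:
    # nanos -> micros: keep the floored microseconds, drop the remainder.
    return ts.epoch_micros


# 2020-01-01 00:00:00.123456789 UTC expressed as nanoseconds since the epoch.
ts = from_epoch_nanos(1_577_836_800_123_456_789)
print(narrow(ts))                     # 1577836800123456 (.123456, digits 789 dropped)
print(narrow(from_epoch_nanos(-1)))   # -1: one nanosecond before the epoch floors toward the past
print(narrow(widen(42)) == 42)        # True: widening then narrowing round-trips
```

Under this model, flooring rather than truncating toward zero is what keeps pre-epoch values moving toward the past: one nanosecond before the epoch narrows to microsecond `-1`, not `0`.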
