MaxGekk opened a new pull request, #56577:
URL: https://github.com/apache/spark/pull/56577

   ### What changes were proposed in this pull request?
   
   This PR adds explicit `CAST` support between the nanosecond-capable 
`TIMESTAMP_LTZ(p)` and `TIMESTAMP_NTZ(q)` types for `p, q in [6, 9]`, i.e.:
   
   - `CAST(<timestamp_ltz(p)> AS TIMESTAMP_NTZ(q))`
   - `CAST(<timestamp_ntz(p)> AS TIMESTAMP_LTZ(q))`
   
   Recall that the parser maps precision `6` to the microsecond family members 
(`TIMESTAMP_LTZ(6)` = `TIMESTAMP`, `TIMESTAMP_NTZ(6)` = `TIMESTAMP_NTZ`), so 
the full matrix is covered by two groups:
   
   - nanos <-> nanos: `TimestampLTZNanosType(p) <-> TimestampNTZNanosType(q)` 
for `p, q in [7, 9]`;
   - the precision-6 boundary, which mixes a micro family member with the other 
family's nanos member: `TIMESTAMP <-> TimestampNTZNanosType(q)` and 
`TIMESTAMP_NTZ <-> TimestampLTZNanosType(p)`.
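
As a rough illustration of how the precision matrix above splits up, the following Python sketch enumerates all `(p, q)` pairs in `[6, 9]` and classifies them. The helper names `family_member` and `classify` are hypothetical and exist only for this example; only the type names come from the description above. The `(6, 6)` cell is the all-micro `TIMESTAMP <-> TIMESTAMP_NTZ` pair that was already supported.

```python
from itertools import product

MICRO_PRECISION = 6  # the parser maps precision 6 to the microsecond types


def family_member(family: str, precision: int) -> str:
    """Name the type that TIMESTAMP_<family>(precision) resolves to (illustrative)."""
    if not 6 <= precision <= 9:
        raise ValueError(f"unsupported precision: {precision}")
    if precision == MICRO_PRECISION:
        return "TIMESTAMP" if family == "LTZ" else "TIMESTAMP_NTZ"
    return f"Timestamp{family}NanosType({precision})"


def classify(p: int, q: int) -> str:
    """Classify an LTZ(p) <-> NTZ(q) pair into the groups described above."""
    if p == MICRO_PRECISION and q == MICRO_PRECISION:
        return "micro <-> micro (already supported)"
    if p == MICRO_PRECISION or q == MICRO_PRECISION:
        return "precision-6 boundary"
    return "nanos <-> nanos"


# Group every (LTZ precision, NTZ precision) cell of the 4x4 matrix.
groups = {}
for p, q in product(range(6, 10), repeat=2):
    pair = (family_member("LTZ", p), family_member("NTZ", q))
    groups.setdefault(classify(p, q), []).append(pair)

for name, pairs in groups.items():
    print(f"{name}: {len(pairs)} pairs")
```

Of the 16 cells, 9 fall in the nanos <-> nanos group, 6 on the precision-6 boundary, and 1 is the existing micro pair.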
   
   Concretely, in `Cast.scala`:
   - `canCast` / `canAnsiCast`: allow the cross-family directions above.
   - `needsTimeZone`: the conversion reinterprets an absolute instant (LTZ) as 
a wall-clock local date-time (NTZ) and vice versa, so it is session-time-zone 
dependent (mirroring micro `TIMESTAMP <-> TIMESTAMP_NTZ`).
   - `canANSIStoreAssign`: the cross-family casts stay explicit-only (not 
silent store assignments) while the nanosecond types are unreleased; the 
all-micro `TIMESTAMP <-> TIMESTAMP_NTZ` pair remains store-assignable.
   - Interpreted and codegen conversion paths for the new source/target 
combinations, reusing the existing `convertTz` micro semantics plus precision 
flooring for the sub-microsecond part.
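
To make "precision flooring" concrete, here is a minimal Python sketch. It assumes, purely for illustration, that a timestamp is a single integer count of nanoseconds since the epoch; the actual internal representation in Spark may differ, and `floor_to_precision` is a hypothetical name, not a function from `Cast.scala`.

```python
def floor_to_precision(epoch_nanos: int, precision: int) -> int:
    """Floor a nanosecond count to `precision` fractional-second digits.

    Assumption: the value is a plain integer number of nanoseconds since
    the epoch. Flooring (rather than truncating toward zero) matters for
    pre-epoch values, which are negative.
    """
    if not 6 <= precision <= 9:
        raise ValueError(f"unsupported precision: {precision}")
    step = 10 ** (9 - precision)  # nanoseconds per unit at the target precision
    # Python's // rounds toward negative infinity, i.e. it floors.
    return (epoch_nanos // step) * step


# 2020-01-01 00:00:00.123456789 UTC narrowed to precision 7
narrowed = floor_to_precision(1_577_836_800_123_456_789, 7)

# One nanosecond before the epoch, narrowed to microseconds
pre_epoch = floor_to_precision(-1, 6)
```

Here `narrowed` is `1_577_836_800_123_456_700`, and `pre_epoch` is `-1000` (one microsecond before the epoch), where truncation toward zero would have produced `0`.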
   
   `SparkDateTimeUtils.scala` gains two small helpers, 
`timestampLTZNanosToNTZNanos` and `timestampNTZNanosToLTZNanos`, that compose 
the existing public conversion utilities (`timestampNanosToInstant`, 
`localDateTimeToTimestampNanos`, etc.).
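
The shape of the two conversions can be sketched in Python as follows. This is not the Scala implementation: the function names are hypothetical stand-ins, and values are assumed to be integer nanosecond counts (since the epoch for LTZ, since `1970-01-01 00:00:00` local time for NTZ). Handling of daylight-saving gaps and overlaps follows Python's `datetime` defaults here and may differ from Spark's.

```python
from datetime import datetime, timedelta, timezone, tzinfo

NANOS_PER_SECOND = 1_000_000_000
_EPOCH_UTC = datetime(1970, 1, 1, tzinfo=timezone.utc)
_EPOCH_LOCAL = datetime(1970, 1, 1)  # naive wall-clock epoch


def _offset_seconds(dt: datetime) -> int:
    """UTC offset of an aware datetime, in whole seconds."""
    return dt.utcoffset() // timedelta(seconds=1)


def ltz_nanos_to_ntz_nanos(epoch_nanos: int, tz: tzinfo) -> int:
    """Reinterpret an absolute instant as the wall-clock time seen in `tz`."""
    secs, frac = divmod(epoch_nanos, NANOS_PER_SECOND)  # frac is always >= 0
    instant = _EPOCH_UTC + timedelta(seconds=secs)
    offset = _offset_seconds(instant.astimezone(tz))
    # The sub-second part is unaffected by the zone shift.
    return (secs + offset) * NANOS_PER_SECOND + frac


def ntz_nanos_to_ltz_nanos(local_nanos: int, tz: tzinfo) -> int:
    """Reinterpret a wall-clock time in `tz` as an absolute instant."""
    secs, frac = divmod(local_nanos, NANOS_PER_SECOND)
    wall_clock = (_EPOCH_LOCAL + timedelta(seconds=secs)).replace(tzinfo=tz)
    offset = _offset_seconds(wall_clock)
    return (secs - offset) * NANOS_PER_SECOND + frac
```

Any `tzinfo` works as the session time zone, for example `zoneinfo.ZoneInfo("America/Los_Angeles")` or `timezone.utc`. With UTC both functions are the identity, and for an unambiguous local time the two conversions round-trip.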
   
   Out of scope (unchanged): implicit type coercion / common-type resolution 
(e.g. `CASE`, `UNION` wider-type inference).
   
   ### Why are the changes needed?
   
   Spark already supports same-family cross-precision nanos casts 
(`TIMESTAMP_NTZ(p) -> TIMESTAMP_NTZ(q)`, `TIMESTAMP_LTZ(p) -> 
TIMESTAMP_LTZ(q)`) and the microsecond `TIMESTAMP <-> TIMESTAMP_NTZ` casts, but 
direct cross-family casts between `TIMESTAMP_LTZ(p)` and `TIMESTAMP_NTZ(q)` 
were not supported when `p` or `q` is in `[7, 9]`. This closes that gap so 
explicit `CAST(...)` works for every precision pair in `[6, 9]` across both 
families. This is a sub-task of SPARK-56822 
(SPIP: Timestamps with nanosecond precision).
   
   ### Does this PR introduce _any_ user-facing change?
   
   Yes, within the unreleased nanosecond-timestamp feature (gated by 
`spark.sql.timestampNanosTypes.enabled`). Explicit casts that previously failed 
type checking, e.g.:
   
   ```sql
   SELECT CAST(TIMESTAMP_LTZ '2020-01-01 00:00:00.123456789' AS TIMESTAMP_NTZ(7));
   SELECT CAST(TIMESTAMP_NTZ '2020-01-01 00:00:00.123456789' AS TIMESTAMP_LTZ(9));
   ```
   
   now succeed, reinterpreting the value against the session time zone and 
flooring the fractional second to the target precision. There is no change to 
already-released types.
   
   ### How was this patch tested?
   
   - Added catalyst unit tests in `CastSuiteBase` (run with ANSI mode both on and off): 
admissibility / store-assignment / up-cast / `needsTimeZone` contracts 
(nanos<->nanos and the micro boundary); value tests across `p, q in [7, 9]` in 
UTC and `America/Los_Angeles` (widening keeps a zero sub-microsecond part, 
narrowing floors, plus pre-epoch and null cases); a round-trip test; and a test 
of casts between the micro family member (precision 6) and the nanos types.
   - Extended `cast.sql` golden coverage (type resolution, lossless same-zone 
round-trips, narrowing truncation, null propagation, and the precision-6 
boundary) and regenerated the result/analyzer goldens, including the non-ANSI 
variants.
   - `./dev/scalastyle` passes.
   
   ### Was this patch authored or co-authored using generative AI tooling?
   
   Generated-by: Cursor


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

