MaxGekk opened a new pull request, #56638:
URL: https://github.com/apache/spark/pull/56638

   ### What changes were proposed in this pull request?
   
   Extend `findWiderDateTimeType` in `TypeCoercionHelper` to cover the 
nanosecond-capable timestamp types `TIMESTAMP_LTZ(p)` / `TIMESTAMP_NTZ(p)` (`p` 
in `[7, 9]`). The method is generalized into a total function over datetime 
pairs (except `TimeType`, which stays `None`) that widens along two independent 
axes:
   
   - **Time-zone family**: the result is LTZ if either input is LTZ-family 
(`TimestampType` / `TimestampLTZNanosType`), otherwise NTZ. This mirrors the 
existing microsecond precedent where `TIMESTAMP` + `TIMESTAMP_NTZ` widens to 
`TIMESTAMP`. `DATE` is family-neutral and adopts the family of the other side.
   - **Precision**: the maximum of the two precisions, where the micro types 
and `DATE` count as 6 and the nanos types contribute their own `p`.
   
   The resulting `(family, precision)` pair maps back to a concrete type: 
precision 6 yields the micro type (`TimestampType` / `TimestampNTZType`), 
precision in `[7, 9]` yields the nanos type (`TimestampLTZNanosType(p)` / 
`TimestampNTZNanosType(p)`). This reproduces every existing microsecond result 
exactly and adds:
   
   - nanos(p) + micro (same or mixed family) -> nanos at precision p
   - nanos(p1) + nanos(p2) -> nanos at max(p1, p2)
   - nanos(p) + DATE -> nanos(p)
   
   Because both `TypeCoercion.findTightestCommonType` and 
`AnsiTypeCoercion.findTightestCommonType` route `(d1: DatetimeType, d2: 
DatetimeType)` through this single shared method (and `findWiderTypeForTwo` / 
`findWiderCommonType` / the binary-operator coercion path build on it), the 
change applies to `UNION` / `CASE` / `coalesce` / `IN` / binary-comparison 
coercion in both standard and ANSI modes with no further wiring. String 
promotion (nanos + string -> string) already works because the nanos types are 
`AtomicType`.
   
   ### Why are the changes needed?
   
   Umbrella: SPARK-56822 (Timestamps with nanosecond precision). Previously 
`findWiderDateTimeType` was microsecond-only and had no default arm, so any 
pair involving a nanosecond type (micro+nanos or nanos(p1)+nanos(p2)) was 
unhandled and could fail analysis. This brings implicit type coercion and 
widening to parity with the microsecond `TimestampType` / `TimestampNTZType`.
   
   ### Does this PR introduce _any_ user-facing change?
   
   No. The nanosecond-capable timestamp types are gated behind a preview flag 
(`spark.sql.timestampNanosTypes.enabled`) and are not yet released.
   
   ### How was this patch tested?
   
   - Added nanosecond widening cases to `TypeCoercionSuite` and 
`AnsiTypeCoercionSuite` (`findTightestCommonType` and `findWiderTypeForTwo`): 
nanos/nanos, micro/nanos, mixed time-zone families, nanos/date, nanos/time (-> 
`None`), and nanos/string.
   - Added a new end-to-end `TimestampNanosWideningSuiteBase` with ANSI-on and 
ANSI-off subclasses, covering `UNION ALL`, `coalesce`, `IN`, `CASE WHEN`, and 
binary comparisons (`=`, `<`) across micro/nanos and nanos(p1)/nanos(p2) for 
both the LTZ and NTZ families.
   
   ### Was this patch authored or co-authored using generative AI tooling?
   
   Generated-by: Cursor (Claude Opus 4.8)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to