MaxGekk opened a new pull request, #56638: URL: https://github.com/apache/spark/pull/56638
### What changes were proposed in this pull request? Extend `findWiderDateTimeType` in `TypeCoercionHelper` to cover the nanosecond-capable timestamp types `TIMESTAMP_LTZ(p)` / `TIMESTAMP_NTZ(p)` (`p` in `[7, 9]`). The method is generalized into a total function over datetime pairs (except `TimeType`, which stays `None`) that widens along two independent axes: - **Time-zone family**: the result is LTZ if either input is LTZ-family (`TimestampType` / `TimestampLTZNanosType`), otherwise NTZ. This mirrors the existing microsecond precedent where `TIMESTAMP` + `TIMESTAMP_NTZ` widens to `TIMESTAMP`. `DATE` is family-neutral and adopts the family of the other side. - **Precision**: the maximum of the two precisions, where the micro types and `DATE` count as 6 and the nanos types contribute their own `p`. The resulting `(family, precision)` pair maps back to a concrete type: precision 6 yields the micro type (`TimestampType` / `TimestampNTZType`), precision in `[7, 9]` yields the nanos type (`TimestampLTZNanosType(p)` / `TimestampNTZNanosType(p)`). This reproduces every existing microsecond result exactly and adds: - nanos(p) + micro (same or mixed family) -> nanos at precision p - nanos(p1) + nanos(p2) -> nanos at max(p1, p2) - nanos(p) + DATE -> nanos(p) Because both `TypeCoercion.findTightestCommonType` and `AnsiTypeCoercion.findTightestCommonType` route `(d1: DatetimeType, d2: DatetimeType)` through this single shared method (and `findWiderTypeForTwo` / `findWiderCommonType` / the binary-operator coercion path build on it), the change applies to `UNION` / `CASE` / `coalesce` / `IN` / binary-comparison coercion in both standard and ANSI modes with no further wiring. String promotion (nanos + string -> string) already works because the nanos types are `AtomicType`. ### Why are the changes needed? Umbrella: SPARK-56822 (Timestamps with nanosecond precision). Previously `findWiderDateTimeType` was microsecond-only and had no default arm, so any pair involving a nanosecond type (micro+nanos or nanos(p1)+nanos(p2)) was unhandled and could fail analysis. This brings implicit type coercion and widening to parity with the microsecond `TimestampType` / `TimestampNTZType`. ### Does this PR introduce _any_ user-facing change? No. The nanosecond-capable timestamp types are gated behind a preview flag (`spark.sql.timestampNanosTypes.enabled`) and are not yet released. ### How was this patch tested? - Added nanosecond widening cases to `TypeCoercionSuite` and `AnsiTypeCoercionSuite` (`findTightestCommonType` and `findWiderTypeForTwo`): nanos/nanos, micro/nanos, mixed time-zone families, nanos/date, nanos/time (-> `None`), and nanos/string. - Added a new end-to-end `TimestampNanosWideningSuiteBase` with ANSI-on and ANSI-off subclasses, covering `UNION ALL`, `coalesce`, `IN`, `CASE WHEN`, and binary comparisons (`=`, `<`) across micro/nanos and nanos(p1)/nanos(p2) for both the LTZ and NTZ families. ### Was this patch authored or co-authored using generative AI tooling? Generated-by: Cursor (Claude Opus 4.8) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
