MaxGekk commented on code in PR #56638:
URL: https://github.com/apache/spark/pull/56638#discussion_r3450846846


##########
sql/core/src/test/resources/sql-tests/results/timestamp-ltz-nanos.sql.out:
##########
@@ -854,3 +854,140 @@ SELECT unix_nanos(NULL :: timestamp_ltz(9))
 struct<unix_nanos(CAST(NULL AS TIMESTAMP_LTZ(9))):decimal(21,0)>
 -- !query output
 NULL
+
+
+-- !query
+SELECT typeof(c), c FROM (
+    SELECT TIMESTAMP_LTZ '0001-01-01 00:00:00' AS c
+    UNION ALL SELECT TIMESTAMP_LTZ '9999-12-31 23:59:59.999999999') ORDER BY c
+-- !query schema
+struct<typeof(c):string,c:timestamp_ltz(9)>
+-- !query output
+timestamp_ltz(9)       0001-01-01 00:00:00
+timestamp_ltz(9)       9999-12-31 23:59:59.999999999
+
+
+-- !query
+SELECT typeof(c), c FROM (
+    SELECT '1582-10-04 12:30:45.1234567' :: timestamp_ltz(7) AS c
+    UNION ALL SELECT '1582-10-15 23:59:59.123456789' :: timestamp_ltz(9)) ORDER BY c
+-- !query schema
+struct<typeof(c):string,c:timestamp_ltz(9)>
+-- !query output
+timestamp_ltz(9)       1582-10-04 12:30:45.1234567
+timestamp_ltz(9)       1582-10-15 23:59:59.123456789
+
+
+-- !query
+SELECT typeof(v), v FROM (SELECT coalesce(
+    '1969-12-31 23:59:59.0000001 Asia/Kolkata' :: timestamp_ltz(7),
+    '1969-12-31 23:59:59.999999999 UTC' :: timestamp_ltz(9)) AS v)
+-- !query schema
+struct<typeof(v):string,v:timestamp_ltz(9)>
+-- !query output
+timestamp_ltz(9)       1969-12-31 10:29:59.0000001
+
+
+-- !query
+SELECT typeof(v), v FROM (SELECT CASE WHEN true
+    THEN TIMESTAMP_LTZ '2026-06-21 10:16:30 Asia/Kathmandu'
+    ELSE '2026-06-21 10:16:30.987654321 UTC' :: timestamp_ltz(9) END AS v)
+-- !query schema
+struct<typeof(v):string,v:timestamp_ltz(9)>
+-- !query output
+timestamp_ltz(9)       2026-06-20 21:31:30
+
+
+-- !query
+SELECT typeof(v), v FROM (SELECT coalesce(
+    DATE '0001-01-01', '2020-01-01 00:00:00.12345678' :: timestamp_ltz(8)) AS v)
+-- !query schema
+struct<typeof(v):string,v:timestamp_ltz(8)>
+-- !query output
+timestamp_ltz(8)       0001-01-01 00:00:00
+
+
+-- !query
+SELECT typeof(greatest(TIMESTAMP_LTZ '0001-01-01 00:00:00',
+    '9999-12-31 23:59:59.999999999' :: timestamp_ltz(9)))
+-- !query schema
+struct<typeof(greatest(TIMESTAMP '0001-01-01 00:00:00', CAST(9999-12-31 23:59:59.999999999 AS TIMESTAMP_LTZ(9)))):string>
+-- !query output
+timestamp_ltz(9)
+
+
+-- !query
+SELECT greatest(TIMESTAMP_LTZ '1500-03-01 12:00:00',
+    '1582-10-15 00:00:00.123456789' :: timestamp_ltz(9),
+    TIMESTAMP_LTZ '2026-06-21 10:16:30.5')
+-- !query schema
+struct<greatest(TIMESTAMP '1500-03-01 12:00:00', CAST(1582-10-15 00:00:00.123456789 AS TIMESTAMP_LTZ(9)), TIMESTAMP '2026-06-21 10:16:30.5'):timestamp_ltz(9)>
+-- !query output
+2026-06-21 10:16:30.5
+
+
+-- !query
+SELECT least('1970-01-01 00:00:00.0000001' :: timestamp_ltz(7),
+    '1969-12-31 23:59:59.999999999' :: timestamp_ltz(9))
+-- !query schema
+struct<least(CAST(1970-01-01 00:00:00.0000001 AS TIMESTAMP_LTZ(7)), CAST(1969-12-31 23:59:59.999999999 AS TIMESTAMP_LTZ(9))):timestamp_ltz(9)>
+-- !query output
+1969-12-31 23:59:59.999999999
+
+
+-- !query
+SELECT array('0001-01-01 00:00:00.0000001' :: timestamp_ltz(7),
+    TIMESTAMP_LTZ '2026-06-21 10:16:30 Asia/Kolkata',
+    '9999-12-31 23:59:59.999999999' :: timestamp_ltz(9))
+-- !query schema
+struct<array(CAST(0001-01-01 00:00:00.0000001 AS TIMESTAMP_LTZ(7)), TIMESTAMP '2026-06-20 21:46:30', CAST(9999-12-31 23:59:59.999999999 AS TIMESTAMP_LTZ(9))):array<timestamp_ltz(9)>>
+-- !query output
+[0001-01-01 00:00:00.0000001,2026-06-20 21:46:30,9999-12-31 23:59:59.999999999]
+
+
+-- !query
+SELECT typeof(array(TIMESTAMP_LTZ '9999-12-31 23:59:59',
+    '0001-01-01 00:00:00.000000001' :: timestamp_ltz(9)))
+-- !query schema
+struct<typeof(array(TIMESTAMP '9999-12-31 23:59:59', CAST(0001-01-01 00:00:00.000000001 AS TIMESTAMP_LTZ(9)))):string>
+-- !query output
+array<timestamp_ltz(9)>
+
+
+-- !query
+SELECT map('min', '0001-01-01 00:00:00.000000001' :: timestamp_ltz(9),
+    'max', TIMESTAMP_LTZ '9999-12-31 23:59:59.999999')
+-- !query schema
+struct<map(min, CAST(0001-01-01 00:00:00.000000001 AS TIMESTAMP_LTZ(9)), max, TIMESTAMP '9999-12-31 23:59:59.999999'):map<string,timestamp_ltz(9)>>
+-- !query output
+{"max":9999-12-31 23:59:59.999999,"min":0001-01-01 00:00:00.000000001}
+
+
+-- !query
+SELECT typeof(c) FROM (
+    SELECT TIMESTAMP_NTZ '1582-10-15 00:00:00' AS c

Review Comment:
   Good catch - added a value-pinned mixed-family case in 789297e3296. It uses 
`coalesce(TIMESTAMP_NTZ '2026-06-21 10:16:30.123456789', ... :: 
timestamp_ltz(9))`, where the result is the NTZ branch routed through the 
inserted cross-family cast. The cast reinterprets the wall clock in the session 
zone (America/Los_Angeles) and the result renders back there, so it round-trips 
to `2026-06-21 10:16:30.123456789` with the sub-microsecond digits preserved. A 
UTC misread would render a different instant, so this now locks the 
`sessionLocalTimeZone` wiring. The remaining mixed-family cases stay type-only 
since their values depend on the session zone.
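
The round-trip argument above can be illustrated outside Spark with plain `java.time`. This is only a sketch of the reasoning, not the cast implementation: `RoundTripSketch`, `viaSessionZone`, `viaUtc`, and `render` are hypothetical names introduced here, and the session zone is assumed to be America/Los_Angeles as stated in the comment.

```scala
import java.time.{Instant, LocalDateTime, ZoneId, ZoneOffset}

// Hypothetical illustration; none of these names exist in Spark.
object RoundTripSketch {
  // Assumption: the test session zone, as described in the review comment.
  val sessionZone: ZoneId = ZoneId.of("America/Los_Angeles")

  // The NTZ wall clock used by the value-pinned test case.
  val wallClock: LocalDateTime =
    LocalDateTime.parse("2026-06-21T10:16:30.123456789")

  // Intended wiring: reinterpret the wall clock in the session zone.
  val viaSessionZone: Instant = wallClock.atZone(sessionZone).toInstant

  // The bug the test guards against: reading the wall clock as UTC.
  val viaUtc: Instant = wallClock.toInstant(ZoneOffset.UTC)

  // An LTZ value is rendered back in the session zone.
  def render(i: Instant): LocalDateTime =
    LocalDateTime.ofInstant(i, sessionZone)
}
```

With these definitions, `render(viaSessionZone)` equals `wallClock`, nanosecond digits included, while `render(viaUtc)` comes out seven hours earlier (Los Angeles is on UTC-7 in June), so a pinned expected value separates the two wirings.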



##########
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/AnsiTypeCoercionSuite.scala:
##########
@@ -219,6 +228,25 @@ class AnsiTypeCoercionSuite extends TypeCoercionSuiteBase {
     widenTest(IntegerType, TimestampType, None)
     widenTest(StringType, TimestampType, None)
 
+    // Nanosecond-precision timestamp types (SPARK-57454).

Review Comment:
   You're right, the asymmetry was accidental. Mirrored the three missing cells 
in 789297e3296 - `TimestampNTZNanosType(9) + TimeType(6) -> None`, 
`TimestampLTZNanosType(7) + TimestampNTZType -> TimestampLTZNanosType(7)`, and 
the `nanos(8)` self-pair - so both `findTightestCommonType` impls now cover the 
same matrix. Also added a short note that the block is kept in sync with the 
TypeCoercionSuite one (they share `findWiderDateTimeType`).



##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercionHelper.scala:
##########
@@ -244,14 +247,54 @@ abstract class TypeCoercionHelper {
     (d1, d2) match {
       case (_, _: TimeType) => None
       case (_: TimeType, _) => None
-      case (_: TimestampType, _: DateType) | (_: DateType, _: TimestampType) =>
-        Some(TimestampType)
 
-      case (_: TimestampType, _: TimestampNTZType) | (_: TimestampNTZType, _: TimestampType) =>
-        Some(TimestampType)
-
-      case (_: TimestampNTZType, _: DateType) | (_: DateType, _: TimestampNTZType) =>
-        Some(TimestampNTZType)
+      // The remaining datetime types (DATE and the micro/nanos TIMESTAMP_LTZ / TIMESTAMP_NTZ
+      // families) widen along two independent axes:
+      //   - time-zone family: the result is LTZ if either input is LTZ-family, otherwise NTZ. This
+      //     mirrors the microsecond precedent where TIMESTAMP + TIMESTAMP_NTZ widens to TIMESTAMP.
+      //     DATE is family-neutral and adopts the family of the other side.
+      //   - precision: the maximum of the two precisions, where the micro types and DATE count as 6
+      //     and the nanos types contribute their own precision p in [7, 9].
+      // The (family, precision) pair then maps back to a concrete type: precision 6 yields the
+      // micro type, precision in [7, 9] yields the nanos type.
+      //
+      // Note: this common-type resolution is intentionally more permissive than the nanosecond
+      // conversion rules in Cast.canUpCast / Cast.canANSIStoreAssign, which keep cross-family and
+      // DATE <-> nanos casts explicit-CAST-only while the nanos types are unreleased (SPARK-57323
+      // etc.). Coercion here mirrors the microsecond precedent so that UNION / CASE / coalesce /
+      // IN / comparison resolve a common type the same way they do for the micro families; the
+      // stricter explicit-only stance is deliberately scoped to up-cast and store assignment, not
+      // to common-type resolution.
+      case _ =>
+        def isLtz(d: DatetimeType): Boolean =
+          d.isInstanceOf[TimestampType] || d.isInstanceOf[TimestampLTZNanosType]
+        def isNtz(d: DatetimeType): Boolean =
+          d.isInstanceOf[TimestampNTZType] || d.isInstanceOf[TimestampNTZNanosType]
+        def precisionOf(d: DatetimeType): Int = d match {
+          case t: TimestampLTZNanosType => t.precision
+          case t: TimestampNTZNanosType => t.precision
+          case _ => 6 // DateType / TimestampType / TimestampNTZType

Review Comment:
   Done in 789297e3296 - introduced a local `val MicrosPrecision = 6` and used 
it in both `precisionOf` and the two `p <= 6` checks, with a comment noting 
DATE is treated as the micro precision so DATE <-> micro/nanos widens correctly.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to