stevomitric commented on code in PR #56577:
URL: https://github.com/apache/spark/pull/56577#discussion_r3434950560
##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Cast.scala:
##########
@@ -428,27 +438,29 @@ object Cast extends QueryErrorsBase {
case (_: NumericType, _: NumericType) => true
case (_: AtomicType, _: StringType) => true
case (_: CalendarIntervalType, _: StringType) => true
- // SPARK-57293: narrowing a nanosecond-precision timestamp to its
microsecond counterpart
- // drops the sub-microsecond digits, so it is not allowed as a (silent)
store assignment.
- // This conversion stays explicit-only.
- case (_: TimestampNTZNanosType, TimestampNTZType) => false
- case (_: TimestampLTZNanosType, TimestampType) => false
- // SPARK-57323: DATE <-> nanosecond-precision timestamp requires an
explicit CAST in both
- // directions. nanos -> DATE silently drops time-of-day and
sub-microsecond digits (the same
- // rule as the narrowing above). DATE -> nanos is lossless (midnight, zero
sub-micro part),
- // but conversions involving the nanos types stay explicit-only while the
types are
- // unreleased: allowing the store assignment later is a compatible change,
revoking it is not.
- // Note this is stricter than micro DATE <-> TIMESTAMP[_NTZ], which the
catch-all below allows.
- case (DateType, _: TimestampLTZNanosType) => false
- case (DateType, _: TimestampNTZNanosType) => false
- case (_: TimestampLTZNanosType, DateType) => false
- case (_: TimestampNTZNanosType, DateType) => false
// SPARK-57490: same-family cross-precision nanosecond casts: widening
(e.g. TIMESTAMP_NTZ(7) ->
// TIMESTAMP_NTZ(9)) is lossless and allowed as a silent store assignment,
while narrowing
// (e.g. (9) -> (7)) drops sub-microsecond digits and stays explicit-only.
Equal precision is
- // already handled by the `from == to` short-circuit above.
+ // handled by the `from == to` short-circuit above; micros -> nanos
widening (e.g. TIMESTAMP_NTZ
+ // -> TIMESTAMP_NTZ(9)) is lossless and falls to the catch-all below.
case (f: TimestampNTZNanosType, t: TimestampNTZNanosType) => f.precision
<= t.precision
case (f: TimestampLTZNanosType, t: TimestampLTZNanosType) => f.precision
<= t.precision
+ // SPARK-57323: DATE <-> nanosecond-precision timestamp requires an
explicit CAST in both
+ // directions (nanos -> DATE drops fields; DATE -> nanos is lossless but
kept explicit-only
+ // while the nanos types are unreleased). Stricter than micro DATE <->
TIMESTAMP[_NTZ], which
+ // the catch-all below allows.
+ case (DateType, _: AnyTimestampNanoType) => false
+ case (_: AnyTimestampNanoType, DateType) => false
+ // SPARK-57293/57511: narrowing any nanosecond timestamp to a microsecond
timestamp drops the
+ // sub-microsecond digits, and cross-family casts additionally reinterpret
the value against the
+ // session time zone; both stay explicit-only rather than silent store
assignments while the
+ // nanos types are unreleased. This covers same-family narrowing (nanos ->
micro), cross-family
+ // nanos <-> nanos, and the mixed micro/nanos pairs at the precision-6
boundary. Only the
+ // all-micro TIMESTAMP <-> TIMESTAMP_NTZ pair stays store-assignable via
the catch-all below.
Review Comment:
This contradicts the SPARK-57490 note above (micros -> nanos widening …
falls to the catch-all below). Micros→nanos same-family widening also remains
store-assignable via the catch-all, so "Only the all-micro pair" isn't accurate.
Suggest:
```
// The all-micro TIMESTAMP <-> TIMESTAMP_NTZ pair and micros -> nanos
same-family widening stay
// store-assignable via the catch-all below; everything matched here is
explicit-only.
```
##########
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/CastSuiteBase.scala:
##########
@@ -1398,6 +1457,143 @@ abstract class CastSuiteBase extends SparkFunSuite with
ExpressionEvalHelper {
}
}
+ test("cross-family nanos cast: timestamp_ltz to timestamp_ntz") {
+ // LTZ(p) denotes an absolute instant; LTZ(p) -> NTZ(q) renders it as the
wall-clock local
+ // date-time observed in the session zone (mirroring the micro TIMESTAMP
-> TIMESTAMP_NTZ
+ // conversion on the epoch-micros part) and re-floors the sub-microsecond
digits to q.
+ Seq(UTC, LA).foreach { zone =>
Review Comment:
The LA cases are at non-transition instants. Since the NTZ→LTZ path resolves
DST via LocalDateTime.atZone(zone) (same resolver as micro convertTz), consider
adding a gap case (2020-03-08 02:30 America/Los_Angeles, a non-existent local
time → java.time shifts forward) and a fall-back overlap (2020-11-01 01:30 →
earlier offset) for NTZ↔LTZ, asserting the result matches the corresponding
micro TIMESTAMP_NTZ↔TIMESTAMP cast. Optionally also
assert(!Cast.forceNullable(from, to)) in the contract test, mirroring the
null-safe micro pair.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]