Max Gekk created SPARK-57293:
--------------------------------
Summary: Cast between nanosecond-precision and microsecond-precision timestamp types
Key: SPARK-57293
URL: https://issues.apache.org/jira/browse/SPARK-57293
Project: Spark
Issue Type: Sub-task
Components: SQL
Affects Versions: 5.0.0
Reporter: Max Gekk
h3. Background
Nanosecond-precision timestamp types ({{TIMESTAMP_NTZ(p)}} /
{{TIMESTAMP_LTZ(p)}}, with {{p}} in [7, 9], backed by {{TimestampNanosVal}})
currently support parsing from strings (SPARK-57211) and rendering to strings
(SPARK-57256). There is no cast between a nanosecond-precision type and its
microsecond-precision counterpart, so values cannot move between
{{TIMESTAMP_NTZ(p)}} and {{TIMESTAMP_NTZ}}, or between {{TIMESTAMP_LTZ(p)}} and
{{TIMESTAMP_LTZ}}.
h3. Goal
Support explicit casts for the four directed pairs (two type pairs, each in
both directions), in both the interpreted and codegen paths:
* {{TIMESTAMP_NTZ}} <-> {{TIMESTAMP_NTZ(p)}}
* {{TIMESTAMP_LTZ}} <-> {{TIMESTAMP_LTZ(p)}}
h3. Semantics
Both directions stay within a single zone family, so they are pure
representation conversions with no timezone involvement:
* Widening (micros -> nanos): {{nanosWithinMicro}} is set to 0. The cast is
lossless and independent of the target precision {{p}}.
* Narrowing (nanos -> micros): take {{epochMicros}} and drop the
sub-microsecond digits. This truncates toward the past (floor), consistent with
how microsecond timestamps are already produced. The truncation is silent in
both ANSI and non-ANSI modes, matching Spark's existing silent
fractional-second truncation for timestamp casts.
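The two rules above can be illustrated with a minimal Java sketch. It is not the Spark implementation: {{NanosTs}} is a hypothetical stand-in for {{TimestampNanosVal}}, and only the field names {{epochMicros}} and {{nanosWithinMicro}} come from this ticket. The 0..999 range for {{nanosWithinMicro}} and the helper {{ofEpochNanos}} are assumptions made for the example.

```java
// Sketch only: NanosTs is a hypothetical stand-in for TimestampNanosVal.
final class NanosCastSketch {

    static final class NanosTs {
        final long epochMicros;      // microseconds since the epoch
        final int nanosWithinMicro;  // assumed range: 0..999

        NanosTs(long epochMicros, int nanosWithinMicro) {
            if (nanosWithinMicro < 0 || nanosWithinMicro > 999) {
                throw new IllegalArgumentException("nanosWithinMicro out of range");
            }
            this.epochMicros = epochMicros;
            this.nanosWithinMicro = nanosWithinMicro;
        }

        // Example-only helper: split a nanosecond count since the epoch using
        // floor division, so the remainder is never negative.
        static NanosTs ofEpochNanos(long epochNanos) {
            return new NanosTs(
                Math.floorDiv(epochNanos, 1000L),
                (int) Math.floorMod(epochNanos, 1000L));
        }
    }

    // Widening (micros -> nanos): keep the micros, zero the sub-micro part.
    static NanosTs widen(long epochMicros) {
        return new NanosTs(epochMicros, 0);
    }

    // Narrowing (nanos -> micros): keep epochMicros, drop the sub-micro part.
    // With a non-negative nanosWithinMicro this is a floor, not round-to-zero.
    static long narrow(NanosTs ts) {
        return ts.epochMicros;
    }

    public static void main(String[] args) {
        // One nanosecond before the epoch is stored as (-1 micros, 999 nanos).
        NanosTs beforeEpoch = NanosTs.ofEpochNanos(-1L);
        System.out.println(narrow(beforeEpoch)); // prints -1
        System.out.println(narrow(widen(42L)));  // prints 42
    }
}
```

Under these assumptions, narrowing a value one nanosecond before the epoch yields -1 microseconds rather than 0, which is what "toward the past" means here, and widening followed by narrowing returns the original microsecond value.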
h3. Approach
Wire the four pairs in {{Cast}}: register them in {{canCast}}/{{canAnsiCast}},
add interpreted cases to
{{castToTimestamp}}/{{castToTimestampNTZ}}/{{castToTimestampLTZNanos}}/{{castToTimestampNTZNanos}},
and mirror them in the corresponding codegen helpers. No new
{{Cast.needsTimeZone}} entries are required. The preview flag
{{spark.sql.timestampNanosTypes.enabled}} continues to gate the
nanosecond-typed side.
h3. Out of scope
* Precision-to-precision casts within the nanosecond family
({{TIMESTAMP_NTZ(p1)}} -> {{TIMESTAMP_NTZ(p2)}}).
* Cross-family casts ({{TIMESTAMP_LTZ(p)}} <-> {{TIMESTAMP_NTZ(p)}}), which
would require timezone handling.
* Implicit/up-cast and store-assignment coercion; these casts remain
explicit-only, consistent with the existing string<->nanos casts.
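Taken together, the Goal and the first two exclusions above amount to a simple rule: source and target share a zone family, and exactly one side is a nanosecond-precision type. The Java sketch below expresses that rule as a predicate. It is an illustration only, not Spark's {{Cast.canCast}}: the {{Family}} and {{TsType}} types and the {{inScope}} method are hypothetical, and modeling the microsecond types as precision 6 is an assumption of the example. Explicit-only versus implicit coercion is not modeled.

```java
// Sketch only: these types are hypothetical and do not exist in Spark.
final class CastPairSketch {

    enum Family { NTZ, LTZ }

    static final class TsType {
        final Family family;
        final int precision; // 6 = microsecond type (assumed); 7..9 = nanosecond type

        TsType(Family family, int precision) {
            if (precision < 6 || precision > 9) {
                throw new IllegalArgumentException("precision must be in [6, 9]");
            }
            this.family = family;
            this.precision = precision;
        }

        boolean isNanos() {
            return precision >= 7;
        }
    }

    // True only for the casts this ticket adds: same zone family, and exactly
    // one of the two sides is a nanosecond-precision type.
    static boolean inScope(TsType from, TsType to) {
        return from.family == to.family && from.isNanos() != to.isNanos();
    }
}
```

For example, {{inScope}} accepts NTZ(6) -> NTZ(9) and NTZ(9) -> NTZ(6), and rejects NTZ(7) -> NTZ(9) (precision-to-precision) and LTZ(9) -> NTZ(9) (cross-family). It also rejects micros -> micros, which is not part of this ticket.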
h3. Testing
Add coverage in {{CastSuiteBase}} for widening, narrowing/truncation,
round-trip, and null inputs, exercised with ANSI mode on and off and in both
the interpreted and codegen variants. Optionally add end-to-end golden coverage
in {{cast.sql}}.