[ 
https://issues.apache.org/jira/browse/SPARK-57285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated SPARK-57285:
-----------------------------------
    Labels: pull-request-available  (was: )

> Route nanosecond timestamp cast-to-string through the Types Framework in both 
> interpreted and codegen paths
> -----------------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-57285
>                 URL: https://issues.apache.org/jira/browse/SPARK-57285
>             Project: Spark
>          Issue Type: Sub-task
>          Components: SQL
>    Affects Versions: 4.3.0
>            Reporter: Max Gekk
>            Assignee: Max Gekk
>            Priority: Major
>              Labels: pull-request-available
>
> h2. Background
> SPARK-57256 implemented {{CAST(TIMESTAMP_NTZ(p) / TIMESTAMP_LTZ(p) AS 
> STRING)}} for p in [7, 9]. The formatting currently lives in {{ToStringBase}} 
> (alongside the microsecond timestamp types): the interpreted path explicitly 
> bypasses {{TypeApiOps}}, and the codegen path inlines 
> {{TimestampFormatter.formatNanos}} / {{formatWithoutTimeZoneNanos}}. This was 
> done because the Types Framework's {{TypeApiOps.format(v)}} takes no time zone 
> and so cannot render LTZ in the session time zone; for zone-less callers, 
> {{format}} deliberately still raises 
> {{UNSUPPORTED_FEATURE.TIMESTAMP_NANOS_TO_STRING}}.
> This leaves nanosecond cast-to-string as a one-off integration outside the 
> framework, which is inconsistent with the SPIP direction of wiring the new 
> types through the centralized {{TypeOps}} / {{TypeApiOps}} (see SPARK-57101 / 
> SPARK-57207).
> h2. Goal
> Make the Types Framework the single integration point for nanosecond 
> timestamp cast-to-string, for both the interpreted and codegen paths, while 
> producing the same output as SPARK-57256 (zone-aware LTZ, zone-independent 
> NTZ, precision flooring, trailing-zero trimming).
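
The two fraction rules named in the Goal (precision flooring, trailing-zero trimming) can be illustrated with a small language-neutral sketch. This is Python rather than Spark's Scala, and the helper names {{format_fraction}} and {{split_epoch_nanos}} are hypothetical. The exact rules are an assumption read off the issue text, not taken from the SPARK-57256 implementation.

```python
NANOS_PER_SECOND = 1_000_000_000


def format_fraction(nanos_of_second: int, precision: int) -> str:
    """Render the fractional-second suffix of a timestamp string.

    Assumed rules (not confirmed against Spark's implementation):
      1. floor the 9-digit nanosecond fraction to `precision` digits;
      2. trim trailing zeros, and drop the "." if nothing remains.
    """
    if not 0 <= nanos_of_second < NANOS_PER_SECOND:
        raise ValueError("nanos_of_second must be in [0, 1e9)")
    if not 0 <= precision <= 9:
        raise ValueError("precision must be in [0, 9]")
    # The fraction is non-negative, so keeping the leading digits floors it.
    digits = f"{nanos_of_second:09d}"[:precision]
    digits = digits.rstrip("0")
    return f".{digits}" if digits else ""


def split_epoch_nanos(epoch_nanos: int):
    """Split nanoseconds since the epoch into (seconds, nanos_of_second).

    divmod floors toward negative infinity, so the nanosecond part stays
    in [0, 1e9) even for instants before 1970.
    """
    return divmod(epoch_nanos, NANOS_PER_SECOND)


print(format_fraction(123_456_789, 9))  # .123456789
print(format_fraction(123_456_789, 7))  # .1234567  (floored to 7 digits)
print(format_fraction(120_000_000, 9))  # .12       (trailing zeros trimmed)
print(repr(format_fraction(0, 9)))      # ''        (no fraction at all)
```

A refactor that keeps the rendered output unchanged would have to preserve whatever the real rules are for every precision in [7, 9]; the sketch only shows the kind of cases worth pinning down.
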
> h2. Proposed approach
> * Interpreted path: extend the framework formatting hook with the session 
> zone (e.g. an optional {{zoneId}} parameter on {{format}} / {{formatUTF8}}), 
> and implement zone-aware formatting in {{TimestampNTZNanosTypeApiOps}} / 
> {{TimestampLTZNanosTypeApiOps}} using the sql/api {{TimestampFormatter}} 
> ({{formatWithoutTimeZoneNanos}} for NTZ, {{formatNanos}} with {{zoneId}} for 
> LTZ). Thread {{ToStringBase}}'s {{zoneId}} into the dispatch, then remove the 
> {{castToStringDefault}} nanos cases and the current {{TypeApiOps}} bypass.
> * Codegen path: {{TypeApiOps}} has no codegen hook today (each type is 
> hand-written in {{ToStringBase.castToStringCode}}). Add a framework codegen 
> hook (a method that emits the format snippet), or have {{castToStringCode}} 
> emit a runtime call into the ops reference object passing the {{zoneId}} 
> literal; then drop the inlined {{formatNanos}} cases.
> * Zone-less callers: reconcile {{format()}} / {{toSQLValue()}} (used by EXPLAIN 
> and SQL-literal rendering) with the zone-aware hook. NTZ needs no zone and can 
> format directly; LTZ without a session zone keeps raising (or falls back to a 
> documented default). Update {{TimestampNanosTypeOpsSuite}} accordingly.
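
The shape of the proposed hook (an optional zone on {{format}}, NTZ ignoring it, LTZ requiring it) might look roughly like the sketch below. It is written in Python for brevity, not in Spark's Scala. The class names, the {{cast_to_string}} dispatcher, the exception type, and the single-integer nanoseconds-since-epoch representation are all hypothetical stand-ins. Precision handling is left out, and fixed UTC offsets stand in for a real session {{zoneId}}.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


class UnsupportedFeatureError(Exception):
    """Hypothetical stand-in for UNSUPPORTED_FEATURE.TIMESTAMP_NANOS_TO_STRING."""


def _render(epoch_nanos: int, tz: timezone) -> str:
    # Floor division keeps the nanosecond part non-negative before 1970.
    seconds, nanos = divmod(epoch_nanos, 1_000_000_000)
    local = (_EPOCH + timedelta(seconds=seconds)).astimezone(tz)
    text = local.strftime("%Y-%m-%d %H:%M:%S")
    fraction = f"{nanos:09d}".rstrip("0")
    return f"{text}.{fraction}" if fraction else text


class TimestampNTZNanosOps:
    """NTZ is a wall-clock value, so any zone passed in is ignored."""

    def format(self, epoch_nanos: int, zone: Optional[timezone] = None) -> str:
        return _render(epoch_nanos, timezone.utc)


class TimestampLTZNanosOps:
    """LTZ is an instant and can only be rendered in a concrete zone."""

    def format(self, epoch_nanos: int, zone: Optional[timezone] = None) -> str:
        if zone is None:
            # Zone-less callers keep failing, as the issue proposes.
            raise UnsupportedFeatureError("TIMESTAMP_NANOS_TO_STRING")
        return _render(epoch_nanos, zone)


def cast_to_string(ops, value: int, session_zone: Optional[timezone]) -> str:
    """Single dispatch point: the caller threads its zone into the ops object."""
    return ops.format(value, session_zone)


value = 1_700_000_000 * 10**9 + 500_000_000
pacific = timezone(timedelta(hours=-8))
print(cast_to_string(TimestampNTZNanosOps(), value, pacific))  # 2023-11-14 22:13:20.5
print(cast_to_string(TimestampLTZNanosOps(), value, pacific))  # 2023-11-14 14:13:20.5
```

Under the "runtime call" codegen option, generated code would invoke the same {{format}} method with the zone passed as a literal, so both eval modes would share one implementation. A dedicated codegen hook would instead need its own snippet emitter, which this sketch does not cover.
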
> h2. Out of scope
> * The microsecond timestamp types ({{TIMESTAMP}} / {{TIMESTAMP_NTZ}}), which 
> remain handled inline in {{ToStringBase}}.
> * Any change to the rendered string output: this is a refactor with no 
> user-facing behavior change.
> h2. Testing
> Existing {{CastWithAnsiOnSuite}} / {{CastWithAnsiOffSuite}}, 
> {{ToPrettyStringSuite}}, {{TimestampNanosRowSuite}}, and the {{cast.sql}} 
> golden files must keep passing without modification; add framework-level 
> coverage for the new zone-aware {{format}} hook in both eval modes 
> (interpreted and codegen).



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

