[ 
https://issues.apache.org/jira/browse/SPARK-57257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Max Gekk reassigned SPARK-57257:
--------------------------------

    Assignee: Max Gekk

> Support nanosecond-precision timestamps in Hive results
> -------------------------------------------------------
>
>                 Key: SPARK-57257
>                 URL: https://issues.apache.org/jira/browse/SPARK-57257
>             Project: Spark
>          Issue Type: Sub-task
>          Components: SQL
>    Affects Versions: 4.3.0
>            Reporter: Max Gekk
>            Assignee: Max Gekk
>            Priority: Major
>              Labels: pull-request-available
>
> h2. What
> Modify {{HiveResult}} to support the nanosecond-precision timestamp types 
> {{TIMESTAMP_LTZ(p)}} ({{TimestampLTZNanosType}}) and {{TIMESTAMP_NTZ(p)}} 
> ({{TimestampNTZNanosType}}), for {{p}} in [7, 9].
> Add cases to {{HiveResult.toHiveStringDefault}} mirroring the existing 
> microsecond timestamp cases:
> * {{(i: Instant, _: TimestampLTZNanosType)}} -> render in the session time 
> zone.
> * {{(l: LocalDateTime, _: TimestampNTZNanosType)}} -> render 
> zone-independently.
> Both render with the nanosecond-aware {{TimestampFormatter}} (SPARK-57162) at 
> the column's fractional-second precision {{p}}, flooring sub-{{p}} digits and 
> trimming trailing zeros, consistent with casting these types to string. 
> {{getTimeFormatters}} already constructs a {{FractionTimestampFormatter}} via 
> {{TimestampFormatter.getFractionFormatter}}, which now exposes 
> {{formatNanos}} / {{formatWithoutTimeZoneNanos}}.
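> A minimal sketch of the rendering rule described above (floor to {{p}} 
> fractional digits, then trim trailing zeros), assuming plain {{java.time}} 
> instead of Spark's {{TimestampFormatter}}. {{NanosRenderSketch}}, 
> {{fraction}}, {{renderNtz}} and {{renderLtz}} are hypothetical names used 
> only for illustration; they are not part of Spark, and the fixed 
> {{yyyy-MM-dd HH:mm:ss}} pattern is a simplification:
> {code:scala}
> import java.time.{Instant, LocalDateTime, ZoneId, ZoneOffset}
> import java.time.format.DateTimeFormatter
>
> // Hypothetical helper, not Spark code: illustrates the rendering rule only.
> object NanosRenderSketch {
>   private val secondsFormat = DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss")
>
>   // Keep the first p fractional digits (floor, never round),
>   // then drop trailing zeros; an all-zero fraction renders as nothing.
>   def fraction(nanoOfSecond: Int, p: Int): String = {
>     require(p >= 7 && p <= 9, s"precision must be in [7, 9], got $p")
>     val digits = f"$nanoOfSecond%09d".take(p)
>     val trimmed = digits.reverse.dropWhile(_ == '0').reverse
>     if (trimmed.isEmpty) "" else "." + trimmed
>   }
>
>   // NTZ: the local date-time is rendered as-is, independent of any zone.
>   def renderNtz(l: LocalDateTime, p: Int): String =
>     secondsFormat.format(l) + fraction(l.getNano, p)
>
>   // LTZ: the instant is first shifted into the session time zone.
>   def renderLtz(i: Instant, p: Int, sessionZone: ZoneId): String =
>     renderNtz(LocalDateTime.ofInstant(i, sessionZone), p)
> }
> {code}
> Under these assumptions, 
> {{renderLtz(Instant.parse("2020-01-01T00:00:00.123456789Z"), 7, ZoneOffset.UTC)}} 
> gives {{2020-01-01 00:00:00.1234567}}, and a value whose fraction is 
> {{.120000000}} renders with {{.12}} at any {{p}} in [7, 9].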
> h2. Why
> Before the change, formatting a nanosecond timestamp column through 
> {{HiveResult}} (e.g. end-to-end SQL / golden-file tests, {{spark-sql}} CLI, 
> Thrift server output) matches none of the existing cases and fails with a 
> {{MatchError}}, analogous to the {{TimeType}} issue fixed in SPARK-51517:
> {code}
> scala.MatchError: (2020-01-01T00:00:00.123456789Z, TimestampLTZNanosType(9)) (of class scala.Tuple2)
> {code}
> The existing cases in {{HiveResult.scala}} match only the microsecond 
> {{TimestampType}} / {{TimestampNTZType}}, so the parameterized nanos types 
> are not handled.
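> The failure mode can be reproduced in isolation. The sketch below is a 
> simplified model, not the actual {{HiveResult}} code: the {{DataType}} 
> hierarchy and the {{before}} / {{after}} functions are hypothetical 
> stand-ins, chosen so the example compiles without Spark on the classpath:
> {code:scala}
> import java.time.{Instant, LocalDateTime}
>
> object MatchErrorSketch {
>   // Hypothetical stand-ins for Spark's data types.
>   trait DataType
>   case object TimestampType extends DataType
>   case object TimestampNTZType extends DataType
>   final case class TimestampLTZNanosType(precision: Int) extends DataType
>   final case class TimestampNTZNanosType(precision: Int) extends DataType
>
>   // Before: only the microsecond types have cases and there is no default,
>   // so any other (value, type) pair throws scala.MatchError.
>   def before(value: Any, dt: DataType): String = (value, dt) match {
>     case (i: Instant, TimestampType) => s"LTZ micros: $i"
>     case (l: LocalDateTime, TimestampNTZType) => s"NTZ micros: $l"
>   }
>
>   // After: typed patterns cover the parameterized nanos types at any precision.
>   def after(value: Any, dt: DataType): String = (value, dt) match {
>     case (i: Instant, t: TimestampLTZNanosType) => s"LTZ nanos(${t.precision}): $i"
>     case (l: LocalDateTime, t: TimestampNTZNanosType) => s"NTZ nanos(${t.precision}): $l"
>     case _ => before(value, dt)
>   }
> }
> {code}
> In this model, calling {{before}} with an {{Instant}} and 
> {{TimestampLTZNanosType(9)}} throws {{scala.MatchError}}, while {{after}} 
> returns a string for the same input.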
> h2. Does this PR introduce any user-facing change?
> It fixes the error above. After the change, nanosecond timestamp values are 
> rendered as proper strings in Hive results (only reachable when 
> {{spark.sql.timestampNanosTypes.enabled=true}}).
> h2. Dependency
> Builds on SPARK-57162 (nanosecond-aware {{TimestampFormatter}}).
> h2. How tested
> * New cases in {{HiveResultSuite}} covering {{TIMESTAMP_LTZ(p)}} / 
> {{TIMESTAMP_NTZ(p)}} for {{p}} in [7, 9]: precision-driven fraction width, 
> trailing-zero trimming, {{nanosWithinMicro}} 0 and 999, LTZ session-zone 
> rendering vs. zone-independent NTZ, and nested (array/map/struct) values.
> * A golden-file end-to-end test, analogous to the {{time.sql}} test added in 
> SPARK-51517; it can be disabled in {{ThriftServerQueryTestSuite}} if needed.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
