Max Gekk created SPARK-57257:
--------------------------------

             Summary: Support nanosecond-precision timestamps in Hive results
                 Key: SPARK-57257
                 URL: https://issues.apache.org/jira/browse/SPARK-57257
             Project: Spark
          Issue Type: Sub-task
          Components: SQL
    Affects Versions: 4.3.0
            Reporter: Max Gekk


h2. What

Modify {{HiveResult}} to support the nanosecond-precision timestamp types
{{TIMESTAMP_LTZ(p)}} ({{TimestampLTZNanosType}}) and {{TIMESTAMP_NTZ(p)}}
({{TimestampNTZNanosType}}), with {{p}} in [7, 9].

Add cases to {{HiveResult.toHiveStringDefault}} mirroring the existing 
microsecond timestamp cases:
* {{(i: Instant, _: TimestampLTZNanosType)}} -> render in the session time zone.
* {{(l: LocalDateTime, _: TimestampNTZNanosType)}} -> render zone-independently.

Both render with the nanosecond-aware {{TimestampFormatter}} (SPARK-57162) at 
the column's fractional-second precision {{p}}, flooring sub-{{p}} digits and 
trimming trailing zeros, consistent with casting these types to string. 
{{getTimeFormatters}} already constructs a {{FractionTimestampFormatter}} via 
{{TimestampFormatter.getFractionFormatter}}, which now exposes {{formatNanos}} 
/ {{formatWithoutTimeZoneNanos}}.
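
The rendering rule described above can be sketched in plain Scala on top of {{java.time}}. This is only an illustration under stated assumptions, not the {{HiveResult}} implementation: {{LtzNanos}}, {{NtzNanos}}, {{NanosRenderSketch}}, {{fraction}} and {{render}} are hypothetical stand-ins for the Spark types and formatter calls, and the {{uuuu-MM-dd HH:mm:ss}} layout is assumed.

```scala
import java.time.{Instant, LocalDateTime, ZoneId}
import java.time.format.DateTimeFormatter

// Hypothetical stand-ins for the nanosecond timestamp types in this ticket;
// these are NOT Spark's classes.
trait NanosType { def precision: Int }
final case class LtzNanos(precision: Int) extends NanosType
final case class NtzNanos(precision: Int) extends NanosType

object NanosRenderSketch {
  // Assumed date-time layout up to whole seconds.
  private val secondsFormat = DateTimeFormatter.ofPattern("uuuu-MM-dd HH:mm:ss")

  // Keep the first p fractional digits (flooring the rest), then trim trailing zeros.
  def fraction(nanoOfSecond: Int, p: Int): String = {
    require(p >= 0 && p <= 9, s"precision out of range: $p")
    val digits = f"$nanoOfSecond%09d".take(p)
    val trimmed = digits.reverse.dropWhile(_ == '0').reverse
    if (trimmed.isEmpty) "" else "." + trimmed
  }

  def render(value: Any, dataType: NanosType, sessionZone: ZoneId): String =
    (value, dataType) match {
      // LTZ: convert the instant to the session time zone before formatting.
      case (i: Instant, LtzNanos(p)) =>
        val local = LocalDateTime.ofInstant(i, sessionZone)
        secondsFormat.format(local) + fraction(local.getNano, p)
      // NTZ: format the local date-time as-is, independent of any zone.
      case (l: LocalDateTime, NtzNanos(p)) =>
        secondsFormat.format(l) + fraction(l.getNano, p)
      case (other, t) =>
        throw new IllegalArgumentException(s"Unsupported value/type pair: ($other, $t)")
    }
}
```

Under these assumptions, rendering {{Instant.parse("2020-01-01T00:00:00.123456789Z")}} as {{LtzNanos(7)}} in UTC keeps seven fractional digits ({{.1234567}}), while a {{LocalDateTime}} whose fraction is {{.120000000}} renders as {{.12}} for {{NtzNanos(9)}}.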

h2. Why

Before the change, formatting a nanosecond timestamp column through
{{HiveResult}} (e.g. end-to-end SQL / golden-file tests, {{spark-sql}} CLI,
Thrift server output) matches none of the existing cases and fails with a
{{MatchError}}, analogous to the {{TimeType}} issue fixed in SPARK-51517:

{code}
scala.MatchError: (2020-01-01T00:00:00.123456789Z, TimestampLTZNanosType(9)) (of class scala.Tuple2)
{code}

The existing cases in {{HiveResult.scala}} match only the microsecond
{{TimestampType}} / {{TimestampNTZType}}, so the parameterized nanos types are
not handled.
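
The failure mode can be reproduced in isolation. The sketch below is a simplified model, not Spark code: {{SketchType}}, {{MicrosLtz}}, {{NanosLtz}} and {{toStringOld}} are hypothetical names, and the match deliberately handles only the microsecond stand-in.

```scala
import java.time.Instant
import scala.util.Try

// Hypothetical stand-ins, NOT Spark's DataType classes.
trait SketchType
case object MicrosLtz extends SketchType
final case class NanosLtz(precision: Int) extends SketchType

object MatchErrorSketch {
  // Models a match that knows only the microsecond type and has no fallback case.
  def toStringOld(value: Any, dataType: SketchType): String =
    (value, dataType) match {
      case (i: Instant, MicrosLtz) => i.toString
    }

  def main(args: Array[String]): Unit = {
    val ts = Instant.parse("2020-01-01T00:00:00.123456789Z")
    // The (Instant, NanosLtz) pair matches no case, so the call throws.
    val attempt = Try(toStringOld(ts, NanosLtz(9)))
    println(attempt.failed.get.getClass.getName)
  }
}
```

Adding a case for the nanosecond stand-in, as the ticket proposes for the real types, makes the match total for these inputs and removes the exception.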

h2. Does this PR introduce any user-facing change?

It fixes the error above. After the change, nanosecond timestamp values are 
rendered as proper strings in Hive results (only reachable when 
{{spark.sql.timestampNanosTypes.enabled=true}}).

h2. Dependency

Builds on SPARK-57162 (nanosecond-aware {{TimestampFormatter}}).

h2. How tested

* New cases in {{HiveResultSuite}} covering {{TIMESTAMP_LTZ(p)}} / 
{{TIMESTAMP_NTZ(p)}} for {{p}} in [7, 9]: precision-driven fraction width, 
trailing-zero trimming, {{nanosWithinMicro}} 0 and 999, LTZ session-zone 
rendering vs. zone-independent NTZ, and nested (array/map/struct) values.
* A golden-file end-to-end test, following the {{time.sql}} test added by
SPARK-51517; it can be disabled in {{ThriftServerQueryTestSuite}} if needed.



