[ https://issues.apache.org/jira/browse/SPARK-57317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Max Gekk resolved SPARK-57317.
------------------------------
Fix Version/s: 4.3.0
Resolution: Fixed
Issue resolved by pull request 56371
[https://github.com/apache/spark/pull/56371]
> Fix Literal.create for external nanosecond timestamp values
> -----------------------------------------------------------
>
> Key: SPARK-57317
> URL: https://issues.apache.org/jira/browse/SPARK-57317
> Project: Spark
> Issue Type: Sub-task
> Components: SQL
> Affects Versions: 4.3.0
> Reporter: Max Gekk
> Assignee: Max Gekk
> Priority: Major
> Labels: pull-request-available
> Fix For: 4.3.0
>
>
> Literal.create(value, dataType) produces an invalid literal when the value is
> an external (high-level) nanosecond timestamp value and the declared type is
> a nanosecond timestamp type (TimestampLTZNanosType / TimestampNTZNanosType),
> or a complex type (array/map/struct) containing one.
> For these types the method routed the value through the schema-less
> CatalystTypeConverters.convertToCatalyst, which by design (SPARK-57033) keeps
> bare java.time.Instant and java.time.LocalDateTime on the microsecond
> converters. As a result, the produced Catalyst value is a Long (epoch micros)
> instead of the internal TimestampNanosVal representation expected by the
> declared type, and Literal validation fails, e.g.:
>   java.lang.IllegalArgumentException: requirement failed: Literal must have a corresponding value to timestamp_ltz(7), but class Long found.
> The same problem affects collections of such values, e.g.:
>   Literal must have a corresponding value to array<timestamp_ntz(9)>, but class GenericArrayData found.
> Fix: Literal.create now routes the value through the schema-driven converter
> (CatalystTypeConverters.createToCatalystConverter) when the declared type
> contains a nanosecond timestamp type anywhere, but only for external values.
> Values already in Catalyst internal form (TimestampNanosVal, ArrayData,
> MapData, InternalRow) and nulls keep using the lenient schema-less path,
> preserving the behavior of callers such as Literal.default that pass internal
> values.
> This gap was surfaced while adding the nanosecond timestamp types to
> DataTypeTestUtils (SPARK-57259), which drives PredicateSuite's generic
> "IN with different types" coverage over these types.
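
The routing rule described in the fix can be illustrated with a small,
self-contained sketch. This is not Spark code: DataType, Atomic, ArrayType,
MapType, StructType, InternalValue, NanosVal and the three helper methods
below are hypothetical stand-ins invented for illustration. The sketch only
shows the decision itself: take the schema-driven path when the value is
external (non-null and not already in internal form) and the declared type
contains a nanosecond timestamp type at any nesting depth.

```java
import java.time.Instant;
import java.util.Arrays;
import java.util.List;

public class NanosLiteralRouting {
    /** Hypothetical, simplified stand-in for a data type hierarchy. */
    interface DataType {}

    enum Atomic implements DataType { INT, TIMESTAMP_MICROS, TIMESTAMP_NANOS }

    static final class ArrayType implements DataType {
        final DataType element;
        ArrayType(DataType element) { this.element = element; }
    }

    static final class MapType implements DataType {
        final DataType key;
        final DataType value;
        MapType(DataType key, DataType value) { this.key = key; this.value = value; }
    }

    static final class StructType implements DataType {
        final List<DataType> fields;
        StructType(DataType... fields) { this.fields = Arrays.asList(fields); }
    }

    /** Hypothetical marker for values that are already in internal form. */
    interface InternalValue {}

    static final class NanosVal implements InternalValue {}

    /** True if a nanosecond timestamp type occurs anywhere inside the type. */
    static boolean containsNanosTimestamp(DataType t) {
        if (t == Atomic.TIMESTAMP_NANOS) return true;
        if (t instanceof ArrayType) return containsNanosTimestamp(((ArrayType) t).element);
        if (t instanceof MapType) {
            MapType m = (MapType) t;
            return containsNanosTimestamp(m.key) || containsNanosTimestamp(m.value);
        }
        if (t instanceof StructType) {
            for (DataType f : ((StructType) t).fields) {
                if (containsNanosTimestamp(f)) return true;
            }
        }
        return false;
    }

    /** Nulls and internal-form values are not treated as external. */
    static boolean isExternal(Object value) {
        return value != null && !(value instanceof InternalValue);
    }

    /** The routing decision: schema-driven path only for external values of nanos types. */
    static boolean useSchemaDrivenConverter(Object value, DataType declared) {
        return isExternal(value) && containsNanosTimestamp(declared);
    }

    public static void main(String[] args) {
        DataType nested = new ArrayType(new StructType(Atomic.INT, Atomic.TIMESTAMP_NANOS));
        System.out.println(useSchemaDrivenConverter(Instant.now(), nested));                  // true
        System.out.println(useSchemaDrivenConverter(new NanosVal(), nested));                 // false
        System.out.println(useSchemaDrivenConverter(null, nested));                           // false
        System.out.println(useSchemaDrivenConverter(Instant.now(), Atomic.TIMESTAMP_MICROS)); // false
    }
}
```

In this sketch, checking the value as well as the type is what keeps callers
that pass internal values or nulls on the lenient path, matching the
behavior the description says is preserved for Literal.default.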