[ 
https://issues.apache.org/jira/browse/SPARK-57317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated SPARK-57317:
-----------------------------------
    Labels: pull-request-available  (was: )

> Fix Literal.create for external nanosecond timestamp values
> -----------------------------------------------------------
>
>                 Key: SPARK-57317
>                 URL: https://issues.apache.org/jira/browse/SPARK-57317
>             Project: Spark
>          Issue Type: Sub-task
>          Components: SQL
>    Affects Versions: 4.3.0
>            Reporter: Max Gekk
>            Assignee: Max Gekk
>            Priority: Major
>              Labels: pull-request-available
>
> Literal.create(value, dataType) produces an invalid literal when the value is an
> external (high-level) nanosecond timestamp value and the declared type is a
> nanosecond timestamp type (TimestampLTZNanosType / TimestampNTZNanosType), or a
> complex type (array/map/struct) containing one.
> For these types the method routed the value through the schema-less
> CatalystTypeConverters.convertToCatalyst, which by design (SPARK-57033) keeps
> bare java.time.Instant and java.time.LocalDateTime on the microsecond converters.
> As a result the produced Catalyst value is a Long (epoch micros) instead of the
> internal TimestampNanosVal representation expected by the declared type, and
> Literal validation fails, e.g.:
>     java.lang.IllegalArgumentException: requirement failed: Literal must have a
>     corresponding value to timestamp_ltz(7), but class Long found.
> The same problem affects collections of such values, e.g.:
>     Literal must have a corresponding value to array<timestamp_ntz(9)>, but class
>     GenericArrayData found.
> Fix: Literal.create now routes the value through the schema-driven converter
> (CatalystTypeConverters.createToCatalystConverter) when the declared type contains
> a nanosecond timestamp type anywhere, but only for external values. Values already
> in Catalyst internal form (TimestampNanosVal, ArrayData, MapData, InternalRow) and
> nulls keep using the lenient schema-less path, preserving the behavior of callers
> such as Literal.default that pass internal values.
> This gap was surfaced while adding the nanosecond timestamp types to
> DataTypeTestUtils (SPARK-57259), which drives PredicateSuite's generic
> "IN with different types" coverage over these types.
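
The routing described above can be illustrated with a small, self-contained
toy model. This is only a sketch of the idea, not Spark code: every class and
function name below (Instant, NanosVal, NanosType, ArrayType,
schemaless_convert, schema_driven_convert, create_literal) is a hypothetical
stand-in, the single-field layout of NanosVal is an assumption, and precision
handling for timestamp_ltz(p) / timestamp_ntz(p) is deliberately left out.

```python
from dataclasses import dataclass

# Toy stand-ins only; none of these are Spark or JDK classes.

@dataclass(frozen=True)
class Instant:
    """Plays the role of an external value such as java.time.Instant."""
    epoch_second: int
    nano: int

@dataclass(frozen=True)
class NanosVal:
    """Plays the role of the internal TimestampNanosVal (field layout assumed)."""
    epoch_nanos: int

@dataclass(frozen=True)
class NanosType:
    """Plays the role of a nanosecond timestamp type with a given precision."""
    precision: int

@dataclass(frozen=True)
class ArrayType:
    element: object

MICROS_TYPE = "timestamp"  # stand-in for the microsecond timestamp type

def contains_nanos(dt):
    """True if the declared type is, or nests, a nanosecond timestamp type."""
    if isinstance(dt, NanosType):
        return True
    if isinstance(dt, ArrayType):
        return contains_nanos(dt.element)
    return False

def schemaless_convert(value):
    """No declared type available: a bare Instant becomes epoch micros (an int)."""
    if isinstance(value, Instant):
        return value.epoch_second * 1_000_000 + value.nano // 1_000
    if isinstance(value, list):  # external collection -> internal array (a tuple here)
        return tuple(schemaless_convert(v) for v in value)
    return value  # internal values and None pass through unchanged

def schema_driven_convert(value, dt):
    """The declared type picks the converter, so nanos types yield NanosVal."""
    if value is None:
        return None
    if isinstance(dt, NanosType):
        return NanosVal(value.epoch_second * 1_000_000_000 + value.nano)
    if isinstance(dt, ArrayType):
        return tuple(schema_driven_convert(v, dt.element) for v in value)
    return schemaless_convert(value)

def is_internal(value):
    """None or a value already in the toy internal form."""
    return value is None or isinstance(value, (NanosVal, tuple))

def validate(value, dt):
    """Mimics the literal check: the value must match the declared type."""
    if value is None:
        return True
    if isinstance(dt, NanosType):
        return isinstance(value, NanosVal)
    if isinstance(dt, ArrayType):
        return isinstance(value, tuple) and all(validate(v, dt.element) for v in value)
    return isinstance(value, int)

def create_literal(value, dt):
    """Schema-driven path only for external values whose type contains nanos."""
    if contains_nanos(dt) and not is_internal(value):
        converted = schema_driven_convert(value, dt)
    else:
        converted = schemaless_convert(value)
    if not validate(converted, dt):
        raise ValueError(f"no corresponding value to {dt}: {type(converted).__name__}")
    return converted
```

In this model, schemaless_convert(Instant(1, 123_456_789)) yields a plain
integer, which validate rejects for NanosType(7); that mirrors the "class Long
found" failure quoted above. create_literal with the same input and a nanos
type goes through the schema-driven branch instead, while NanosVal inputs and
None still take the lenient path.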



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
