[ https://issues.apache.org/jira/browse/SPARK-46502?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dongjoon Hyun reassigned SPARK-46502:
-------------------------------------

    Assignee: L. C. Hsieh

> Support timestamp types in UnwrapCastInBinaryComparison
> -------------------------------------------------------
>
>                 Key: SPARK-46502
>                 URL: https://issues.apache.org/jira/browse/SPARK-46502
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 4.0.0
>            Reporter: L. C. Hsieh
>            Assignee: L. C. Hsieh
>            Priority: Major
>              Labels: pull-request-available
>
> We have an optimization rule `UnwrapCastInBinaryComparison` that unwraps 
> casts in binary comparisons, but it does not cover timestamp types.
> Consider a query plan like:
> ```
> == Analyzed Logical Plan ==
> batch: timestamp
> Project [batch#26466]
> +- Filter (batch#26466 >= cast(2023-12-21 10:00:00 as timestamp))
>    +- SubqueryAlias spark_catalog.default.timestamp_view
>       +- View (`spark_catalog`.`default`.`timestamp_view`, [batch#26466])
>          +- Project [cast(batch#26467 as timestamp) AS batch#26466]
>             +- Project [cast(batch#26463 as timestamp) AS batch#26467]
>                +- SubqueryAlias spark_catalog.default.table_timestamp
>                   +- Relation spark_catalog.default.table_timestamp[batch#26463] parquet
> == Optimized Logical Plan ==
> Project [cast(batch#26463 as timestamp) AS batch#26466]
> +- Filter (isnotnull(batch#26463) AND (cast(batch#26463 as timestamp) >= 2023-12-21 10:00:00))
>    +- Relation spark_catalog.default.table_timestamp[batch#26463] parquet
> ```
> The predicate compares a timestamp_ntz column with a literal value. Because 
> the column is wrapped in a cast to timestamp type, the (string) literal is 
> also wrapped in a cast to timestamp type. The cast literal is foldable, so it 
> is folded into a timestamp literal first, and the predicate becomes 
> `cast(batch#26463 as timestamp) >= 2023-12-21 10:00:00`. Because the cast is 
> on the column side, the predicate cannot be pushed down to the data 
> source/table.
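
The idea behind unwrapping the cast can be illustrated with a small plain-Python model. This is only a sketch, not Spark code or the rule's actual implementation: the helper names `cast_ntz_to_timestamp` and `cast_timestamp_to_ntz` are hypothetical, and the session time zone is assumed to be UTC so that no daylight-saving gaps or overlaps are involved. It shows that moving the cast from the column side to the literal side selects the same rows, which is what would make the predicate eligible for pushdown.

```python
from datetime import datetime, timezone

# Assumption: the session time zone is UTC (no DST gaps or overlaps).
SESSION_TZ = timezone.utc


def cast_ntz_to_timestamp(ntz: datetime) -> datetime:
    """Hypothetical model of cast(timestamp_ntz AS timestamp):
    interpret the wall-clock value in the session time zone."""
    return ntz.replace(tzinfo=SESSION_TZ)


def cast_timestamp_to_ntz(ts: datetime) -> datetime:
    """Hypothetical model of cast(timestamp AS timestamp_ntz):
    take the wall-clock value in the session time zone."""
    return ts.astimezone(SESSION_TZ).replace(tzinfo=None)


# Stand-in for the timestamp_ntz column `batch`.
rows = [
    datetime(2023, 12, 21, 9, 59, 59),
    datetime(2023, 12, 21, 10, 0, 0),
    datetime(2023, 12, 21, 10, 0, 1),
]

# The folded timestamp literal from the predicate.
literal = datetime(2023, 12, 21, 10, 0, 0, tzinfo=SESSION_TZ)

# Original form: cast(batch AS timestamp) >= literal (cast on the column side).
original = [r for r in rows if cast_ntz_to_timestamp(r) >= literal]

# Unwrapped form: batch >= cast(literal AS timestamp_ntz) (cast on the literal side).
ntz_literal = cast_timestamp_to_ntz(literal)
unwrapped = [r for r in rows if r >= ntz_literal]
```

In this model both filters keep the same rows, and the unwrapped form compares the raw column against a constant. Time zones with daylight-saving transitions would need extra care, which this sketch does not attempt to cover.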


