RussellSpitzer commented on issue #4997:
URL: https://github.com/apache/iceberg/issues/4997#issuecomment-1154430977
@puchengy The example you pasted shows there is a function applied to the
column
see
```
(cast(dt#119 as date) >= 2022-06-05)
AND (dt#119 <= 2022-06-06))
```
In general this is a problem for Datasources, although in 3.2? 3.3? a rule
was added to force cast the literal side of the predicate but I think that's
still limited only to certain types of literals. Basically the issue is that if
Spark thinks a function like `cast` must be applied to a DataSource Column then
it is not allowed to push down that predicate. It can only push down predicates
where the columns are not modified in any way before being compared to a
literal.
I wrote about this a long time ago
[here](https://www.russellspitzer.com/2016/04/18/Catalyst-Debugging/)
The issue in your particular query is that _dt_ is a **string** and the
output of your function is a **date**. Spark needs to resolve this mismatch so
it casts _dt_ as a date so the types match. The types now match, but it is
impossible to pushdown the predicate. To fix it you cast the literal output as
**string** before Spark has a chance to cast _dt_. You'll notice that in your
example the other predicate is always pushed down correctly because it is a
literal string being compared to a literal column.
This has been the case in Spark for external datasources for a very long
time so I don't think it's a new issue.
In this case you fix it by casting the "literal" part of the predicate to
match the column type.
```sql
select * from tbl where dt between cast(date_add('2022-01-01, -1) as String)
and '2022-01-01'
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]