RussellSpitzer commented on issue #4997:
URL: https://github.com/apache/iceberg/issues/4997#issuecomment-1154430977

   @puchengy The example you pasted shows there is a function applied to the 
column
   
   see
   
   ```
   (cast(dt#119 as date) >= 2022-06-05) 
    AND (dt#119 <= 2022-06-06))
   ```
   
   In general this is a problem for DataSources. A rule was added in Spark 
3.2 or 3.3 (I don't remember which) to force the cast onto the literal side of 
the predicate instead, but I think that's still limited to certain types of 
literals. Basically, if Spark thinks a function like `cast` must be applied to 
a DataSource column, it is not allowed to push down that predicate. It can 
only push down predicates where the column is not modified in any way before 
being compared to a literal.
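   
   As a rough illustration of that rule, here is a toy Python model. It is not 
Spark's actual code; the class names and `translate_for_pushdown` are 
hypothetical and only mimic the idea that a comparison is translated into a 
source filter when one side is a bare column and the other is a literal.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Column:
    name: str


@dataclass(frozen=True)
class Literal:
    value: str


@dataclass(frozen=True)
class Cast:
    child: object   # the expression being cast
    to_type: str


@dataclass(frozen=True)
class Compare:
    op: str
    left: object
    right: object


def translate_for_pushdown(pred: Compare) -> Optional[Tuple[str, str, str]]:
    """Return a (column, op, value) source filter, or None if not pushable.

    Hypothetical simplification: only `bare column <op> literal` translates.
    Anything wrapping the column (such as a cast) blocks the translation.
    """
    if isinstance(pred.left, Column) and isinstance(pred.right, Literal):
        return (pred.left.name, pred.op, pred.right.value)
    return None


# cast(dt as date) >= 2022-06-05 : the column is wrapped, so no pushdown.
not_pushed = Compare(">=", Cast(Column("dt"), "date"), Literal("2022-06-05"))

# dt <= '2022-06-06' : bare column against a literal, so it can be pushed.
pushed = Compare("<=", Column("dt"), Literal("2022-06-06"))

# dt >= cast(date_add('2022-01-01', -1) as string) : assuming the literal side
# is folded to the constant '2021-12-31' first, the column stays bare.
rewritten = Compare(">=", Column("dt"), Literal("2021-12-31"))

for p in (not_pushed, pushed, rewritten):
    print(translate_for_pushdown(p))
```

   The first predicate yields no filter, which matches the shape of the plan 
fragment above; the other two keep the column untouched and translate cleanly.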
   
   I wrote about this a long time ago 
[here](https://www.russellspitzer.com/2016/04/18/Catalyst-Debugging/)
   
   The issue in your particular query is that _dt_ is a **string** and the 
output of your function is a **date**. Spark needs to resolve this mismatch, so 
it casts _dt_ to a date. The types now match, but it is impossible to push 
down the predicate. To fix it, cast the function's output to **string** before 
Spark has a chance to cast _dt_. You'll notice that in your example the other 
predicate is always pushed down correctly because it compares the unmodified 
column to a string literal.
   
   This has been the case in Spark for external datasources for a very long 
time so I don't think it's a new issue.
   
   In this case you fix it by casting the "literal" part of the predicate to 
match the column type.
   ```sql
   select * from tbl where dt between cast(date_add('2022-01-01', -1) as String) 
and '2022-01-01'
   ```
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

