sdf-jkl commented on code in PR #18789:
URL: https://github.com/apache/datafusion/pull/18789#discussion_r2611900925
##########
datafusion/optimizer/src/simplify_expressions/expr_simplifier.rs:
##########
@@ -1960,6 +1964,41 @@ impl<S: SimplifyInfo> TreeNodeRewriter for Simplifier<'_, S> {
}))
}
+ // =======================================
+ // preimage_in_comparison
+ // =======================================
+ //
+ // For case:
+ // date_part(expr as 'YEAR') op literal
+ //
+ // Background:
+ // Datasources such as Parquet can prune partitions using simple predicates,
+ // but they cannot do so for complex expressions.
+ // For a complex predicate like `date_part('YEAR', c1) < 2000`, pruning is not possible.
+ // After rewriting it to `c1 < 2000-01-01`, pruning becomes feasible.
+ Expr::BinaryExpr(BinaryExpr { left, op, right })
+ if is_scalar_udf_expr_and_support_preimage_in_comparison_for_binary(
+ info, &left, op, &right,
+ ) =>
+ {
+ preimage_in_comparison_for_binary(info, *left, *right, op)?
+ }
Review Comment:
On further tinkering with the code, I realized that in this snippet
`rewrite_with_preimage` takes the wrong expression as input: the `Scalar`
literal instead of the left-hand column.
However, we also can't just pass the `left` expression. `left` is a
`ScalarUDFExpression`, so we would still need to extract the column expression
from the UDF's `args`.
We currently do that extraction inside the `preimage` call, and I believe that
is the most convenient place for it, because different functions take
different arguments and the caller cannot know which one is the column
expression.
I propose changing the `preimage` function signature to:
```rust
fn preimage(
    &self,
    _args: &[Expr],
    _lit_expr: &Expr,
    _info: &dyn SimplifyInfo,
) -> Result<(Option<Interval>, Expr)>
```
But if we go this route, the function is no longer as straightforward.
Basically, I'm looking for a way to extract the column expression argument
from any `ScalarUDFExpression`, no matter which function is used.
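
To make the idea concrete, here is a minimal, self-contained sketch of letting
each function pick its own column argument. Everything in it is an assumption
for illustration: `Expr`, `Interval`, the `Preimage` trait, `DatePart`, and
`preimage_of` are toy stand-ins, not DataFusion's real types or API. The return
type is simplified to `Option<(Interval, Expr)>`, and the `info` parameter is
omitted.

```rust
// Toy stand-ins for illustration only; these are NOT DataFusion's real types.
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Column(String),
    Utf8(String),
    Int(i64),
    ScalarFunction { name: String, args: Vec<Expr> },
}

/// Half-open range `[lower, upper)` of ISO dates (hypothetical simplification).
#[derive(Debug, Clone, PartialEq)]
struct Interval {
    lower: String,
    upper: String,
}

/// Hypothetical hook: the function itself picks the column out of `args`.
trait Preimage {
    fn preimage(&self, args: &[Expr], lit_expr: &Expr) -> Option<(Interval, Expr)>;
}

struct DatePart;

impl Preimage for DatePart {
    fn preimage(&self, args: &[Expr], lit_expr: &Expr) -> Option<(Interval, Expr)> {
        match (args, lit_expr) {
            // date_part(part, source): the column is the *second* argument.
            ([Expr::Utf8(part), col @ Expr::Column(_)], Expr::Int(year))
                if part.eq_ignore_ascii_case("year") =>
            {
                let interval = Interval {
                    lower: format!("{:04}-01-01", year),
                    upper: format!("{:04}-01-01", year + 1),
                };
                Some((interval, col.clone()))
            }
            // Anything else has no preimage in this sketch.
            _ => None,
        }
    }
}

/// The caller only dispatches on the function name; it never inspects `args`.
fn preimage_of(func_expr: &Expr, lit_expr: &Expr) -> Option<(Interval, Expr)> {
    match func_expr {
        Expr::ScalarFunction { name, args } if name == "date_part" => {
            DatePart.preimage(args, lit_expr)
        }
        _ => None,
    }
}
```

With this shape, the simplifier would hand the UDF's `args` and the literal to
`preimage` and get back both the interval and the column to compare against,
at the cost of a less obvious return value.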
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]