alamb commented on code in PR #3796:
URL: https://github.com/apache/arrow-datafusion/pull/3796#discussion_r992627413
##########
datafusion/expr/src/logical_plan/plan.rs:
##########
@@ -1148,18 +1150,72 @@ pub struct SubqueryAlias {
#[derive(Clone)]
pub struct Filter {
/// The predicate expression, which must have Boolean type.
- pub predicate: Expr,
+ predicate: Expr,
/// The incoming logical plan
- pub input: Arc<LogicalPlan>,
+ input: Arc<LogicalPlan>,
}
impl Filter {
+ /// Create a new filter operator.
+ pub fn try_new(
+ predicate: Expr,
+ input: Arc<LogicalPlan>,
+ ) -> datafusion_common::Result<Self> {
+ // Filter predicates must return a boolean value so we try and
validate that here.
+ // Note that it is not always possible to resolve the predicate
expression during plan
+ // construction (such as with correlated subqueries) so we make a best
effort here and
+ // ignore errors resolving the expression against the schema.
+ if let Ok(predicate_type) = predicate.get_type(input.schema()) {
+ if predicate_type != DataType::Boolean {
+ return Err(DataFusionError::Plan(format!(
+ "Cannot create filter with non-boolean predicate '{}'
returning {}",
+ predicate, predicate_type
+ )));
+ }
+ }
+
+ // filter predicates should not contain aliased expressions so we
remove any aliases here
Review Comment:
Is the rationale here that the the output of a `FilterExpr` has the same
schema as its input -- since the Filter never changes its input if we find an
alias on the FilterExpr then something is likely wrong as it won't change
anything?
I would personally prefer to throw an error rather than stripping the
aliases and then track down what is adding aliases
##########
datafusion/expr/src/logical_plan/plan.rs:
##########
@@ -1148,18 +1150,72 @@ pub struct SubqueryAlias {
#[derive(Clone)]
pub struct Filter {
/// The predicate expression, which must have Boolean type.
- pub predicate: Expr,
+ predicate: Expr,
/// The incoming logical plan
- pub input: Arc<LogicalPlan>,
+ input: Arc<LogicalPlan>,
}
impl Filter {
+ /// Create a new filter operator.
+ pub fn try_new(
+ predicate: Expr,
+ input: Arc<LogicalPlan>,
+ ) -> datafusion_common::Result<Self> {
+ // Filter predicates must return a boolean value so we try and
validate that here.
+ // Note that it is not always possible to resolve the predicate
expression during plan
+ // construction (such as with correlated subqueries) so we make a best
effort here and
+ // ignore errors resolving the expression against the schema.
+ if let Ok(predicate_type) = predicate.get_type(input.schema()) {
+ if predicate_type != DataType::Boolean {
+ return Err(DataFusionError::Plan(format!(
+ "Cannot create filter with non-boolean predicate '{}'
returning {}",
+ predicate, predicate_type
+ )));
+ }
+ }
+
+ // filter predicates should not contain aliased expressions so we
remove any aliases here
+ // but perhaps we should be failing here instead?
+ struct RemoveAliases {}
+
+ impl ExprRewriter for RemoveAliases {
Review Comment:
I am not sure about recursively removing all aliases (like what happens if
this recurses into suqueries, for example 🤔 ) Removing at the top most level,
like
```rust
if let Expr::Alias {..} = expr {
..
}
```
makes more sense
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]