alamb commented on code in PR #3796:
URL: https://github.com/apache/arrow-datafusion/pull/3796#discussion_r992627413


##########
datafusion/expr/src/logical_plan/plan.rs:
##########
@@ -1148,18 +1150,72 @@ pub struct SubqueryAlias {
 #[derive(Clone)]
 pub struct Filter {
     /// The predicate expression, which must have Boolean type.
-    pub predicate: Expr,
+    predicate: Expr,
     /// The incoming logical plan
-    pub input: Arc<LogicalPlan>,
+    input: Arc<LogicalPlan>,
 }
 
 impl Filter {
+    /// Create a new filter operator.
+    pub fn try_new(
+        predicate: Expr,
+        input: Arc<LogicalPlan>,
+    ) -> datafusion_common::Result<Self> {
+        // Filter predicates must return a boolean value so we try and 
validate that here.
+        // Note that it is not always possible to resolve the predicate 
expression during plan
+        // construction (such as with correlated subqueries) so we make a best 
effort here and
+        // ignore errors resolving the expression against the schema.
+        if let Ok(predicate_type) = predicate.get_type(input.schema()) {
+            if predicate_type != DataType::Boolean {
+                return Err(DataFusionError::Plan(format!(
+                    "Cannot create filter with non-boolean predicate '{}' 
returning {}",
+                    predicate, predicate_type
+                )));
+            }
+        }
+
+        // filter predicates should not contain aliased expressions so we 
remove any aliases here

Review Comment:
   Is the rationale here that the the output of a `FilterExpr` has the same 
schema as its input -- since the Filter never changes its input if we find an 
alias on the FilterExpr then something is likely wrong as it won't change 
anything?
   
   I would personally prefer to throw an error rather than stripping the 
aliases and then track down what is adding aliases



##########
datafusion/expr/src/logical_plan/plan.rs:
##########
@@ -1148,18 +1150,72 @@ pub struct SubqueryAlias {
 #[derive(Clone)]
 pub struct Filter {
     /// The predicate expression, which must have Boolean type.
-    pub predicate: Expr,
+    predicate: Expr,
     /// The incoming logical plan
-    pub input: Arc<LogicalPlan>,
+    input: Arc<LogicalPlan>,
 }
 
 impl Filter {
+    /// Create a new filter operator.
+    pub fn try_new(
+        predicate: Expr,
+        input: Arc<LogicalPlan>,
+    ) -> datafusion_common::Result<Self> {
+        // Filter predicates must return a boolean value so we try and 
validate that here.
+        // Note that it is not always possible to resolve the predicate 
expression during plan
+        // construction (such as with correlated subqueries) so we make a best 
effort here and
+        // ignore errors resolving the expression against the schema.
+        if let Ok(predicate_type) = predicate.get_type(input.schema()) {
+            if predicate_type != DataType::Boolean {
+                return Err(DataFusionError::Plan(format!(
+                    "Cannot create filter with non-boolean predicate '{}' 
returning {}",
+                    predicate, predicate_type
+                )));
+            }
+        }
+
+        // filter predicates should not contain aliased expressions so we 
remove any aliases here
+        // but perhaps we should be failing here instead?
+        struct RemoveAliases {}
+
+        impl ExprRewriter for RemoveAliases {

Review Comment:
   I am not sure about recursively removing all aliases (like what happens if 
this recurses into suqueries, for example 🤔 ) Removing at the top most level, 
like
   
   ```rust
   if let Expr::Alias {..} = expr {
   ..
   }
   ``` 
   
   makes more sense



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to