alamb commented on code in PR #4650: URL: https://github.com/apache/arrow-datafusion/pull/4650#discussion_r1052579740
########## benchmarks/expected-plans/q20.txt: ########## @@ -1,17 +1,17 @@ Sort: supplier.s_name ASC NULLS LAST Projection: supplier.s_name, supplier.s_address - LeftSemi Join: supplier.s_suppkey = __sq_2.ps_suppkey + LeftSemi Join: supplier.s_suppkey = __sq_1.ps_suppkey Review Comment: these changes imply the decorrelate passes used to be applied bottom up and after this PR they are applied top-down Is that intentional? Or maybe I am misreading the diff 🤔 ########## datafusion/expr/src/logical_plan/plan.rs: ########## @@ -1327,9 +1327,9 @@ impl SubqueryAlias { #[derive(Clone)] pub struct Filter { /// The predicate expression, which must have Boolean type. - predicate: Expr, + pub predicate: Expr, /// The incoming logical plan - input: Arc<LogicalPlan>, + pub input: Arc<LogicalPlan>, Review Comment: I think it is ok -- it would be nice to add some comments explaining a `Filter` should not be created directly but instead use `try_new()` and that these fields are only `pub` to support pattern matching ########## datafusion/optimizer/src/eliminate_filter.rs: ########## @@ -40,176 +41,144 @@ impl OptimizerRule for EliminateFilter { fn try_optimize( &self, plan: &LogicalPlan, - optimizer_config: &mut OptimizerConfig, + _optimizer_config: &mut OptimizerConfig, ) -> Result<Option<LogicalPlan>> { - let predicate_and_input = match plan { - LogicalPlan::Filter(filter) => match filter.predicate() { - Expr::Literal(ScalarValue::Boolean(Some(v))) => { - Some((*v, filter.input())) + match plan { + LogicalPlan::Filter(Filter { + predicate: Expr::Literal(ScalarValue::Boolean(Some(v))), + input, + }) => { Review Comment: I wonder if you could use a match guard? ```rust LogicalPlan::Filter(filter) if matches(!filter.expr(), Expr::Literal(ScalarValue::Boolean(Some(v))))) { ... ``` ########## datafusion/optimizer/src/decorrelate_where_in.rs: ########## @@ -85,43 +86,32 @@ impl OptimizerRule for DecorrelateWhereIn { ) -> Result<Option<LogicalPlan>> { match plan { LogicalPlan::Filter(filter) => { - let predicate = filter.predicate(); - let filter_input = filter.input().as_ref(); - - // Apply optimizer rule to current input - let optimized_input = self - .try_optimize(filter_input, config)? - .unwrap_or_else(|| filter_input.clone()); - let (subqueries, other_exprs) = - self.extract_subquery_exprs(predicate, config)?; - let optimized_plan = LogicalPlan::Filter(Filter::try_new( - predicate.clone(), - Arc::new(optimized_input), - )?); + self.extract_subquery_exprs(filter.predicate(), config)?; if subqueries.is_empty() { // regular filter, no subquery exists clause here - return Ok(Some(optimized_plan)); + return Ok(None); } // iterate through all exists clauses in predicate, turning each into a join - let mut cur_input = filter_input.clone(); + let mut cur_input = filter.input().as_ref().clone(); for subquery in subqueries { cur_input = optimize_where_in(&subquery, &cur_input, &other_exprs, config)?; } Ok(Some(cur_input)) } - _ => { - // Apply the optimization to all inputs of the plan - Ok(Some(utils::optimize_children(self, plan, config)?)) - } + _ => Ok(None), } } fn name(&self) -> &str { "decorrelate_where_in" } + + fn apply_order(&self) -> Option<ApplyOrder> { Review Comment: this is a really nice pattern 👨🍳 👌 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@arrow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org