alamb commented on code in PR #11371: URL: https://github.com/apache/datafusion/pull/11371#discussion_r1675797038
########## datafusion/expr/src/planner.rs: ########## @@ -161,6 +162,28 @@ pub trait ExprPlanner: Send + Sync { ) -> Result<PlannerResult<Vec<Expr>>> { Ok(PlannerResult::Original(args)) } + + /// Plans a `RawAggregateUDF` based on the given input expressions. + /// + /// Returns a `PlannerResult` containing either the planned aggregate function or the original + /// input expressions if planning is not possible. + fn plan_aggregate_udf( + &self, + aggregate_function: RawAggregateUDF, + ) -> Result<PlannerResult<RawAggregateUDF>> { + Ok(PlannerResult::Original(aggregate_function)) + } +} + +// An `AggregateUDF` to be planned. +#[derive(Debug, Clone)] +pub struct RawAggregateUDF { Review Comment: I agree that Plannable sounds better (perhaps we can propose the change as a follow on PR)? What do you think @samuelcolvin and @jayzhan211 ? ########## datafusion/expr/src/planner.rs: ########## @@ -161,6 +162,28 @@ pub trait ExprPlanner: Send + Sync { ) -> Result<PlannerResult<Vec<Expr>>> { Ok(PlannerResult::Original(args)) } + + /// Plans a `RawAggregateUDF` based on the given input expressions. + /// + /// Returns a `PlannerResult` containing either the planned aggregate function or the original + /// input expressions if planning is not possible. + fn plan_aggregate_udf( + &self, + aggregate_function: RawAggregateUDF, + ) -> Result<PlannerResult<RawAggregateUDF>> { + Ok(PlannerResult::Original(aggregate_function)) Review Comment: I think the conversion from `count(1)` to `count()` is more like an optinization (rather than something for hte sql planner) as it would apply equally to SQL and to dataframe apis Perhaps we could add it as part of https://docs.rs/datafusion/latest/datafusion/logical_expr/trait.AggregateUDFImpl.html#method.simplify 🤔 ########## datafusion/sql/src/expr/function.rs: ########## @@ -349,13 +350,31 @@ impl<'a, S: ContextProvider> SqlToRel<'a, S> { .map(|e| self.sql_expr_to_logical_expr(*e, schema, planner_context)) .transpose()? .map(Box::new); - return Ok(Expr::AggregateFunction(expr::AggregateFunction::new_udf( - fm, + + let raw_aggregate_function = RawAggregateUDF { + udf, args, distinct, filter, order_by, null_treatment, + }; + + for planner in self.planners.iter() { Review Comment: this means that any planner extensions would have precidence over the built in aggregate functions from the Registry. I think that is ok -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For additional commands, e-mail: github-h...@datafusion.apache.org