alamb commented on code in PR #11371:
URL: https://github.com/apache/datafusion/pull/11371#discussion_r1675797038
##########
datafusion/expr/src/planner.rs:
##########
@@ -161,6 +162,28 @@ pub trait ExprPlanner: Send + Sync {
) -> Result<PlannerResult<Vec<Expr>>> {
Ok(PlannerResult::Original(args))
}
+
+ /// Plans a `RawAggregateUDF` based on the given input expressions.
+ ///
+ /// Returns a `PlannerResult` containing either the planned aggregate
function or the original
+ /// input expressions if planning is not possible.
+ fn plan_aggregate_udf(
+ &self,
+ aggregate_function: RawAggregateUDF,
+ ) -> Result<PlannerResult<RawAggregateUDF>> {
+ Ok(PlannerResult::Original(aggregate_function))
+ }
+}
+
+// An `AggregateUDF` to be planned.
+#[derive(Debug, Clone)]
+pub struct RawAggregateUDF {
Review Comment:
I agree that Plannable sounds better (perhaps we can propose the change as a
follow on PR)?
What do you think @samuelcolvin and @jayzhan211 ?
##########
datafusion/expr/src/planner.rs:
##########
@@ -161,6 +162,28 @@ pub trait ExprPlanner: Send + Sync {
) -> Result<PlannerResult<Vec<Expr>>> {
Ok(PlannerResult::Original(args))
}
+
+ /// Plans a `RawAggregateUDF` based on the given input expressions.
+ ///
+ /// Returns a `PlannerResult` containing either the planned aggregate
function or the original
+ /// input expressions if planning is not possible.
+ fn plan_aggregate_udf(
+ &self,
+ aggregate_function: RawAggregateUDF,
+ ) -> Result<PlannerResult<RawAggregateUDF>> {
+ Ok(PlannerResult::Original(aggregate_function))
Review Comment:
I think the conversion from `count(1)` to `count()` is more like an
optinization (rather than something for hte sql planner) as it would apply
equally to SQL and to dataframe apis
Perhaps we could add it as part of
https://docs.rs/datafusion/latest/datafusion/logical_expr/trait.AggregateUDFImpl.html#method.simplify
🤔
##########
datafusion/sql/src/expr/function.rs:
##########
@@ -349,13 +350,31 @@ impl<'a, S: ContextProvider> SqlToRel<'a, S> {
.map(|e| self.sql_expr_to_logical_expr(*e, schema,
planner_context))
.transpose()?
.map(Box::new);
- return
Ok(Expr::AggregateFunction(expr::AggregateFunction::new_udf(
- fm,
+
+ let raw_aggregate_function = RawAggregateUDF {
+ udf,
args,
distinct,
filter,
order_by,
null_treatment,
+ };
+
+ for planner in self.planners.iter() {
Review Comment:
this means that any planner extensions would have precidence over the built
in aggregate functions from the Registry. I think that is ok
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]