ClSlaid commented on code in PR #10460:
URL: https://github.com/apache/datafusion/pull/10460#discussion_r1597542555
##########
datafusion/optimizer/src/replace_distinct_aggregate.rs:
##########
@@ -88,60 +94,72 @@ impl OptimizerRule for ReplaceDistinctWithAggregate {
input,
schema,
})) => {
+ let expr_cnt = on_expr.len();
+
// Construct the aggregation expression to be used to fetch
the selected expressions.
- let aggr_expr = select_expr
- .iter()
- .map(|e| {
- Expr::AggregateFunction(AggregateFunction::new(
- AggregateFunctionFunc::FirstValue,
- vec![e.clone()],
- false,
- None,
- sort_expr.clone(),
- None,
- ))
- })
- .collect::<Vec<Expr>>();
+ let aggr_expr =
vec![Expr::AggregateFunction(AggregateFunction::new(
+ AggregateFunctionFunc::FirstValue,
+ select_expr,
+ false,
+ None,
+ sort_expr.clone(),
+ None,
+ ))];
+
+ let aggr_expr = normalize_cols(aggr_expr, input.as_ref())?;
+ let group_expr = normalize_cols(on_expr, input.as_ref())?;
// Build the aggregation plan
- let plan = LogicalPlanBuilder::from(input.as_ref().clone())
- .aggregate(on_expr.clone(), aggr_expr.to_vec())?
- .build()?;
+ let plan = LogicalPlan::Aggregate(Aggregate::try_new(
+ input, group_expr, aggr_expr,
+ )?);
+ let lpb = LogicalPlanBuilder::from(plan);
Review Comment:
The context is that, I extract this logic totally from
`LogicalPlanBuilder::aggregate`. It might introduce a problem that if
`LogicalPlanBuilder::aggregate` changes in the future, the code here stays
still and frustrates people to find and rewrite.
The reason why I extract the logic, is that the `LogicalPlanBuilder` accepts
a `LogicalPlan`. So that when you'd like to build an aggregate plan with
`LogicalPlanBuilder`, it will go through:
```rust
// the builder accepts LogicalPlan only, so a clone needed
let mut builder =
LogicalPlanBuilder::from(Arc<LogicalPlan>.as_ref().clone());
// then you get:
builder == LogicalPlanBuilder { LogicalPlan };
// making an aggregate plan
builder.aggregate(..);
// this will call Arc::new() on the LogicalPlan contained.
// another clone here
LogicalPlanBuilder { AggrPlan { Arc<LogicalPlan>, ... } }
```
Since many of our `LogicalPlan`s could contain an `Arc<LogicalPlan>`, and
for such occasion, I wonder if `Into<Arc<LogicalPlan>>` could save clones,
while still allows us to use `LogicalPlanBuilder`, instead of extracting logic.
```rust
struct LogicalPlanBuilder<PB> {
plan: PB,
}
impl<PB: Into<Arc<LogicalPlan>>> From<PB> for LogicalPlanBuilder<PB> {
// ...
}
impl LogicalPlan<PB: Into<Arc<LogicalPlan>>> {
fn aggregate(self, group_exprs, aggr_exprs) -> Self {
// ...
AggrPlan::try_new(
// ...
self.plan.into(), // input: Arc<LogicalPlan>
// ...
);
// ...
}
}
```
Because `Arc<T>` has `From<T>` implemented, so when we need `Arc::new`, just
call `into` on `self.plan`, this should save times of clones.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]