ClSlaid commented on code in PR #10460:
URL: https://github.com/apache/datafusion/pull/10460#discussion_r1597542555


##########
datafusion/optimizer/src/replace_distinct_aggregate.rs:
##########
@@ -88,60 +94,72 @@ impl OptimizerRule for ReplaceDistinctWithAggregate {
                 input,
                 schema,
             })) => {
+                let expr_cnt = on_expr.len();
+
                 // Construct the aggregation expression to be used to fetch 
the selected expressions.
-                let aggr_expr = select_expr
-                    .iter()
-                    .map(|e| {
-                        Expr::AggregateFunction(AggregateFunction::new(
-                            AggregateFunctionFunc::FirstValue,
-                            vec![e.clone()],
-                            false,
-                            None,
-                            sort_expr.clone(),
-                            None,
-                        ))
-                    })
-                    .collect::<Vec<Expr>>();
+                let aggr_expr = 
vec![Expr::AggregateFunction(AggregateFunction::new(
+                    AggregateFunctionFunc::FirstValue,
+                    select_expr,
+                    false,
+                    None,
+                    sort_expr.clone(),
+                    None,
+                ))];
+
+                let aggr_expr = normalize_cols(aggr_expr, input.as_ref())?;
+                let group_expr = normalize_cols(on_expr, input.as_ref())?;
 
                 // Build the aggregation plan
-                let plan = LogicalPlanBuilder::from(input.as_ref().clone())
-                    .aggregate(on_expr.clone(), aggr_expr.to_vec())?
-                    .build()?;
+                let plan = LogicalPlan::Aggregate(Aggregate::try_new(
+                    input, group_expr, aggr_expr,
+                )?);
+                let lpb = LogicalPlanBuilder::from(plan);

Review Comment:
   The context is that, I extract this logic totally from 
`LogicalPlanBuilder::aggregate`. It might introduce a problem that if 
`LogicalPlanBuilder::aggregate` changes in the future, the code here stays 
still and frustrates people to find and rewrite.
   
   The reason why I extract the logic, is that the `LogicalPlanBuilder` accepts 
a `LogicalPlan`. So that when you'd like to build an aggregate plan with 
`LogicalPlanBuilder`, it will go through:
   
   ```rust
   // the builder accepts LogicalPlan only, so a clone needed
   let mut builder = 
LogicalPlanBuilder::from(Arc<LogicalPlan>.as_ref().clone());
   // then you get:
   builder == LogicalPlanBuilder { LogicalPlan };
   
   // making an aggregate plan
   builder.aggregate(..);
   // this will call Arc::new() on the LogicalPlan contained.
   // another clone here
   LogicalPlanBuilder { AggrPlan { Arc<LogicalPlan>, ... } }
   ```
   
   Since many of our `LogicalPlan`s could contain an `Arc<LogicalPlan>`, and 
for such occasion, I wonder if `Into<Arc<LogicalPlan>>` could save clones, 
while still allows us to use `LogicalPlanBuilder`, instead of extracting logic.
   
   ```rust
   struct LogicalPlanBuilder<LP> {
       plan: LP,
   }
   impl<LP: Into<Arc<LogicalPlan>>> From<LP> for LogicalPlanBuilder<LP> {
       // ...
   }
   impl LogicalPlan<LP: Into<Arc<LogicalPlan>>> {
       fn aggregate(self, group_exprs, aggr_exprs) -> Self {
           // ...
           AggrPlan::try_new(
               // ...
               self.plan.into(), // input: Arc<LogicalPlan>
               // ...
           );
           // ...
       }
   }
   ```
   Because `Arc<T>` has `From<T>` implemented, so when we need `Arc::new`, just 
call `into` on `self.plan`, this should save times of clones.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org

Reply via email to