xiangfu0 commented on code in PR #18513:
URL: https://github.com/apache/pinot/pull/18513#discussion_r3281051207


##########
pinot-query-planner/src/main/java/org/apache/pinot/calcite/rel/rules/PinotQueryRuleSets.java:
##########
@@ -141,9 +141,21 @@ private PinotQueryRuleSets() {
       // push aggregate functions through join, disabled by default
       AggregateJoinTransposeRule.Config.EXTENDED
           
.withDescription(PlannerRuleNames.AGGREGATE_JOIN_TRANSPOSE_EXTENDED).toRule(),
-      // aggregate union rule
+      // AggregateUnionAggregate and AggregateUnionTranspose are inverses of 
each other and the defaults are chosen to
+      // optimize for distributed-OLAP cost (shuffle dominates compute):
+      //   - AggregateUnionTranspose (default-on) pushes the aggregate into 
every UNION ALL branch, so each branch
+      //     ships pre-aggregated rows across the next exchange instead of raw 
rows. With low-cardinality group keys
+      //     (the common analytics case) this trades a cheap extra per-branch 
aggregate for a much smaller shuffle.
+      //   - AggregateUnionAggregate (default-off) does the opposite: it 
collapses Agg(Union(Agg(A), B)) into
+      //     Agg(Union(A, B)), which loses the pre-aggregation on A and forces 
raw rows through the union's exchange.
+      //     We keep it available behind `usePlannerRules` for 
embedded/in-memory deployments where shuffle is free,
+      //     but enabling it alongside the default-on Transpose just undoes 
Transpose's work (see the order below).
       AggregateUnionAggregateRule.Config.DEFAULT
           
.withDescription(PlannerRuleNames.AGGREGATE_UNION_AGGREGATE).toRule(),
+      // Registered after AggregateUnionAggregate on purpose: if both are 
enabled, this phase runs last so Transpose
+      // wins and any merge AggregateUnionAggregate did first gets pushed back 
into the branches.
+      PinotAggregateUnionTransposeRule

Review Comment:
   Registering the new default-on transpose here changes the behavior of the 
existing `usePlannerRules='AggregateUnionAggregate'` opt-in. That rule can 
still fire, but this phase immediately reconstructs the branch-local aggregates 
again unless callers also discover they need 
`skipPlannerRules='AggregateUnionTranspose'`. That is a backward-incompatible 
change to an existing query option; please preserve the old opt-in contract or 
add a compatibility path.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to