github-actions[bot] commented on code in PR #64849:
URL: https://github.com/apache/doris/pull/64849#discussion_r3473312876
##########
fe/fe-core/src/main/java/org/apache/doris/nereids/rules/rewrite/EliminateGroupByKey.java:
##########
@@ -79,28 +81,59 @@ public List<Rule> buildRules() {
}
LogicalAggregate<Plan> eliminateGroupByKey(LogicalAggregate<? extends
Plan> agg, Set<Slot> requireOutput) {
- Set<Expression> removeExpression = findCanBeRemovedExpressions(agg,
requireOutput,
+ FindResult result = findCanBeRemovedExpressionsInternal(agg,
requireOutput,
agg.child().getLogicalProperties().getTrait());
+ Set<Expression> removeExpression = result.removeExpression;
+ Set<Expression> wrapWithAnyValue = result.wrapWithAnyValue;
+
List<Expression> newGroupExpression = new ArrayList<>();
for (Expression expression : agg.getGroupByExpressions()) {
- if (!removeExpression.contains(expression)) {
+ if (!removeExpression.contains(expression)
+ && !wrapWithAnyValue.contains(expression)) {
newGroupExpression.add(expression);
}
}
List<NamedExpression> newOutput = new ArrayList<>();
for (NamedExpression expression : agg.getOutputExpressions()) {
- if (!removeExpression.contains(expression)) {
- newOutput.add(expression);
+ if (removeExpression.contains(expression)) {
+ continue;
+ }
+ if (wrapWithAnyValue.contains(expression)) {
+ // expression is FD-redundant but needed in output: wrap with
any_value
+ expression = new Alias(expression.getExprId(),
+ new AnyValue(expression.toSlot()),
expression.getName());
}
+ newOutput.add(expression);
}
return agg.withGroupByAndOutput(newGroupExpression, newOutput);
}
Review Comment:
Returning `wrapWithAnyValue` here leaks the new direct-rule behavior into
the old public API. `PushDownAggThroughJoinOnPkFk.eliminatePrimaryOutput()`
still treats this method's result as safe to remove from the pushed-down
aggregate GROUP BY and does not wrap retained outputs. For example, in a PK/FK
join where the foreign side also has `foreign.id2 -> foreign.name`, a query
grouped by `pri.id1, foreign.id2, foreign.name` and selecting `foreign.name`
will mark `foreign.name` as wrap-only here. The pushdown caller then drops it
from `minGroupBySlotList`, rewrites `pri.id1` to the foreign key, and keeps the
`foreign.name` output as a bare slot because it is not a primary-side output.
The new aggregate below the join can therefore output `foreign.name` without
grouping by it or wrapping it in `any_value`. Please keep this public helper
returning only expressions that are removable by existing external callers, or
update the PK/FK pushdown path to consume the split result and add the
same `ANY_VALUE` output rewrite.
##########
fe/fe-core/src/test/java/org/apache/doris/nereids/rules/rewrite/EliminateGroupByKeyTest.java:
##########
@@ -181,6 +181,24 @@ void testEliminateByEqual() {
&&
agg.getGroupByExpressions().get(0).toSql().equals("name")));
}
+ @Test
+ void testEliminateByPkWithOutputNeeded() throws Exception {
+ // Regression: when a group-by key (name) is FD-redundant (id -> name)
+ // but still appears in SELECT, it should be removed from group-by
+ // and wrapped with ANY_VALUE in the output.
+ addConstraint("alter table t1 add constraint pk2 primary key (id)");
+ PlanChecker.from(connectContext)
+ .analyze("select id, name, count(*) from t1 group by id, name")
+ .rewrite()
+ .printlnTree()
+ .matches(logicalAggregate().when(agg ->
+ agg.getGroupByExpressions().size() == 1
+ &&
agg.getGroupByExpressions().get(0).toSql().equals("id")
+ &&
agg.getOutputExpressions().stream().anyMatch(
+ e -> e.child(0) instanceof
org.apache.doris.nereids.trees.expressions.functions.agg.AnyValue)));
Review Comment:
This assertion can fail before it proves the new rewrite.
`getOutputExpressions()` also contains leaf `Slot` outputs such as the grouped
`id`, and `Slot` has no children while `child(0)` is a direct list access. If
the stream visits that slot before the `ANY_VALUE` alias, the matcher throws
`IndexOutOfBoundsException` instead of checking the plan. Please guard the
shape first, for example by importing `Alias`/`AnyValue` and checking `e
instanceof Alias && e.child(0) instanceof AnyValue` (ideally also checking it
is the rewritten `name` output).
##########
fe/fe-core/src/main/java/org/apache/doris/nereids/rules/rewrite/EliminateGroupByKey.java:
##########
@@ -79,28 +81,59 @@ public List<Rule> buildRules() {
}
LogicalAggregate<Plan> eliminateGroupByKey(LogicalAggregate<? extends
Plan> agg, Set<Slot> requireOutput) {
- Set<Expression> removeExpression = findCanBeRemovedExpressions(agg,
requireOutput,
+ FindResult result = findCanBeRemovedExpressionsInternal(agg,
requireOutput,
agg.child().getLogicalProperties().getTrait());
+ Set<Expression> removeExpression = result.removeExpression;
+ Set<Expression> wrapWithAnyValue = result.wrapWithAnyValue;
+
List<Expression> newGroupExpression = new ArrayList<>();
for (Expression expression : agg.getGroupByExpressions()) {
- if (!removeExpression.contains(expression)) {
+ if (!removeExpression.contains(expression)
+ && !wrapWithAnyValue.contains(expression)) {
newGroupExpression.add(expression);
Review Comment:
This creates the exact ExprId shape that the parallel uniform-key rewrite
avoids: for a wrapped slot `name#x`, the aggregate output becomes
`any_value(name#x) AS name#x`, so the same ExprId names both the child input
slot and a new aggregate output expression. `EliminateGroupByKeyByUniform` has
a local comment explaining that `any_value(a#0) AS a#0` violates ExprId
uniqueness and can break modules such as MV rewrite, which is why that rule
allocates a fresh alias ExprId and rewrites upper references. Please follow the
same pattern here, or otherwise keep the original grouping key until the upper
references can be rewritten safely.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]