github-actions[bot] commented on code in PR #63763:
URL: https://github.com/apache/doris/pull/63763#discussion_r3497151609
##########
fe/fe-core/src/main/java/org/apache/doris/nereids/rules/rewrite/AggScalarSubQueryToWindowFunction.java:
##########
@@ -362,17 +510,260 @@ private Plan rewrite(LogicalFilter<? extends Plan>
filter, LogicalApply<Plan, Pl
windowFilterConjunct = ExpressionUtils.replace(windowFilterConjunct,
ImmutableMap.of(aggOut.toSlot(), aggOutExpr));
- LogicalFilter<Plan> newFilter =
filter.withConjunctsAndChild(conjuncts.get(true), apply.left());
+ // Split uncorrelated conjuncts: predicates that reference ONLY shared
+ // relation slots (tables appearing in both outer and inner plans) must
+ // stay ABOVE the window. Otherwise the window function would see a
+ // different set of rows than the original scalar subquery.
+ //
+ // For example, with fact(f) as shared table and dim(d) as outer-only:
+ // f.v > 6 → shared-only → must stay above the window
+ // d.tag > 0 → outer-only → safe below the window
+ // f.k = d.k → join cond → needed below the window
+ //
+ // We find shared tables by comparing table IDs that appear in both
+ // outer and inner plans, then collect ALL output slots of those
+ // tables (not just columns referenced in the inner query).
+ List<CatalogRelation> outerRels = outerPlans.stream()
+ .filter(CatalogRelation.class::isInstance)
+ .map(CatalogRelation.class::cast)
+ .collect(Collectors.toList());
+ List<CatalogRelation> innerRels = innerPlans.stream()
+ .filter(CatalogRelation.class::isInstance)
+ .map(CatalogRelation.class::cast)
+ .collect(Collectors.toList());
+ Set<Long> innerTableIds = innerRels.stream()
+ .map(r -> r.getTable().getId())
+ .collect(Collectors.toSet());
+ Set<ExprId> sharedOuterExprIds = outerRels.stream()
+ .filter(r -> innerTableIds.contains(r.getTable().getId()))
+ .flatMap(r -> r.getOutput().stream())
+ .map(Slot::getExprId)
+ .collect(Collectors.toSet());
+ Set<Expression> uncorrelatedConjuncts = conjuncts.get(true);
+ Set<Expression> belowWindowConjuncts = Sets.newHashSet();
+ Set<Expression> aboveWindowConjuncts = Sets.newHashSet();
+ if (uncorrelatedConjuncts != null) {
+ for (Expression conj : uncorrelatedConjuncts) {
+ // Conjuncts that were matched against inner subquery filter
+ // conjuncts (tracked by checkFilter) must stay BELOW the
+ // window. They are semantically part of the inner aggregate's
+ // filter, not extra outer-only predicates. Placing them above
+ // the window would let the window see more rows than the
+ // original scalar subquery, producing wrong aggregate results.
+ if (matchedInnerFilterConjuncts.contains(conj)) {
+ belowWindowConjuncts.add(conj);
+ continue;
+ }
+ // Volatile predicates (e.g. random() > 0.5) must stay ABOVE
+ // the window. Pushing them below would let the window
+ // aggregate over a different set of rows per partition than
+ // the original scalar subquery. PushDownFilterThroughWindow
+ // has the same hazard and explicitly rejects volatile
+ // predicates for this reason.
+ if (conj.containsVolatileExpression()) {
Review Comment:
Top-level `NoneMovableFunction` conjuncts still fall through this split. The
matched-inner guard and the nested-filter path now check
`containsType(NoneMovableFunction.class)`, but a direct outer `WHERE` predicate
such as `assert_true(d.tag > 0, 'bad')` is not volatile and references only the
outer-only table, so `hasShared` stays false and the code adds it to
`belowWindowConjuncts`. That moves the assertion from the original top filter
to below the `LogicalWindow`/join rewrite, even though `NoneMovableFunction`'s
contract says these expressions should not be moved or pruned. Please treat
`NoneMovableFunction` like the volatile case here, or reject the rewrite, and
cover the direct top-level `WHERE` case in a test.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]