github-actions[bot] commented on code in PR #64330:
URL: https://github.com/apache/doris/pull/64330#discussion_r3381697793


##########
fe/fe-core/src/main/java/org/apache/doris/nereids/rules/rewrite/PullUpProjectExprUnderTopN.java:
##########
@@ -534,8 +556,34 @@ private static LogicalProject<? extends Plan> 
simplifyProject(
             if (!pulledUpExprIds.contains(ne.getExprId())
                     && !isUnavailablePullUpSlot(ne, context, 
childOutputExprIds)) {
                 NamedExpression resolved = resolveNamedExpression(ne, context, 
childOutputExprIds);
-                simplified.add(resolved);
-                existingExprIds.add(resolved.getExprId());
+
+                // If the resolved expression can be provided by the upper
+                // project (via chain resolution: v1#4 → SlotRef#10 →
+                // ElementAt(sa#1,'fa')), skip it in the lower project to
+                // avoid duplicate computation. Only keep the base slots
+                // so they pass through the TopN for lazy materialization.
+                boolean pulledAbove = false;
+                if (resolved != ne) {
+                    Expression replaceExpr = getPullUpReplaceExpression(
+                            resolved.toSlot(), context);

Review Comment:
   This can remove a slot that TopN still needs for ordering. `collectFromNode` 
now records every `Alias(slot)` chain even when the alias ExprId is blocked, 
for example when it is a TopN order key. A concrete failing shape is 
`TopN(order by y) -> Project(y = x) -> Project(x = a + 1) -> Scan`: the 
collector records `y -> x` and `x -> a + 1`; this branch then treats `y` as 
pulled above, keeps only `a` below TopN, and leaves the TopN order key 
referencing `y` even though its child no longer outputs `y`. That reintroduces 
the slot-not-from-child failure, or worse changes ordering if a later rule 
rewrites around it. Please keep blocked aliases below TopN, or avoid 
registering/using chain replacements for blocked ExprIds.



##########
fe/fe-core/src/main/java/org/apache/doris/nereids/rules/rewrite/PullUpProjectExprUnderTopN.java:
##########
@@ -634,17 +704,28 @@ private static LogicalProject<Plan> 
addUpperProject(LogicalTopN topN, PullUpInfo
                         addPassThroughSlots(upperOutput, upperOutputExprIds, 
passThroughOutputExprIds,
                                 currentOutputByExprId, passThroughSlots);
                     } else {
-                        // Slot was lost during simplifyProject — pass through 
directly.
-                        // TopN is a pass-through node; the computation for 
this slot
-                        // exists below the TopN even if the intermediate 
project lost it.
-                        if (upperOutputExprIds.add(origSlot.getExprId())) {
-                            upperOutput.add(origSlot);
-                        }
+                        // When NestedColumnPruning has run before this rule, 
the
+                        // originalTopNOutput may contain intermediate 
expression
+                        // slots (element_at results) that were simplified 
away by
+                        // simplifyProject. These slots are no longer produced 
by
+                        // the rewritten TopN's child and would cause 
CheckAfterRewrite
+                        // failures if passed through. Skip them — the 
corresponding
+                        // computation is already covered by the pulled-up 
Aliases
+                        // or by the base slots added during simplification.
                     }
                 }
             }

Review Comment:
   `addUpperProject` should preserve the original TopN output, but this loop 
appends implementation-only base slots introduced below TopN. In the added 
test, the original TopN output is `[v1, v2, v3, idA, idB]`; after 
simplification the current TopN output also contains base slots such as `a` and 
`c`, so this upper project exposes extra columns. That can leak from a 
root/subquery/CTE subtree and also hides the regression because the test only 
checks that `v1` and `v3` exist, not that the output list is unchanged. The 
base slots need to remain inputs used by the upper project, not become 
additional outputs unless a parent explicitly requires them.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to