github-actions[bot] commented on code in PR #64820:
URL: https://github.com/apache/doris/pull/64820#discussion_r3524117855


##########
fe/fe-core/src/main/java/org/apache/doris/nereids/rules/rewrite/eageraggregation/PushDownAggregation.java:
##########
@@ -143,14 +141,17 @@ public Plan visitLogicalAggregate(LogicalAggregate<? 
extends Plan> agg, JobConte
                     if (aggFunction.containsVolatileExpression()) {
                         return agg;
                     }
-                    // CaseWhen and If (which CASE WHEN is normalized into) 
must both be checked.
-                    // When an agg function contains an If/CaseWhen whose 
condition tests IS NULL
-                    // (e.g. count(if(col IS NULL, value, NULL))), pushing it 
to the nullable side
-                    // of an outer join produces wrong results: null-extended 
rows make "col IS NULL"
-                    // TRUE at the top level, but the pre-aggregated count 
slot becomes NULL after
-                    // null-extension, and ifnull(sum(NULL), 0) = 0 instead of 
the correct 1.
-                    if (!hasCaseWhen && aggFunction.anyMatch(e -> e instanceof 
CaseWhen || e instanceof If)) {
-                        hasCaseWhen = true;
+                    // NullToNonNullFunction / AlwaysNotNullable: expressions 
that can convert NULL
+                    // input to non-NULL output (e.g. COALESCE, NVL, IF, CASE 
WHEN, Array).
+                    // When an agg function contains such an expression 
wrapping a column from the
+                    // nullable side of an outer join, null-extended rows 
would produce non-NULL values
+                    // that get counted by the aggregation. But the 
pre-aggregation on the base table
+                    // cannot see null-extended rows (they are produced by the 
join), so the push-down
+                    // would lose those contributions — producing wrong 
results.
+                    if (!containsNullToNonNull
+                            && aggFunction.children().anyMatch(

Review Comment:
   `children()` returns `java.util.List<Expression>`, so this new call does not 
compile; the same pattern was added in 
`EagerAggRewriter.createContextFromProject`. Please use a stream or helper at 
both call sites. Also, do not stop at `children().stream().anyMatch(...)`: that 
would only inspect the direct aggregate arguments, so wrappers like 
`count(cast(coalesce(t2.name, 'x') as string))` would see only `Cast` here and 
miss the nested `Coalesce`, allowing the nullable-side outer-join pushdown that 
this PR is trying to prevent. A safe shape is to iterate the aggregate 
arguments but recurse inside each argument, e.g. 
`aggFunction.children().stream().anyMatch(arg -> arg.anyMatch(e -> 
NullToNonNullFunction.canConvertNullToNonNull((Expression) e)))`, so the 
aggregate function node itself is skipped while nested null-to-non-null 
expressions are still detected.



##########
fe/fe-core/src/main/java/org/apache/doris/nereids/rules/rewrite/eageraggregation/EagerAggRewriter.java:
##########
@@ -383,21 +383,22 @@ private PushDownAggContext createContextFromProject(
             aggFunctions.add(newAggFunc);
         }
         // After pushing expressions past the project, the agg functions may 
now
-        // contain If/CaseWhen that were hidden behind slot references before.
-        // e.g. count(#slot) where #slot = if(cond, a, b) in the project.
-        // We must re-check and update hasCaseWhen accordingly.
-        boolean newHasCaseWhen = context.hasCaseWhen;
-        if (!newHasCaseWhen) {
+        // contain NullToNonNull expressions that were hidden behind slot 
references before.
+        // e.g. count(#slot) where #slot = coalesce(a, 0) in the project.
+        // We must re-check and update containsNullToNonNull accordingly.
+        boolean newContainsNullToNonNull = context.containsNullToNonNull;
+        if (!newContainsNullToNonNull) {
             for (AggregateFunction aggFunc : aggFunctions) {
-                if (aggFunc.anyMatch(e -> e instanceof CaseWhen || e 
instanceof If)) {
-                    newHasCaseWhen = true;
+                if (aggFunc.children().anyMatch(

Review Comment:
   Same issue as in `PushDownAggregation`: `children()` returns a `List`, so 
this does not compile. When fixing it, please keep the traversal recursive 
under each aggregate argument rather than only checking direct children; after 
project substitution, wrappers around `NullToNonNullFunction` expressions still 
need to block nullable-side outer-join pushdown.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to