Copilot commented on code in PR #57741:
URL: https://github.com/apache/doris/pull/57741#discussion_r2522712222


##########
fe/fe-core/src/main/java/org/apache/doris/nereids/rules/exploration/mv/MaterializedViewExprReplacer.java:
##########
@@ -0,0 +1,114 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+package org.apache.doris.nereids.rules.exploration.mv;
+
+import org.apache.doris.nereids.trees.expressions.Expression;
+import org.apache.doris.nereids.trees.expressions.Slot;
+import org.apache.doris.nereids.trees.expressions.VirtualSlotReference;
+import 
org.apache.doris.nereids.trees.expressions.functions.scalar.GroupingScalarFunction;
+import 
org.apache.doris.nereids.trees.expressions.visitor.DefaultExpressionRewriter;
+import org.apache.doris.nereids.trees.plans.Plan;
+import org.apache.doris.nereids.trees.plans.algebra.Repeat;
+import org.apache.doris.nereids.util.ExpressionUtils;
+
+import java.util.ArrayList;
+import java.util.BitSet;
+import java.util.List;
+import java.util.Map;
+import java.util.Optional;
+
+/**
+ * Expression replacer which support to handle VirtualSlotReference, if 
replaceMap contained, just use it
+ */
+public class MaterializedViewExprReplacer extends 
DefaultExpressionRewriter<Void> {
+
+    private final Map<? extends Expression, ? extends Expression> replaceMap;
+    private final Plan queryTopPlan;
+    private final BitSet sourcePlanBitSet;
+    private boolean valid = true;
+
+    public MaterializedViewExprReplacer(
+            Map<? extends Expression, ? extends Expression> replaceMap,
+            Plan queryTopPlan,
+            BitSet sourcePlanBitSet) {
+        this.replaceMap = replaceMap;
+        this.queryTopPlan = queryTopPlan;
+        this.sourcePlanBitSet = sourcePlanBitSet;
+    }
+
+    public boolean isValid() {
+        return valid;
+    }
+
+    @Override
+    public Expression visit(Expression expr, Void context) {
+        if (!valid) {
+            return expr;
+        }
+        Expression replacedExpr = replaceMap.get(expr);
+        if (replacedExpr != null) {
+            return replacedExpr;
+        }
+        List<Expression> newChildren = new ArrayList<>();
+        boolean hasChanged = false;
+        for (Expression child : expr.children()) {
+            Expression newChild = child.accept(this, context);
+            if (!valid) {
+                return expr;
+            }
+            if (newChild != child) {
+                hasChanged = true;
+            }
+            newChildren.add(newChild);
+        }
+        return hasChanged ? expr.withChildren(newChildren) : expr;
+    }
+
+    @Override
+    public Expression visitSlot(Slot slot, Void context) {
+        if (!valid) {
+            return slot;
+        }
+        if (slot instanceof VirtualSlotReference) {
+            return handleVirtualSlot((VirtualSlotReference) slot, context);
+        }
+        Expression replacedExpr = replaceMap.get(slot);
+        if (replacedExpr != null) {
+            return replacedExpr;
+        }
+        valid = false;
+        return slot;
+    }
+

Review Comment:
   Missing JavaDoc for the `handleVirtualSlot` method. This is a critical 
method that implements the core logic for handling VirtualSlotReference. It 
should document:
   - The purpose of the method
   - What happens when originExpression is not present vs. present
   - The role of `shuttleExpressionWithLineage` in the context
   - Return value semantics
   
   Suggested addition:
   ```java
   /**
    * Handles VirtualSlotReference replacement for grouping sets.
    * 
    * @param virtualSlot the VirtualSlotReference to handle
    * @param context the rewrite context (currently unused)
    * @return a new VirtualSlotReference with the grouping function replaced 
according to the replaceMap,
    *         or a virtual grouping ID slot if no origin expression exists
    */
   ```
   ```suggestion
   
       /**
        * Handles VirtualSlotReference replacement for grouping sets.
        * <p>
        * If the VirtualSlotReference has an origin grouping function, rewrites 
it using lineage information
        * and returns a new virtual slot based on the rewritten function. If no 
origin expression exists,
        * returns a virtual grouping ID slot.
        * <p>
        * The method uses {@code ExpressionUtils.shuttleExpressionWithLineage} 
to rewrite the grouping function
        * according to the query plan and source plan bitset.
        *
        * @param virtualSlot the VirtualSlotReference to handle
        * @param context the rewrite context (currently unused)
        * @return a new VirtualSlotReference with the grouping function 
replaced according to the replaceMap,
        *         or a virtual grouping ID slot if no origin expression exists
        */
   ```



##########
fe/fe-core/src/main/java/org/apache/doris/nereids/rules/exploration/mv/MaterializedViewExprReplacer.java:
##########
@@ -0,0 +1,114 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+package org.apache.doris.nereids.rules.exploration.mv;
+
+import org.apache.doris.nereids.trees.expressions.Expression;
+import org.apache.doris.nereids.trees.expressions.Slot;
+import org.apache.doris.nereids.trees.expressions.VirtualSlotReference;
+import 
org.apache.doris.nereids.trees.expressions.functions.scalar.GroupingScalarFunction;
+import 
org.apache.doris.nereids.trees.expressions.visitor.DefaultExpressionRewriter;
+import org.apache.doris.nereids.trees.plans.Plan;
+import org.apache.doris.nereids.trees.plans.algebra.Repeat;
+import org.apache.doris.nereids.util.ExpressionUtils;
+
+import java.util.ArrayList;
+import java.util.BitSet;
+import java.util.List;
+import java.util.Map;
+import java.util.Optional;
+
+/**
+ * Expression replacer which support to handle VirtualSlotReference, if 
replaceMap contained, just use it

Review Comment:
   The class-level documentation comment should be more descriptive. It should 
explain:
   1. What problem this class solves (handling VirtualSlotReference in 
expression replacement for MV rewrite)
   2. The context in which it's used (materialized view query rewriting with 
group sets/CTEs)
   3. The key behaviors, especially the `valid` flag mechanism
   
   Current comment: "Expression replacer which support to handle 
VirtualSlotReference, if replaceMap contained, just use it"
   
   Suggested improvement:
   ```java
   /**
    * Expression replacer for materialized view query rewriting that handles 
VirtualSlotReference.
    * This replacer is used when rewriting queries with GROUPING SETS and CTEs 
against materialized views.
    * 
    * <p>For VirtualSlotReference (used in group sets with grouping functions), 
this replacer:
    * - Uses the replaceMap if the VirtualSlotReference is contained
    * - Otherwise, reconstructs the virtual slot by shuttling the origin 
GroupingScalarFunction
    *   through the query plan lineage and applying the replacement recursively
    * 
    * <p>The replacer maintains a `valid` flag that becomes false if any 
expression cannot be 
    * successfully replaced, allowing callers to detect incomplete rewrites.
    */
   ```
   ```suggestion
    * Expression replacer for materialized view query rewriting that handles 
VirtualSlotReference.
    * This replacer is used when rewriting queries with GROUPING SETS and CTEs 
against materialized views.
    * 
    * <p>For VirtualSlotReference (used in group sets with grouping functions), 
this replacer:
    * - Uses the replaceMap if the VirtualSlotReference is contained
    * - Otherwise, reconstructs the virtual slot by shuttling the origin 
GroupingScalarFunction
    *   through the query plan lineage and applying the replacement recursively
    * 
    * <p>The replacer maintains a {@code valid} flag that becomes false if any 
expression cannot be 
    * successfully replaced, allowing callers to detect incomplete rewrites.
   ```



##########
fe/fe-core/src/main/java/org/apache/doris/nereids/rules/exploration/mv/AbstractMaterializedViewAggregateRule.java:
##########
@@ -765,6 +767,8 @@ public Expression visitSlot(Slot slot, 
AggregateExpressionRewriteContext rewrite
                     return Repeat.generateVirtualGroupingIdSlot();
                 } else {
                     GroupingScalarFunction groupingScalarFunction = 
originExpression.get();

Review Comment:
   Missing explanation for why `shuttleExpressionWithLineage` is needed here. 
This code change adds lineage shuttling for the grouping scalar function before 
replacement, which is a key part of fixing the CTE + group sets issue.
   
   Consider adding a comment:
   ```java
   // Shuttle the grouping function through the query plan to resolve references
   // correctly when CTEs are involved, ensuring proper slot mapping
   groupingScalarFunction = (GroupingScalarFunction) 
ExpressionUtils.shuttleExpressionWithLineage(
           groupingScalarFunction, rewriteContext.getQueryTopPlan(), new 
BitSet());
   ```
   ```suggestion
                       GroupingScalarFunction groupingScalarFunction = 
originExpression.get();
                       // Shuttle the grouping function through the query plan 
to resolve references
                       // correctly when CTEs are involved, ensuring proper 
slot mapping.
   ```



##########
fe/fe-core/src/main/java/org/apache/doris/nereids/rules/exploration/mv/AbstractMaterializedViewAggregateRule.java:
##########
@@ -244,7 +244,9 @@ protected LogicalAggregate<Plan> aggregateRewriteByView(
         if (queryAggregate.getSourceRepeat().isPresent()) {
             // construct group sets for repeat
             List<List<Expression>> rewrittenGroupSetsExpressions = new 
ArrayList<>();

Review Comment:
   This change modifies how grouping sets are retrieved for queries with CTEs. 
While the logic appears correct (preferring the actual LogicalRepeat node if 
found via collectFirst, then falling back to sourceRepeat), it lacks an 
explanatory comment about why this change is necessary.
   
   Consider adding a comment explaining the issue this fixes:
   ```java
   // When queries use both CTEs and group sets, the sourceRepeat may not 
reflect
   // the actual LogicalRepeat node in the plan tree. Collect the LogicalRepeat 
directly
   // from the plan to get the correct grouping sets, falling back to 
sourceRepeat if needed.
   List<List<Expression>> groupingSets = 
queryAggregate.collectFirst(LogicalRepeat.class::isInstance)
           .map(repeat -> ((LogicalRepeat<? extends Plan>) 
repeat).getGroupingSets())
           .orElse(queryAggregate.getSourceRepeat().get().getGroupingSets());
   ```
   
   This would help future maintainers understand the fix for the CTE + group 
sets issue mentioned in the PR description.
   ```suggestion
               List<List<Expression>> rewrittenGroupSetsExpressions = new 
ArrayList<>();
               // When queries use both CTEs and grouping sets, the 
sourceRepeat may not reflect
               // the actual LogicalRepeat node in the plan tree. Collect the 
LogicalRepeat directly
               // from the plan to get the correct grouping sets, falling back 
to sourceRepeat if needed.
   ```



##########
fe/fe-core/src/main/java/org/apache/doris/nereids/rules/exploration/mv/AbstractMaterializedViewRule.java:
##########
@@ -423,7 +422,7 @@ protected List<Plan> doRewrite(StructInfo queryStructInfo, 
CascadesContext casca
                                 rewrittenPlanOutput, queryPlan.getOutput()));
                 continue;
             }
-            // Merge project
+            // Merge project if contains mult same project

Review Comment:
   The comment on line 425 has a grammatical issue. It says "Merge project if 
contains mult same project" which should be "Merge project if it contains 
multiple identical projects" or similar.
   
   Current: `// Merge project if contains mult same project`
   Suggested: `// Merge project if it contains multiple identical projects`
   ```suggestion
               // Merge project if it contains multiple identical projects
   ```



##########
fe/fe-core/src/main/java/org/apache/doris/nereids/rules/exploration/mv/AbstractMaterializedViewRule.java:
##########
@@ -641,11 +640,14 @@ protected List<Expression> rewriteExpression(List<? 
extends Expression> sourceEx
                     expressionShuttledToRewrite.collectToSet(expression -> 
expression instanceof SlotReference
                             && ((SlotReference) expression).getDataType() 
instanceof VariantType);
             extendMappingByVariant(variants, 
targetToTargetReplacementMappingQueryBased);
-            Expression replacedExpression = 
ExpressionUtils.replace(expressionShuttledToRewrite,
-                    targetToTargetReplacementMappingQueryBased);
-            Set<Expression> replacedExpressionSlotQueryUsed = 
replacedExpression.collect(slotsToRewrite::contains);
-            if (!replacedExpressionSlotQueryUsed.isEmpty()) {
-                // if contains any slot to rewrite, which means could not be 
rewritten by target,
+
+            MaterializedViewExprReplacer materializedViewExprReplacer = new 
MaterializedViewExprReplacer(
+                    targetToTargetReplacementMappingQueryBased,
+                    sourcePlan, sourcePlanBitSet);
+            Expression replacedExpression =
+                    
expressionShuttledToRewrite.accept(materializedViewExprReplacer, null);
+            if (!materializedViewExprReplacer.isValid()) {
+                // if contains any slot to rewrite, which means can not be 
rewritten by target,

Review Comment:
   The comment on line 650 has a grammatical issue. "can not" should be 
"cannot" (single word) for proper grammar.
   
   Current: `// if contains any slot to rewrite, which means can not be 
rewritten by target,`
   Suggested: `// if contains any slot to rewrite, which means cannot be 
rewritten by target,`
   ```suggestion
                   // if contains any slot to rewrite, which means cannot be 
rewritten by target,
   ```



##########
fe/fe-core/src/main/java/org/apache/doris/mtmv/MTMVPlanUtil.java:
##########
@@ -145,6 +145,7 @@ public static ConnectContext createBasicMvContext(@Nullable 
ConnectContext paren
         ctx.getSessionVariable().skipStorageEngineMerge = false;
         ctx.getSessionVariable().showHiddenColumns = false;
         ctx.getSessionVariable().allowModifyMaterializedViewData = true;

Review Comment:
   The addition of `skipPrunePredicate = true` lacks explanation in the code. 
While this change is necessary for the fix (based on the PR description), it's 
not clear why skipping predicate pruning is needed for materialized views.
   
   Consider adding a comment explaining why this is necessary:
   ```java
   // Skip predicate pruning to preserve all predicates needed for materialized 
view rewriting,
   // especially when dealing with complex scenarios like CTEs and group sets
   ctx.getSessionVariable().skipPrunePredicate = true;
   ```
   
   This would help future maintainers understand the purpose of this setting.
   ```suggestion
           ctx.getSessionVariable().allowModifyMaterializedViewData = true;
           // Skip predicate pruning to preserve all predicates needed for 
materialized view rewriting,
           // especially when dealing with complex scenarios like CTEs and 
group sets.
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to