Copilot commented on code in PR #57741: URL: https://github.com/apache/doris/pull/57741#discussion_r2522712222
########## fe/fe-core/src/main/java/org/apache/doris/nereids/rules/exploration/mv/MaterializedViewExprReplacer.java: ########## @@ -0,0 +1,114 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.nereids.rules.exploration.mv; + +import org.apache.doris.nereids.trees.expressions.Expression; +import org.apache.doris.nereids.trees.expressions.Slot; +import org.apache.doris.nereids.trees.expressions.VirtualSlotReference; +import org.apache.doris.nereids.trees.expressions.functions.scalar.GroupingScalarFunction; +import org.apache.doris.nereids.trees.expressions.visitor.DefaultExpressionRewriter; +import org.apache.doris.nereids.trees.plans.Plan; +import org.apache.doris.nereids.trees.plans.algebra.Repeat; +import org.apache.doris.nereids.util.ExpressionUtils; + +import java.util.ArrayList; +import java.util.BitSet; +import java.util.List; +import java.util.Map; +import java.util.Optional; + +/** + * Expression replacer which support to handle VirtualSlotReference, if replaceMap contained, just use it + */ +public class MaterializedViewExprReplacer extends DefaultExpressionRewriter<Void> { + + private final Map<? extends Expression, ? extends Expression> replaceMap; + private final Plan queryTopPlan; + private final BitSet sourcePlanBitSet; + private boolean valid = true; + + public MaterializedViewExprReplacer( + Map<? extends Expression, ? extends Expression> replaceMap, + Plan queryTopPlan, + BitSet sourcePlanBitSet) { + this.replaceMap = replaceMap; + this.queryTopPlan = queryTopPlan; + this.sourcePlanBitSet = sourcePlanBitSet; + } + + public boolean isValid() { + return valid; + } + + @Override + public Expression visit(Expression expr, Void context) { + if (!valid) { + return expr; + } + Expression replacedExpr = replaceMap.get(expr); + if (replacedExpr != null) { + return replacedExpr; + } + List<Expression> newChildren = new ArrayList<>(); + boolean hasChanged = false; + for (Expression child : expr.children()) { + Expression newChild = child.accept(this, context); + if (!valid) { + return expr; + } + if (newChild != child) { + hasChanged = true; + } + newChildren.add(newChild); + } + return hasChanged ? expr.withChildren(newChildren) : expr; + } + + @Override + public Expression visitSlot(Slot slot, Void context) { + if (!valid) { + return slot; + } + if (slot instanceof VirtualSlotReference) { + return handleVirtualSlot((VirtualSlotReference) slot, context); + } + Expression replacedExpr = replaceMap.get(slot); + if (replacedExpr != null) { + return replacedExpr; + } + valid = false; + return slot; + } + Review Comment: Missing JavaDoc for the `handleVirtualSlot` method. This is a critical method that implements the core logic for handling VirtualSlotReference. It should document: - The purpose of the method - What happens when originExpression is not present vs. present - The role of `shuttleExpressionWithLineage` in the context - Return value semantics Suggested addition: ```java /** * Handles VirtualSlotReference replacement for grouping sets. * * @param virtualSlot the VirtualSlotReference to handle * @param context the rewrite context (currently unused) * @return a new VirtualSlotReference with the grouping function replaced according to the replaceMap, * or a virtual grouping ID slot if no origin expression exists */ ``` ```suggestion /** * Handles VirtualSlotReference replacement for grouping sets. * <p> * If the VirtualSlotReference has an origin grouping function, rewrites it using lineage information * and returns a new virtual slot based on the rewritten function. If no origin expression exists, * returns a virtual grouping ID slot. * <p> * The method uses {@code ExpressionUtils.shuttleExpressionWithLineage} to rewrite the grouping function * according to the query plan and source plan bitset. * * @param virtualSlot the VirtualSlotReference to handle * @param context the rewrite context (currently unused) * @return a new VirtualSlotReference with the grouping function replaced according to the replaceMap, * or a virtual grouping ID slot if no origin expression exists */ ``` ########## fe/fe-core/src/main/java/org/apache/doris/nereids/rules/exploration/mv/MaterializedViewExprReplacer.java: ########## @@ -0,0 +1,114 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.nereids.rules.exploration.mv; + +import org.apache.doris.nereids.trees.expressions.Expression; +import org.apache.doris.nereids.trees.expressions.Slot; +import org.apache.doris.nereids.trees.expressions.VirtualSlotReference; +import org.apache.doris.nereids.trees.expressions.functions.scalar.GroupingScalarFunction; +import org.apache.doris.nereids.trees.expressions.visitor.DefaultExpressionRewriter; +import org.apache.doris.nereids.trees.plans.Plan; +import org.apache.doris.nereids.trees.plans.algebra.Repeat; +import org.apache.doris.nereids.util.ExpressionUtils; + +import java.util.ArrayList; +import java.util.BitSet; +import java.util.List; +import java.util.Map; +import java.util.Optional; + +/** + * Expression replacer which support to handle VirtualSlotReference, if replaceMap contained, just use it Review Comment: The class-level documentation comment should be more descriptive. It should explain: 1. What problem this class solves (handling VirtualSlotReference in expression replacement for MV rewrite) 2. The context in which it's used (materialized view query rewriting with group sets/CTEs) 3. The key behaviors, especially the `valid` flag mechanism Current comment: "Expression replacer which support to handle VirtualSlotReference, if replaceMap contained, just use it" Suggested improvement: ```java /** * Expression replacer for materialized view query rewriting that handles VirtualSlotReference. * This replacer is used when rewriting queries with GROUPING SETS and CTEs against materialized views. * * <p>For VirtualSlotReference (used in group sets with grouping functions), this replacer: * - Uses the replaceMap if the VirtualSlotReference is contained * - Otherwise, reconstructs the virtual slot by shuttling the origin GroupingScalarFunction * through the query plan lineage and applying the replacement recursively * * <p>The replacer maintains a `valid` flag that becomes false if any expression cannot be * successfully replaced, allowing callers to detect incomplete rewrites. */ ``` ```suggestion * Expression replacer for materialized view query rewriting that handles VirtualSlotReference. * This replacer is used when rewriting queries with GROUPING SETS and CTEs against materialized views. * * <p>For VirtualSlotReference (used in group sets with grouping functions), this replacer: * - Uses the replaceMap if the VirtualSlotReference is contained * - Otherwise, reconstructs the virtual slot by shuttling the origin GroupingScalarFunction * through the query plan lineage and applying the replacement recursively * * <p>The replacer maintains a {@code valid} flag that becomes false if any expression cannot be * successfully replaced, allowing callers to detect incomplete rewrites. ``` ########## fe/fe-core/src/main/java/org/apache/doris/nereids/rules/exploration/mv/AbstractMaterializedViewAggregateRule.java: ########## @@ -765,6 +767,8 @@ public Expression visitSlot(Slot slot, AggregateExpressionRewriteContext rewrite return Repeat.generateVirtualGroupingIdSlot(); } else { GroupingScalarFunction groupingScalarFunction = originExpression.get(); Review Comment: Missing explanation for why `shuttleExpressionWithLineage` is needed here. This code change adds lineage shuttling for the grouping scalar function before replacement, which is a key part of fixing the CTE + group sets issue. Consider adding a comment: ```java // Shuttle the grouping function through the query plan to resolve references // correctly when CTEs are involved, ensuring proper slot mapping groupingScalarFunction = (GroupingScalarFunction) ExpressionUtils.shuttleExpressionWithLineage( groupingScalarFunction, rewriteContext.getQueryTopPlan(), new BitSet()); ``` ```suggestion GroupingScalarFunction groupingScalarFunction = originExpression.get(); // Shuttle the grouping function through the query plan to resolve references // correctly when CTEs are involved, ensuring proper slot mapping. ``` ########## fe/fe-core/src/main/java/org/apache/doris/nereids/rules/exploration/mv/AbstractMaterializedViewAggregateRule.java: ########## @@ -244,7 +244,9 @@ protected LogicalAggregate<Plan> aggregateRewriteByView( if (queryAggregate.getSourceRepeat().isPresent()) { // construct group sets for repeat List<List<Expression>> rewrittenGroupSetsExpressions = new ArrayList<>(); Review Comment: This change modifies how grouping sets are retrieved for queries with CTEs. While the logic appears correct (preferring the actual LogicalRepeat node if found via collectFirst, then falling back to sourceRepeat), it lacks an explanatory comment about why this change is necessary. Consider adding a comment explaining the issue this fixes: ```java // When queries use both CTEs and group sets, the sourceRepeat may not reflect // the actual LogicalRepeat node in the plan tree. Collect the LogicalRepeat directly // from the plan to get the correct grouping sets, falling back to sourceRepeat if needed. List<List<Expression>> groupingSets = queryAggregate.collectFirst(LogicalRepeat.class::isInstance) .map(repeat -> ((LogicalRepeat<? extends Plan>) repeat).getGroupingSets()) .orElse(queryAggregate.getSourceRepeat().get().getGroupingSets()); ``` This would help future maintainers understand the fix for the CTE + group sets issue mentioned in the PR description. ```suggestion List<List<Expression>> rewrittenGroupSetsExpressions = new ArrayList<>(); // When queries use both CTEs and grouping sets, the sourceRepeat may not reflect // the actual LogicalRepeat node in the plan tree. Collect the LogicalRepeat directly // from the plan to get the correct grouping sets, falling back to sourceRepeat if needed. ``` ########## fe/fe-core/src/main/java/org/apache/doris/nereids/rules/exploration/mv/AbstractMaterializedViewRule.java: ########## @@ -423,7 +422,7 @@ protected List<Plan> doRewrite(StructInfo queryStructInfo, CascadesContext casca rewrittenPlanOutput, queryPlan.getOutput())); continue; } - // Merge project + // Merge project if contains mult same project Review Comment: The comment on line 425 has a grammatical issue. It says "Merge project if contains mult same project" which should be "Merge project if it contains multiple identical projects" or similar. Current: `// Merge project if contains mult same project` Suggested: `// Merge project if it contains multiple identical projects` ```suggestion // Merge project if it contains multiple identical projects ``` ########## fe/fe-core/src/main/java/org/apache/doris/nereids/rules/exploration/mv/AbstractMaterializedViewRule.java: ########## @@ -641,11 +640,14 @@ protected List<Expression> rewriteExpression(List<? extends Expression> sourceEx expressionShuttledToRewrite.collectToSet(expression -> expression instanceof SlotReference && ((SlotReference) expression).getDataType() instanceof VariantType); extendMappingByVariant(variants, targetToTargetReplacementMappingQueryBased); - Expression replacedExpression = ExpressionUtils.replace(expressionShuttledToRewrite, - targetToTargetReplacementMappingQueryBased); - Set<Expression> replacedExpressionSlotQueryUsed = replacedExpression.collect(slotsToRewrite::contains); - if (!replacedExpressionSlotQueryUsed.isEmpty()) { - // if contains any slot to rewrite, which means could not be rewritten by target, + + MaterializedViewExprReplacer materializedViewExprReplacer = new MaterializedViewExprReplacer( + targetToTargetReplacementMappingQueryBased, + sourcePlan, sourcePlanBitSet); + Expression replacedExpression = + expressionShuttledToRewrite.accept(materializedViewExprReplacer, null); + if (!materializedViewExprReplacer.isValid()) { + // if contains any slot to rewrite, which means can not be rewritten by target, Review Comment: The comment on line 650 has a grammatical issue. "can not" should be "cannot" (single word) for proper grammar. Current: `// if contains any slot to rewrite, which means can not be rewritten by target,` Suggested: `// if contains any slot to rewrite, which means cannot be rewritten by target,` ```suggestion // if contains any slot to rewrite, which means cannot be rewritten by target, ``` ########## fe/fe-core/src/main/java/org/apache/doris/mtmv/MTMVPlanUtil.java: ########## @@ -145,6 +145,7 @@ public static ConnectContext createBasicMvContext(@Nullable ConnectContext paren ctx.getSessionVariable().skipStorageEngineMerge = false; ctx.getSessionVariable().showHiddenColumns = false; ctx.getSessionVariable().allowModifyMaterializedViewData = true; Review Comment: The addition of `skipPrunePredicate = true` lacks explanation in the code. While this change is necessary for the fix (based on the PR description), it's not clear why skipping predicate pruning is needed for materialized views. Consider adding a comment explaining why this is necessary: ```java // Skip predicate pruning to preserve all predicates needed for materialized view rewriting, // especially when dealing with complex scenarios like CTEs and group sets ctx.getSessionVariable().skipPrunePredicate = true; ``` This would help future maintainers understand the purpose of this setting. ```suggestion ctx.getSessionVariable().allowModifyMaterializedViewData = true; // Skip predicate pruning to preserve all predicates needed for materialized view rewriting, // especially when dealing with complex scenarios like CTEs and group sets. ``` -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
