seawinde opened a new pull request, #63899:
URL: https://github.com/apache/doris/pull/63899

   ### What problem does this PR solve?
   
   Issue Number: N/A
   
   Related PR: N/A
   
   Problem Summary:
   Complex partitioned async MTMV creation can spend excessive FE CPU in partition lineage analysis. The hot path repeatedly shuttles partition and checked expressions through the full plan lineage replacer, so wide `UNION ALL`, join, and aggregate plans multiply the same plan walks during `CREATE MATERIALIZED VIEW` analysis.
   
   Root cause: In `PartitionIncrementMaintainer.PartitionIncrementChecker.checkPartition()`, `ExpressionUtils.shuttleExpressionWithLineage()` is called separately for each partition candidate and each checked expression. Every call traverses the plan through `ExpressionLineageReplacer` and rebuilds equivalent normalized expressions.
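
To make the cost difference concrete, here is a minimal sketch that only counts plan-node visits. It is not the Doris code: `PlanNode`, `shuttleOneByOne`, and `shuttleBatched` are hypothetical stand-ins, and the expressions are plain strings. It assumes that one lineage shuttle call amounts to one full walk of the plan.

```java
import java.util.ArrayList;
import java.util.List;

final class LineageWalkCountSketch {
    /** Hypothetical stand-in for a plan tree node; real Doris plans are far richer. */
    static final class PlanNode {
        final List<PlanNode> children = new ArrayList<>();

        PlanNode(PlanNode... kids) {
            children.addAll(List.of(kids));
        }
    }

    /** Number of nodes visited since the counter was last reset. */
    static int nodesVisited = 0;

    /** Visits every node once, mimicking one full lineage traversal. */
    static void walk(PlanNode node) {
        nodesVisited++;
        for (PlanNode child : node.children) {
            walk(child);
        }
    }

    /** One traversal per expression: visits = expressions * plan size. */
    static int shuttleOneByOne(PlanNode root, List<String> expressions) {
        nodesVisited = 0;
        for (String ignored : expressions) {
            walk(root);
        }
        return nodesVisited;
    }

    /** One traversal for the whole batch: visits = plan size. */
    static int shuttleBatched(PlanNode root, List<String> expressions) {
        nodesVisited = 0;
        if (!expressions.isEmpty()) {
            walk(root);
        }
        return nodesVisited;
    }

    public static void main(String[] args) {
        // A tiny UNION ALL-like shape: one root, two branches, one leaf each (5 nodes).
        PlanNode root = new PlanNode(
                new PlanNode(new PlanNode()),
                new PlanNode(new PlanNode()));
        List<String> exprs = List.of("dt", "date_trunc(dt, 'day')", "region");

        System.out.println(shuttleOneByOne(root, exprs)); // prints 15
        System.out.println(shuttleBatched(root, exprs));  // prints 5
    }
}
```

In this toy model the per-expression path scales with the product of expression count and plan size, while the batched path scales with plan size alone, which is the effect the batching change aims for.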
   
   Change Summary:
   
   | File | Change Description |
   |------|-------------------|
   | `PartitionIncrementMaintainer.java` | Batch lineage shuttle calls, cache lineage-visible named expressions by plan identity, cache normalized expressions, and reuse the normalization rewrite context during one partition increment check. |
   | `PartitionColumnTraceTest.java` | Add a CTE plus `UNION ALL` plus wide aggregate lineage test to keep partition lineage behavior covered. |
   | `test_mtmv_partition_lineage_performance.groovy` | Add a desensitized static SQL performance regression case for the complex partitioned MTMV shape. |
   
   Design Rationale: The change keeps the existing `ExpressionLineageReplacer` semantics and limits caching to a single `PartitionIncrementCheckContext`. This avoids sharing mutable analysis state across optimizer contexts while removing repeated full plan walks for the same plan and expression set.
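
The scoping idea can be sketched as an identity-keyed cache owned by a per-check context object. This is an illustration under assumptions, not the PR's implementation: `Plan`, `CheckContext`, `namedExpressions`, and the `collector` callback are hypothetical names, and `java.util.IdentityHashMap` is used here simply as one standard way to key by object identity.

```java
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

final class CheckContextCacheSketch {
    /** Hypothetical plan type; equals() is deliberately not overridden. */
    static final class Plan {
        final String name;

        Plan(String name) {
            this.name = name;
        }
    }

    /** Hypothetical per-check context that owns the cache and dies with the check. */
    static final class CheckContext {
        // Keys are compared with ==, so only the very same plan object hits the cache.
        private final Map<Plan, List<String>> namedExpressionsByPlan = new IdentityHashMap<>();
        int planWalks = 0;

        /** Runs the collector at most once per plan instance within this context. */
        List<String> namedExpressions(Plan plan, Function<Plan, List<String>> collector) {
            return namedExpressionsByPlan.computeIfAbsent(plan, p -> {
                planWalks++;
                return collector.apply(p);
            });
        }
    }

    public static void main(String[] args) {
        Function<Plan, List<String>> collector = p -> List.of(p.name + ".dt");
        Plan scan = new Plan("scan");

        CheckContext first = new CheckContext();
        first.namedExpressions(scan, collector);
        first.namedExpressions(scan, collector);             // cache hit, no second walk
        first.namedExpressions(new Plan("scan"), collector); // different instance, new walk
        System.out.println(first.planWalks);                 // prints 2

        CheckContext second = new CheckContext();            // a fresh check starts with an empty cache
        second.namedExpressions(scan, collector);
        System.out.println(second.planWalks);                // prints 1
    }
}
```

Because the map is a field of the context rather than a static or optimizer-wide structure, nothing cached during one check is visible to another.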
   
   ### Release note
   
   Improve performance when creating complex partitioned async materialized views.
   
   ### Check List (For Author)
   
   - Test
       - [ ] Regression test
       - [x] Unit Test
       - [x] Manual test (add detailed scripts or steps below)
           - `./run-fe-ut.sh --run org.apache.doris.nereids.rules.exploration.mv.PartitionColumnTraceTest`
           - `git diff --check`
           - Tried `./run-regression-test.sh --run -d performance_p0 -s test_mtmv_partition_lineage_performance`, but the local Doris FE was not running on `127.0.0.1:9030`, so the regression could not execute SQL.
       - [ ] No need to test or manual test. Explain why:
           - [ ] This is a refactor/code format and no logic has been changed.
           - [ ] Previous test can cover this change.
           - [ ] No code files have been changed.
           - [ ] Other reason
   
   - Behavior changed:
       - [x] No.
       - [ ] Yes.
   
   - Does this need documentation?
       - [x] No.
       - [ ] Yes.
   
   ### Check List (For Reviewer who merge this PR)
   
   - [ ] Confirm the release note
   - [ ] Confirm test cases
   - [ ] Confirm document
   - [ ] Add branch pick label
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

