seawinde opened a new pull request, #63899:
URL: https://github.com/apache/doris/pull/63899
### What problem does this PR solve?
Issue Number: N/A
Related PR: N/A
Problem Summary:
Creating a complex partitioned async MTMV can spend excessive FE CPU in
partition lineage analysis. The hot path repeatedly shuttles partition
expressions and checked expressions through the full plan lineage replacer, so
for wide `UNION ALL`, join, and aggregate plans the same plan walks are
repeated many times during `CREATE MATERIALIZED VIEW` analysis.
Root cause: In
`PartitionIncrementMaintainer.PartitionIncrementChecker.checkPartition()`, each
partition candidate and each checked expression is passed to
`ExpressionUtils.shuttleExpressionWithLineage()` in a separate call. Each call
traverses the plan through `ExpressionLineageReplacer` and rebuilds equivalent
normalized expressions.
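
To illustrate why per-expression calls are costly, here is a minimal sketch that counts visited plan nodes for one-by-one versus batched shuttling. `PlanNode`, `LineageWalkSketch`, and both `shuttle*` methods are hypothetical stand-ins, not Doris classes; the real `ExpressionLineageReplacer` does far more than count nodes.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a plan tree; not a Doris class.
final class PlanNode {
    final String name;
    final List<PlanNode> children;

    PlanNode(String name, PlanNode... children) {
        this.name = name;
        this.children = List.of(children);
    }
}

final class LineageWalkSketch {
    // Total number of plan nodes visited across all walks so far.
    static int visitedNodes = 0;

    // Simulates one full traversal of the plan.
    static void walk(PlanNode node) {
        visitedNodes++;
        for (PlanNode child : node.children) {
            walk(child);
        }
    }

    // One full traversal per expression (the pattern described as the root cause).
    static List<String> shuttleOneByOne(PlanNode plan, List<String> exprs) {
        List<String> out = new ArrayList<>();
        for (String expr : exprs) {
            walk(plan);
            out.add("shuttled(" + expr + ")");
        }
        return out;
    }

    // A single traversal shared by the whole batch of expressions.
    static List<String> shuttleBatch(PlanNode plan, List<String> exprs) {
        walk(plan);
        List<String> out = new ArrayList<>();
        for (String expr : exprs) {
            out.add("shuttled(" + expr + ")");
        }
        return out;
    }
}
```

In this toy model the one-by-one cost is (number of expressions) x (plan size), while the batched cost is the plan size alone, and both variants return the same results.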
Change Summary:
| File | Change Description |
|------|-------------------|
| `PartitionIncrementMaintainer.java` | Batch lineage shuttle calls, cache lineage-visible named expressions by plan identity, cache normalized expressions, and reuse the normalization rewrite context during one partition increment check. |
| `PartitionColumnTraceTest.java` | Add a CTE plus `UNION ALL` plus wide aggregate lineage test to keep partition lineage behavior covered. |
| `test_mtmv_partition_lineage_performance.groovy` | Add a desensitized static SQL performance regression case for the complex partitioned MTMV shape. |
Design Rationale: The change keeps the existing `ExpressionLineageReplacer`
semantics and limits caching to a single `PartitionIncrementCheckContext`. This
avoids sharing mutable analysis state across optimizer contexts while removing
repeated full plan walks for the same plan and expression set.
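
As a rough sketch of the scoping idea, assuming a cache keyed by plan object identity that lives only as long as one check context: `CheckContextSketch` and `getOrCompute` are hypothetical names for illustration and do not reflect the actual `PartitionIncrementCheckContext` fields.

```java
import java.util.IdentityHashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical per-check context; the name and shape are illustrative only.
final class CheckContextSketch<P, R> {
    // Keyed by reference identity, so two structurally equal but distinct
    // plan objects get separate entries.
    private final Map<P, R> byPlanIdentity = new IdentityHashMap<>();
    private int computeCount = 0;

    // Returns the cached result for this exact plan object, computing it once.
    // A null result is not cached and would be recomputed on the next call.
    R getOrCompute(P plan, Function<P, R> compute) {
        R cached = byPlanIdentity.get(plan);
        if (cached == null) {
            cached = compute.apply(plan);
            computeCount++;
            byPlanIdentity.put(plan, cached);
        }
        return cached;
    }

    // Number of times the expensive computation actually ran.
    int computeCount() {
        return computeCount;
    }
}
```

Because the map is an instance field, discarding the context discards the cache, so nothing is shared between separate checks. Identity keys also avoid hashing or comparing large plan trees structurally.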
### Release note
Improve performance when creating complex partitioned async materialized
views.
### Check List (For Author)
- Test
    - [ ] Regression test
    - [x] Unit Test
    - [x] Manual test (add detailed scripts or steps below)
        - `./run-fe-ut.sh --run org.apache.doris.nereids.rules.exploration.mv.PartitionColumnTraceTest`
        - `git diff --check`
        - Tried `./run-regression-test.sh --run -d performance_p0 -s test_mtmv_partition_lineage_performance`, but the local Doris FE was not running on `127.0.0.1:9030`, so the regression could not execute SQL.
    - [ ] No need to test or manual test. Explain why:
        - [ ] This is a refactor/code format and no logic has been changed.
        - [ ] Previous test can cover this change.
        - [ ] No code files have been changed.
        - [ ] Other reason
- Behavior changed:
    - [x] No.
    - [ ] Yes.
- Does this need documentation?
    - [x] No.
    - [ ] Yes.
### Check List (For Reviewer who merge this PR)
- [ ] Confirm the release note
- [ ] Confirm test cases
- [ ] Confirm document
- [ ] Add branch pick label
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]