feiniaofeiafei opened a new pull request, #64618:
URL: https://github.com/apache/doris/pull/64618
### What problem does this PR solve?
Issue Number: N/A
Related PR: N/A
Problem Summary:
`InferSetOperatorDistinct` previously relied on a cost-based rewrite job to
decide whether to add local distinct aggregates under set operations. Running
that cost-based job adds extra optimization overhead.
This PR changes `InferSetOperatorDistinct` to run as a normal top-down
rewrite rule. The rule itself now decides whether to generate local distinct
aggregates, based on each child's NDV/cardinality statistics. The NDV heuristic follows the
existing logic in `EagerAggRewriter#checkStats`, but is kept local to
`InferSetOperatorDistinct` because set-operation local distinct inference and
eager aggregation pushdown have different optimization boundaries.
Each child of a DISTINCT set operation is judged independently, so one child
can get a local distinct aggregate while another child can remain unchanged
when its NDV indicates poor deduplication benefit.
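
The per-child decision can be illustrated with a minimal, self-contained sketch. This is not the code in this PR: `ChildStats`, `shouldAddLocalDistinct`, `rewriteChildren`, and the `MAX_NDV_RATIO` threshold are hypothetical names and values chosen for illustration, and the actual conditions in `EagerAggRewriter#checkStats` are not reproduced here. The sketch also assumes that a child with unknown statistics is left unchanged.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class LocalDistinctSketch {
    // Hypothetical threshold: add a local distinct only when the estimated
    // NDV is at most this fraction of the child's row count.
    static final double MAX_NDV_RATIO = 0.5;

    // Hypothetical per-child statistics: estimated row count and estimated
    // number of distinct values over the child's output columns.
    static final class ChildStats {
        final String name;
        final double rowCount;
        final double ndv;

        ChildStats(String name, double rowCount, double ndv) {
            this.name = name;
            this.rowCount = rowCount;
            this.ndv = ndv;
        }
    }

    // Decide for one child whether deduplicating it locally is likely to pay off.
    static boolean shouldAddLocalDistinct(ChildStats stats) {
        if (stats.rowCount <= 0 || stats.ndv <= 0) {
            // Assumption: missing or unusable statistics leave the child unchanged.
            return false;
        }
        return stats.ndv / stats.rowCount <= MAX_NDV_RATIO;
    }

    // Judge each child independently; children are rendered as strings
    // only to keep the sketch small.
    static List<String> rewriteChildren(List<ChildStats> children) {
        List<String> rewritten = new ArrayList<>();
        for (ChildStats child : children) {
            rewritten.add(shouldAddLocalDistinct(child)
                    ? "Distinct(" + child.name + ")"
                    : child.name);
        }
        return rewritten;
    }

    public static void main(String[] args) {
        List<ChildStats> children = Arrays.asList(
                new ChildStats("t1", 1000, 10),    // many duplicates
                new ChildStats("t2", 1000, 990));  // almost all rows distinct
        System.out.println(rewriteChildren(children));
    }
}
```

In this sketch, `t1` gets a local distinct because few of its rows are distinct, while `t2` is left as-is because deduplicating it would remove almost nothing.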
### Release note
None
### Check List (For Author)
- Test: Unit Test
- `./run-fe-ut.sh --run org.apache.doris.nereids.rules.rewrite.InferSetOperatorDistinctTest`
- Behavior changed: Yes. Optimizer rewrite behavior changes for DISTINCT set
operations; SQL semantics are unchanged.
- Does this need documentation: No