feiniaofeiafei opened a new pull request, #64618:
URL: https://github.com/apache/doris/pull/64618

   ### What problem does this PR solve?
   
   Issue Number: N/A
   
   Related PR: N/A
   
   Problem Summary:
   
   `InferSetOperatorDistinct` previously relied on a cost-based rewrite job to 
decide whether to add local distinct aggregates under set operations. That 
cost-based job added extra optimizer cost.
   
   This PR changes `InferSetOperatorDistinct` to run as a normal top-down 
rewrite rule. The rule now decides whether to generate local distinct 
aggregates from each child's NDV/cardinality statistics. The NDV heuristic 
follows the existing logic in `EagerAggRewriter#checkStats`, but is kept local 
to `InferSetOperatorDistinct` because set-operation local distinct inference 
and eager aggregation pushdown have different optimization boundaries.
   
   Each child of a DISTINCT set operation is judged independently, so one child 
can get a local distinct aggregate while another is left unchanged because its 
NDV indicates poor deduplication benefit.
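
   The per-child decision described above can be sketched roughly as follows. 
This is an illustration only, not the code in this PR: the class 
`LocalDistinctSketch`, the `ChildStats` holder, the `MAX_NDV_RATIO` threshold 
of 0.5, and the choice to skip children with unknown statistics are all 
assumptions made for the example. The actual conditions in 
`EagerAggRewriter#checkStats` may differ.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of a per-child NDV check; these are not Doris classes.
final class LocalDistinctSketch {
    // Assumed threshold: add a local distinct only if NDV is at most half
    // of the child's row count. The real heuristic may use other values.
    static final double MAX_NDV_RATIO = 0.5;

    /** Simplified stand-in for the statistics of one set-operation child. */
    static final class ChildStats {
        final double rowCount; // estimated cardinality of the child
        final double ndv;      // estimated number of distinct output rows

        ChildStats(double rowCount, double ndv) {
            this.rowCount = rowCount;
            this.ndv = ndv;
        }
    }

    /** Returns true when a local distinct aggregate looks worthwhile. */
    static boolean shouldAddLocalDistinct(ChildStats stats) {
        // Assumption: with missing or non-positive stats, leave the child as is.
        if (stats.rowCount <= 0 || stats.ndv <= 0) {
            return false;
        }
        // A low NDV/rowCount ratio means many duplicates, so deduplicating
        // early removes many rows before the set operation.
        return stats.ndv / stats.rowCount <= MAX_NDV_RATIO;
    }

    /** Judges every child independently, mirroring the per-child behavior. */
    static List<Boolean> decidePerChild(List<ChildStats> children) {
        List<Boolean> decisions = new ArrayList<>();
        for (ChildStats child : children) {
            decisions.add(shouldAddLocalDistinct(child));
        }
        return decisions;
    }
}
```

   Under these assumptions, a child with 1000 rows and an NDV of 10 would get 
a local distinct aggregate, while a sibling with 1000 rows and an NDV of 990 
would be left unchanged.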
   
   ### Release note
   
   None
   
   ### Check List (For Author)
   
   - Test: Unit Test
       - `./run-fe-ut.sh --run org.apache.doris.nereids.rules.rewrite.InferSetOperatorDistinctTest`
   - Behavior changed: Yes. Optimizer rewrite behavior changes for DISTINCT set 
operations; SQL semantics are unchanged.
   - Does this need documentation: No


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

