ydgandhi commented on PR #21088:
URL: https://github.com/apache/datafusion/pull/21088#issuecomment-4103479506

   Thanks for the review. I asked Cursor to add a few tests.
   
   ---
   
   ## Tests for `MultiDistinctCountRewrite` (what they cover)
   
   Optimizer unit tests: `datafusion/optimizer/src/multi_distinct_count_rewrite.rs`
   
   | Test | What it asserts |
   |------|-----------------|
   | `rewrites_two_count_distinct` | `GROUP BY a` + `COUNT(DISTINCT b)`, `COUNT(DISTINCT c)` → inner joins, per-branch null filters on `b`/`c`, `mdc_base` + two `mdc_d` aliases. |
   | `rewrites_global_three_count_distinct` | No `GROUP BY`, three `COUNT(DISTINCT …)` → cross/inner join rewrite; **no** `mdc_base` (global-only path). |
   | `rewrites_two_count_distinct_with_non_distinct_count` | Grouped BI-style: two distincts + `COUNT(a)` → join rewrite with **`mdc_base`** holding the non-distinct agg. |
   | `does_not_rewrite_two_count_distinct_same_column` | Two `COUNT(DISTINCT b)` with different aliases → **no** rewrite (duplicate distinct key). |
   | `does_not_rewrite_single_count_distinct` | Only one `COUNT(DISTINCT …)` → **no** rewrite (rule needs ≥2 distincts). |
   | `rewrites_three_count_distinct_grouped` | Three grouped `COUNT(DISTINCT …)` on `b`, `c`, `a` → **two** inner joins + `mdc_base`. |
   | `rewrites_interleaved_non_distinct_between_distincts` | Order `COUNT(DISTINCT b)`, `COUNT(a)`, `COUNT(DISTINCT c)` → rewrite + `mdc_base` for the middle non-distinct agg (projection order / interleaving). |
   | `rewrites_count_distinct_on_cast_exprs` | `COUNT(DISTINCT CAST(b AS Int64))`, same for `c` → rewrite + null filters on the **cast** expressions. |
   | `does_not_rewrite_grouping_sets_multi_distinct` | `GROUPING SETS` aggregate with two `COUNT(DISTINCT …)` → **no** rewrite (rule bails on grouping sets). |
   | `does_not_rewrite_mixed_agg` | `COUNT(DISTINCT b)` + `COUNT(c)` → **no** rewrite (only **one** `COUNT(DISTINCT …)`; rule requires at least two). |
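
For readers unfamiliar with the shape of this rewrite, here is a minimal stdlib-only Rust sketch of the idea the tests above exercise: each `COUNT(DISTINCT x)` becomes its own branch (filter NULLs, deduplicate on the group key plus `x`, count per group), and the branches are inner-joined on the group key. It is not the optimizer rule and does not use DataFusion's API; `Row`, `direct`, `branch`, and `rewritten` are hypothetical names made up for illustration.

```rust
use std::collections::{BTreeMap, BTreeSet};

// Hypothetical row shape for illustration: (group key a, nullable b, nullable c).
type Row = (i32, Option<i32>, Option<i32>);

/// Direct evaluation: per group, collect the distinct non-NULL values of b and c.
fn direct(rows: &[Row]) -> BTreeMap<i32, (usize, usize)> {
    let mut sets: BTreeMap<i32, (BTreeSet<i32>, BTreeSet<i32>)> = BTreeMap::new();
    for &(a, b, c) in rows {
        let entry = sets.entry(a).or_default();
        if let Some(b) = b {
            entry.0.insert(b);
        }
        if let Some(c) = c {
            entry.1.insert(c);
        }
    }
    sets.into_iter()
        .map(|(a, (bs, cs))| (a, (bs.len(), cs.len())))
        .collect()
}

/// One branch: drop NULLs, deduplicate (a, x) pairs, then count pairs per group.
fn branch(rows: &[Row], pick: fn(&Row) -> Option<i32>) -> BTreeMap<i32, usize> {
    let pairs: BTreeSet<(i32, i32)> = rows
        .iter()
        .filter_map(|r| pick(r).map(|x| (r.0, x)))
        .collect();
    let mut counts = BTreeMap::new();
    for (a, _) in pairs {
        *counts.entry(a).or_insert(0usize) += 1;
    }
    counts
}

/// Join-based evaluation: inner-join the two branch outputs on the group key.
fn rewritten(rows: &[Row]) -> BTreeMap<i32, (usize, usize)> {
    let left = branch(rows, |r: &Row| r.1);
    let right = branch(rows, |r: &Row| r.2);
    left.into_iter()
        .filter_map(|(a, nb)| right.get(&a).map(|&nc| (a, (nb, nc))))
        .collect()
}

fn main() {
    let rows: Vec<Row> = vec![
        (1, Some(10), Some(7)),
        (1, Some(10), None),
        (1, Some(11), Some(7)),
        (2, None, Some(8)),
        (2, Some(12), Some(9)),
    ];
    // Both strategies agree on this data set.
    assert_eq!(direct(&rows), rewritten(&rows));
    println!("{:?}", rewritten(&rows)); // prints {1: (2, 1), 2: (1, 2)}
}
```

In this simplified version, a group whose values are all NULL in one column would drop out of the inner join, so the sample data keeps at least one non-NULL value per group and column. The sketch makes no claim about how the actual rule treats that case.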
   
   SQL integration: `datafusion/core/tests/sql/aggregates/multi_distinct_count_rewrite.rs`
   
   | Test | What it asserts |
   |------|-----------------|
   | `multi_count_distinct_matches_expected_with_nulls` | End-to-end grouped two `COUNT(DISTINCT …)` with **NULLs** in distinct columns; exact sorted batch string vs expected counts. |
   | `multi_count_distinct_with_count_star_matches_expected` | `COUNT(*)` plus two `COUNT(DISTINCT …)` per group (BI-style); exact result table. |
   | `multi_count_distinct_two_group_keys_matches_expected` | **`GROUP BY g1, g2`** + two distincts; verifies joins line up on **all** group keys and numerics match. |
   
   ---


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
