JeelRajodiya opened a new pull request, #22707:
URL: https://github.com/apache/datafusion/pull/22707

   **Adds Boolean support to `approx_distinct`**
   
   Currently `approx_distinct(bool_col)` errors with *"Support for 
'approx_distinct' for data type Boolean is not implemented"*. PR #21453 
introduced the `ApproxDistinctBitmapWrapper` pattern for small ints (closes 
#1109) and incidentally removed a `// TODO support for boolean (trivial case)` 
comment without wiring up Boolean. This change completes that intent.
   
   **Approach**
   
   Adds a `BooleanDistinctCountAccumulator` (a `[bool; 2]` flag pair) in 
`functions-aggregate-common`, plugged into the existing 
`ApproxDistinctBitmapWrapper` alongside the small-int bitmap accumulators. 
The result is **exact** (0, 1, or 2) because Boolean has at most two distinct 
values, which is strictly better than an HLL approximation.
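
   For readers unfamiliar with the flag-pair idea, here is a minimal standalone sketch of it. It is not the code in this PR: the type name `BoolDistinctSketch` and its methods are hypothetical, it does not implement DataFusion's `Accumulator` trait, and it assumes NULL inputs are skipped, as is usual for distinct counts.

```rust
/// Hypothetical stand-in for a Boolean distinct-count accumulator.
/// Illustrative only; this is not the type added by the PR.
#[derive(Debug, Default, Clone, Copy, PartialEq, Eq)]
pub struct BoolDistinctSketch {
    // seen[0] is set once `false` has been observed, seen[1] once `true` has.
    seen: [bool; 2],
}

impl BoolDistinctSketch {
    /// Fold a batch of nullable values into the state.
    /// Assumption: NULLs (`None`) do not count as a distinct value.
    pub fn update(&mut self, values: &[Option<bool>]) {
        for v in values.iter().flatten() {
            // `false` maps to index 0 and `true` to index 1.
            self.seen[usize::from(*v)] = true;
        }
    }

    /// Combine the state of another partial aggregation into this one.
    pub fn merge(&mut self, other: &Self) {
        self.seen[0] |= other.seen[0];
        self.seen[1] |= other.seen[1];
    }

    /// Exact number of distinct non-NULL values seen so far: 0, 1, or 2.
    pub fn distinct_count(&self) -> u64 {
        self.seen.iter().filter(|&&s| s).count() as u64
    }
}
```

   Because merging is a per-flag OR, partial states can be combined in any order and the count remains exact, so no probabilistic estimate is involved.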


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

