Jackie-Jiang opened a new pull request #8189:
URL: https://github.com/apache/pinot/pull/8189


   ## Description
   Adds `DistinctCountSmartHLLAggregationFunction`, which automatically 
converts the `Set` to a `HyperLogLog` when the set grows too large, protecting 
the servers from running out of memory. The conversion applies only to 
aggregation-only queries, not to group-by queries.
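
To illustrate the conversion idea, here is a minimal, self-contained sketch. It is not the code in this PR: the class `SmartDistinctCounter`, its methods, the hand-rolled HyperLogLog registers and the SplitMix64-style hash are all assumptions made for illustration, and it only handles `long` values with log2m of at least 7.

```java
import java.util.HashSet;
import java.util.Set;

// Illustrative sketch only; hypothetical class, not Pinot's implementation.
final class SmartDistinctCounter {
  private final int log2m;
  private final int threshold;   // non-positive means never convert
  private Set<Long> exact = new HashSet<>();
  private byte[] registers;      // non-null once converted to HyperLogLog

  SmartDistinctCounter(int log2m, int threshold) {
    if (log2m < 7 || log2m > 16) {
      // Assumption: the alpha constant used below is only valid for m >= 128.
      throw new IllegalArgumentException("log2m must be in [7, 16]");
    }
    this.log2m = log2m;
    this.threshold = threshold;
  }

  void add(long value) {
    if (registers != null) {
      offer(value);
      return;
    }
    exact.add(value);
    if (threshold > 0 && exact.size() > threshold) {
      // Set exceeded the threshold: replay its values into HLL registers.
      registers = new byte[1 << log2m];
      for (long v : exact) {
        offer(v);
      }
      exact = null; // release the memory held by the exact set
    }
  }

  boolean isConverted() {
    return registers != null;
  }

  long estimate() {
    if (registers == null) {
      return exact.size(); // still exact
    }
    int m = registers.length;
    double sum = 0;
    int zeros = 0;
    for (byte r : registers) {
      sum += Math.pow(2, -r);
      if (r == 0) {
        zeros++;
      }
    }
    double alpha = 0.7213 / (1 + 1.079 / m);
    double raw = alpha * m * m / sum;
    if (raw <= 2.5 * m && zeros > 0) {
      raw = m * Math.log((double) m / zeros); // small-range (linear counting) correction
    }
    return Math.round(raw);
  }

  private void offer(long value) {
    long h = mix64(value);
    int idx = (int) (h >>> (64 - log2m)); // top log2m bits pick the register
    long rest = h << log2m;               // remaining bits, left-aligned
    int rank = Math.min(Long.numberOfLeadingZeros(rest), 64 - log2m) + 1;
    if (rank > registers[idx]) {
      registers[idx] = (byte) rank;
    }
  }

  // SplitMix64-style finalizer, used here as a stand-in hash function.
  private static long mix64(long z) {
    z += 0x9E3779B97F4A7C15L;
    z = (z ^ (z >>> 30)) * 0xBF58476D1CE4E5B9L;
    z = (z ^ (z >>> 27)) * 0x94D049BB133111EBL;
    return z ^ (z >>> 31);
  }
}
```

The trade-off this shows: below the threshold the count is exact, and above it memory is bounded by the register array (2^log2m bytes in this sketch) at the cost of an approximate result.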
   
   By default, when the set size exceeds 100K, it will be converted to a 
HyperLogLog with log2m of 12.
   The log2m and threshold can be configured using the second argument 
(literal) of the function:
   - `hllLog2m`: log2m of the converted HyperLogLog (default 12)
   - `hllConversionThreshold`: set size threshold to trigger the conversion, 
non-positive means never convert (default 100K)
   
   Example query:
   `SELECT DISTINCTCOUNTSMARTHLL(myCol, 'hllLog2m=8;hllConversionThreshold=10') 
FROM myTable`
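
As a rough sketch of how a literal such as the one in the example query could be interpreted, the snippet below parses `key=value` pairs separated by `;` and falls back to the defaults stated above. The class `SmartHllParams` and its `parse` method are hypothetical names for illustration, and details such as case-sensitive keys and rejecting unknown keys are assumptions, not necessarily what the PR does.

```java
// Hypothetical holder for the two documented parameters; not Pinot's actual class.
final class SmartHllParams {
  static final int DEFAULT_LOG2M = 12;
  static final int DEFAULT_THRESHOLD = 100_000;

  final int log2m;
  final int threshold;

  private SmartHllParams(int log2m, int threshold) {
    this.log2m = log2m;
    this.threshold = threshold;
  }

  // Parses e.g. "hllLog2m=8;hllConversionThreshold=10"; missing keys keep their defaults.
  static SmartHllParams parse(String literal) {
    int log2m = DEFAULT_LOG2M;
    int threshold = DEFAULT_THRESHOLD;
    if (literal != null && !literal.trim().isEmpty()) {
      for (String pair : literal.split(";")) {
        String[] kv = pair.split("=", 2);
        if (kv.length != 2) {
          throw new IllegalArgumentException("Invalid parameter: " + pair);
        }
        String key = kv[0].trim();
        int value = Integer.parseInt(kv[1].trim());
        switch (key) {
          case "hllLog2m":
            log2m = value;
            break;
          case "hllConversionThreshold":
            threshold = value; // non-positive means never convert
            break;
          default:
            // Assumption: unknown keys are rejected rather than ignored.
            throw new IllegalArgumentException("Unknown parameter: " + key);
        }
      }
    }
    return new SmartHllParams(log2m, threshold);
  }
}
```

With this sketch, `SmartHllParams.parse("hllLog2m=8;hllConversionThreshold=10")` yields log2m 8 and threshold 10, while an empty literal yields the defaults of 12 and 100K.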
   
   ## Release Notes
   Adds `DistinctCountSmartHLLAggregationFunction`, which automatically stores 
distinct values in a `Set` or a `HyperLogLog`, depending on cardinality.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


