Jackie-Jiang opened a new pull request #8189: URL: https://github.com/apache/pinot/pull/8189
## Description
Adds `DistinctCountSmartHLLAggregationFunction`, which automatically converts the `Set` to a `HyperLogLog` when the set grows too large, protecting the servers from running out of memory. The conversion applies only to aggregation-only queries, not to group-by queries.

By default, when the set size exceeds 100K, the set is converted to a HyperLogLog with a log2m of 12. The log2m and the threshold can be configured through the second argument (a literal) of the function:
- `hllLog2m`: log2m of the converted HyperLogLog (default 12)
- `hllConversionThreshold`: set size threshold that triggers the conversion; a non-positive value means never convert (default 100K)

Example query:
`SELECT DISTINCTCOUNTSMARTHLL(myCol, 'hllLog2m=8;hllConversionThreshold=10') FROM myTable`

## Release Notes
Adds `DistinctCountSmartHLLAggregationFunction`, which automatically stores distinct values in a Set or a HyperLogLog based on cardinality.
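
As a rough illustration of the second-argument format from the Description (`key=value` pairs separated by `;`), the following is a minimal sketch of how such a literal might be parsed and how the conversion check might look. The class `SmartHllParams` and its methods are hypothetical names invented for this example and are not taken from the PR. Case-sensitive keys and the rejection of unknown keys are also assumptions.

```java
// Hypothetical helper for illustration only; not the class used in the PR.
final class SmartHllParams {
    // Defaults as stated in the PR description.
    static final int DEFAULT_LOG2M = 12;
    static final int DEFAULT_CONVERSION_THRESHOLD = 100_000;

    final int log2m;
    final int conversionThreshold;

    SmartHllParams(int log2m, int conversionThreshold) {
        this.log2m = log2m;
        this.conversionThreshold = conversionThreshold;
    }

    // Parses a literal such as "hllLog2m=8;hllConversionThreshold=10".
    // Missing keys fall back to the defaults; unknown keys are rejected (an assumption).
    static SmartHllParams parse(String literal) {
        int log2m = DEFAULT_LOG2M;
        int threshold = DEFAULT_CONVERSION_THRESHOLD;
        if (literal != null) {
            for (String pair : literal.split(";")) {
                String trimmed = pair.trim();
                if (trimmed.isEmpty()) {
                    continue; // tolerate empty segments, e.g. a trailing ';'
                }
                String[] keyValue = trimmed.split("=", 2);
                if (keyValue.length != 2) {
                    throw new IllegalArgumentException("Invalid parameter: " + trimmed);
                }
                String key = keyValue[0].trim();
                String value = keyValue[1].trim();
                switch (key) {
                    case "hllLog2m":
                        log2m = Integer.parseInt(value);
                        break;
                    case "hllConversionThreshold":
                        threshold = Integer.parseInt(value);
                        break;
                    default:
                        throw new IllegalArgumentException("Unknown parameter: " + key);
                }
            }
        }
        return new SmartHllParams(log2m, threshold);
    }

    // True when the set has outgrown the threshold; a non-positive threshold never converts.
    boolean shouldConvert(int setSize) {
        return conversionThreshold > 0 && setSize > conversionThreshold;
    }
}
```

With the example query's literal, `SmartHllParams.parse("hllLog2m=8;hllConversionThreshold=10")` yields a log2m of 8 and a threshold of 10, so a set of 11 values would trigger the conversion while a set of 10 would not.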
