gianm commented on issue #6814: [Discuss] Replacing hyperUnique as 'default' distinct count sketch
URL: https://github.com/apache/incubator-druid/issues/6814#issuecomment-452859631

> Now to the issue of moving from Hyper unique to HllSketch: I am fairly sure this kind of question will recur again and again, every time a new approximate method outperforms an old one or offers different tradeoffs. This tells me that probably the best way to solve this is to add a built-in UDF for every different sketch algorithm with its respective parameters; this will give the user access to all the core supported sketches without compatibility issues.

I think we want to do that too (that is, provide Druid SQL functions so users can choose whether to do approx count distinct with druid-hll, datasketches-hll, datasketches-theta, what-have-you). I think it'd also be nice to have a generic `APPROX_COUNT_DISTINCT` function that uses the "correct" sketch aggregator based on what format of sketch you have actually stored in your segments. Something that makes life easier for users. And maybe give it a concept of the 'current best' one to use, and have it use that if you don't specify a specific one.
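
To make the selection rule for a generic `APPROX_COUNT_DISTINCT` concrete, here is a minimal sketch in Python. It is only an illustration of the idea in this comment, not Druid code: the function `choose_sketch`, the constant `CURRENT_BEST`, and the choice of datasketches-hll as the default are all assumptions made for the example. The sketch labels are the informal names used above, not actual aggregator type names.

```python
from typing import Optional

# Informal sketch labels taken from the discussion; hypothetical, not Druid identifiers.
KNOWN_SKETCHES = {"druid-hll", "datasketches-hll", "datasketches-theta"}

# Assumption for illustration only: which sketch counts as the 'current best'.
CURRENT_BEST = "datasketches-hll"


def choose_sketch(stored_format: Optional[str],
                  requested: Optional[str] = None) -> str:
    """Pick the sketch a generic approx-count-distinct would use.

    stored_format: sketch format already stored in the segments, or None
                   if the column holds raw (unsketched) values.
    requested:     sketch the user asked for explicitly, or None.
    """
    for name in (stored_format, requested):
        if name is not None and name not in KNOWN_SKETCHES:
            raise ValueError(f"unknown sketch type: {name}")

    if stored_format is not None:
        # A stored sketch can only be merged by its own aggregator.
        if requested is not None and requested != stored_format:
            raise ValueError(
                f"column stores {stored_format}, cannot use {requested}")
        return stored_format

    # Raw column: honor an explicit choice, else fall back to the default.
    return requested if requested is not None else CURRENT_BEST


print(choose_sketch("druid-hll"))                 # stored format wins
print(choose_sketch(None))                        # falls back to CURRENT_BEST
print(choose_sketch(None, "datasketches-theta"))  # explicit request on raw data
```

Under this rule the stored format always takes priority, so existing segments keep working when the 'current best' default changes; only raw columns are affected by the default.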
