gianm commented on issue #6814: [Discuss] Replacing hyperUnique as 'default' 
distinct count sketch
URL: 
https://github.com/apache/incubator-druid/issues/6814#issuecomment-452859631
 
 
   > Now to the issue of moving from hyperUnique to HllSketch: I am fairly 
sure this kind of question will recur every time a new approximate method 
outperforms an old one or offers different tradeoffs. This tells me that 
probably the best way to solve this is to add a built-in UDF for every 
sketch algorithm with its respective parameters. This will give the user 
access to all the core supported sketches without compatibility issues.
   
   I think we want to do that too (like, provide Druid SQL functions so users 
can choose whether to do approx count distinct with druid-hll, 
datasketches-hll, datasketches-theta, what-have-you). I think it'd also be nice 
to have a generic `APPROX_COUNT_DISTINCT` function that uses the "correct" 
sketch aggregator based on what format of sketch you have actually stored in 
your segments. Something that makes life easier for users. And maybe give it a 
concept of the 'current best' one to use, and have it use that if you don't 
specify a specific one.
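
   To make the idea concrete, here is a minimal sketch of the selection logic 
only, not of any actual Druid code. Everything in it is an assumption made for 
illustration: the function name `choose_sketch`, the labels (copied from the 
list above, not Druid's real aggregator type names), the placeholder choice of 
'current best', and the decision to reject a request that conflicts with the 
stored format.

```python
from typing import Optional

# Illustrative labels taken from the comment text; NOT Druid's real type names.
KNOWN_SKETCHES = ("druid-hll", "datasketches-hll", "datasketches-theta")

# Placeholder for the 'current best' default; which one it should be is the
# open question in this thread, so treat this value as an assumption.
CURRENT_BEST = "datasketches-hll"


def choose_sketch(stored_format: Optional[str] = None,
                  requested: Optional[str] = None) -> str:
    """Pick the sketch a generic APPROX_COUNT_DISTINCT call would use.

    stored_format: sketch format already stored in the segments, or None
                   when the column holds raw (non-sketch) values.
    requested:     sketch the user asked for explicitly, or None.
    """
    for name in (stored_format, requested):
        if name is not None and name not in KNOWN_SKETCHES:
            raise ValueError(f"unknown sketch: {name}")

    if stored_format is not None:
        # A stored sketch can only be merged by its own aggregator, so the
        # stored format wins. Rejecting a conflicting request is an assumed
        # policy, chosen here so the mismatch is visible to the user.
        if requested is not None and requested != stored_format:
            raise ValueError(
                f"column stores {stored_format}, cannot use {requested}")
        return stored_format

    # Raw column: honor an explicit choice, else fall back to the default.
    return requested if requested is not None else CURRENT_BEST
```

   With this shape, `choose_sketch(stored_format="druid-hll")` keeps using the 
stored format, while `choose_sketch()` on a raw column falls back to whatever 
the default is set to. Changing the default then never affects columns that 
already store a sketch.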

