Re: Unique Sketch aggregations and bias correction

2018-09-24 Thread Gian Merlino
I have not. The original HLL paper does discuss bias corrections for small cardinalities, and I am not sure whether those are implemented in Druid's HLL implementation.

On Mon, Sep 24, 2018 at 8:49 AM Charles Allen wrote:
> https://github.com/apache/incubator-druid/pull/5712
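
For reference, the small-cardinality correction described in the original HLL paper can be sketched roughly as below. This is only an illustration of the paper's small-range rule (fall back to linear counting when the raw estimate is at most 2.5 * m and some registers are still zero). It is not Druid's HyperLogLogCollector or the DataSketches code; the function names `alpha` and `hll_estimate` are hypothetical, and the paper's large-range correction is left out.

```python
import math


def alpha(m: int) -> float:
    """Bias-correction constant from the HLL paper (m = number of registers, m >= 16)."""
    if m == 16:
        return 0.673
    if m == 32:
        return 0.697
    if m == 64:
        return 0.709
    return 0.7213 / (1.0 + 1.079 / m)


def hll_estimate(registers: list) -> float:
    """Hypothetical helper: cardinality estimate with the small-range correction only."""
    m = len(registers)
    # Raw HLL estimate: normalized harmonic mean of 2^register.
    raw = alpha(m) * m * m / sum(2.0 ** -r for r in registers)
    zeros = registers.count(0)
    # Small-range correction: use linear counting while empty registers remain.
    if raw <= 2.5 * m and zeros > 0:
        return m * math.log(m / zeros)
    return raw
```

With 16 registers of which 8 are still zero, the raw estimate is well under 2.5 * m, so the function returns the linear-counting value 16 * ln(2) instead of the raw harmonic-mean estimate.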

Unique Sketch aggregations and bias correction

2018-09-24 Thread Charles Allen
https://github.com/apache/incubator-druid/pull/5712 adds some great functionality to the DataSketches hooks in Druid. One thing noted in https://datasketches.github.io/docs/HLL/HllSketchVsDruidHyperLogLogCollector.html is the severe bias that Druid's HLL implementation shows at ~5k uniques being fed