This is an automated email from the ASF dual-hosted git repository.

fjy pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-druid.git


The following commit(s) were added to refs/heads/master by this push:
     new 0f6cb1e  Update theta/hll sketch doc comparison (#7407)
0f6cb1e is described below

commit 0f6cb1e7e032081a569bbcf0c89220f0e6b53472
Author: Jonathan Wei <[email protected]>
AuthorDate: Wed Apr 3 15:21:33 2019 -0700

    Update theta/hll sketch doc comparison (#7407)
---
 docs/content/querying/aggregations.md | 17 +++++++++++++----
 1 file changed, 13 insertions(+), 4 deletions(-)

diff --git a/docs/content/querying/aggregations.md 
b/docs/content/querying/aggregations.md
index eef4c68..2e253fe 100644
--- a/docs/content/querying/aggregations.md
+++ b/docs/content/querying/aggregations.md
@@ -275,19 +275,28 @@ The [DataSketches Theta 
Sketch](../development/extensions-core/datasketches-thet
 
 #### DataSketches HLL Sketch
 
-The [DataSketches HLL 
Sketch](../development/extensions-core/datasketches-hll.html) 
extension-provided aggregator gives distinct count estimates using the 
HyperLogLog algorithm. The HLL Sketch is faster and requires less storage than 
the Theta Sketch, but does not support intersection or difference operations.
+The [DataSketches HLL 
Sketch](../development/extensions-core/datasketches-hll.html) 
extension-provided aggregator gives distinct count estimates using the 
HyperLogLog algorithm.
+
+Compared to the Theta sketch, the HLL sketch does not support set operations 
and has slightly slower update and merge speed, but requires significantly less 
space.
 
 #### Cardinality/HyperUnique (Deprecated)
 
 <div class="note caution">
-The Cardinality and HyperUnique aggregators are deprecated. Please use <a 
href="../development/extensions-core/datasketches-hll.html">DataSketches HLL 
Sketch</a> instead.
+The Cardinality and HyperUnique aggregators are deprecated. Please use <a 
href="../development/extensions-core/datasketches-theta.html">DataSketches 
Theta Sketch</a> or <a 
href="../development/extensions-core/datasketches-hll.html">DataSketches HLL 
Sketch</a> instead.
 </div>
 
-The [Cardinality and HyperUnique](../querying/hll-old.html) aggregators are 
older aggregator implementations available by default in Druid that also 
provide distinct count estimates using the HyperLogLog algorithm. The newer 
[DataSketches HLL Sketch](../development/extensions-core/datasketches-hll.html) 
extension-provided aggregator has superior accuracy and performance and is 
recommended instead. 
+The [Cardinality and HyperUnique](../querying/hll-old.html) aggregators are 
older aggregator implementations available by default in Druid that also 
provide distinct count estimates using the HyperLogLog algorithm. The newer 
DataSketches Theta and HLL extension-provided aggregators described above have 
superior accuracy and performance and are recommended instead. 
 
 The DataSketches team has published a [comparison 
study](https://datasketches.github.io/docs/HLL/HllSketchVsDruidHyperLogLogCollector.html)
 between Druid's original HLL algorithm and the DataSketches HLL algorithm. 
Based on the demonstrated advantages of the DataSketches implementation, we 
have deprecated Druid's original HLL aggregator.
 
-Please note that DataSketches HLL aggregators and `hyperUnique` aggregators 
are not mutually compatible.
+Please note that `hyperUnique` aggregators are not mutually compatible with 
DataSketches HLL or Theta sketches.
+
+##### Multi-column handling
+
+Note that the DataSketches Theta and HLL aggregators currently support only 
single-column inputs. If you were previously using the Cardinality aggregator 
with multiple-column inputs, equivalent operations using Theta or HLL sketches 
are described below:
+
+* Multi-column `byValue` Cardinality can be replaced with a union of Theta 
sketches on the individual input columns.
+* Multi-column `byRow` Cardinality can be replaced with a Theta or HLL sketch 
on a single [virtual column](../querying/virtual-columns.html) that combines 
the individual input columns.
 
 ### Histograms and quantiles
 

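For readers of this commit, the difference between the two multi-column Cardinality modes in the new "Multi-column handling" text can be sketched with exact Python sets. This is only an illustration of the semantics, not Druid or DataSketches code: the sets stand in for sketches, the tuple key stands in for a virtual column that combines the inputs, and the rows, column names, and function names are all hypothetical.

```python
# Exact (non-sketch) illustration of the two multi-column Cardinality modes.
# Rows and column names are made up for this example.
rows = [
    {"first": "a", "second": "b"},
    {"first": "b", "second": "a"},
    {"first": "a", "second": "b"},  # duplicate of the first row
    {"first": "a", "second": "a"},
    {"first": "c", "second": "a"},
]
columns = ["first", "second"]


def by_value_cardinality(rows, columns):
    """Count distinct values across all columns (union of per-column sets)."""
    per_column = [{row[col] for row in rows} for col in columns]
    return len(set().union(*per_column))


def by_row_cardinality(rows, columns):
    """Count distinct value combinations (one combined key per row)."""
    combined = {tuple(row[col] for col in columns) for row in rows}
    return len(combined)


by_value = by_value_cardinality(rows, columns)
by_row = by_row_cardinality(rows, columns)
print(by_value, by_row)
```

In the `byValue` case each column gets its own set and the sets are unioned, which mirrors the union of per-column Theta sketches. In the `byRow` case a single set is built over the combined key, which mirrors a single sketch on one combined column.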
