leerho commented on code in PR #58:
URL: https://github.com/apache/datasketches-python/pull/58#discussion_r1917473640


##########
docs/source/quantiles/index.rst:
##########
@@ -10,17 +10,21 @@ in the stream.
 These sketches may be used to compute approximate histograms, Probability Mass Functions (PMFs), or
 Cumulative Distribution Functions (CDFs).
 
-The library provides three types of quantiles sketches, each of which has generic items as well as versions
-specific to a given numeric type (e.g. integer or floating point values). All three types provide error
-bounds on rank estimation with proven probabilistic error distributions.
+The library provides four types of quantiles sketches, three of which have generic items as well as versions
+specific to a given numeric type (e.g. integer or floating point values). Those three types provide error
+bounds on rank estimation with proven probabilistic error distributions. t-digest is a heuristic-based sketch
+that works only on numeric data, and while the error properties are not guaranteed, the sketch typically
+does a good job with small storage.
 
-  * KLL: Provides uniform rank estimation error over the entire range
+  * KLL: Provides uniform rank estimation error over the entire range.
   * REQ: Provides relative rank error estimates, which decreases approaching either the high or low end values.
+  * t-digest: Relative rank error estimates, heuristic-based without guarantees but quite compact with generally very good error properties.

Review Comment:
   Suggest appending "with large enough data." to the end of the t-digest bullet.
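
The distinction the KLL and REQ bullets in the hunk above draw between uniform and relative rank error can be sketched in plain Python. This is only an illustration, not the library's API: the function names and the `eps` value are hypothetical, and the relative bound assumes the common form in which the allowed error scales with the distance to the accurate end of the rank range.

```python
def uniform_rank_bound(rank: float, eps: float) -> float:
    """Allowed absolute rank error when the bound is the same at every rank
    (the KLL-style behaviour described above; illustrative only)."""
    return eps


def relative_rank_bound(rank: float, eps: float,
                        high_rank_accuracy: bool = True) -> float:
    """Allowed absolute rank error when the bound shrinks toward one end of
    the range (the REQ-style behaviour described above; illustrative only)."""
    # Distance from the normalized rank to the end chosen to be most accurate.
    distance = (1.0 - rank) if high_rank_accuracy else rank
    return eps * distance


eps = 0.01  # hypothetical error parameter, not a library default
for r in (0.5, 0.9, 0.99, 0.999):
    # The uniform bound stays at eps; the relative bound tightens as r -> 1.
    print(r, uniform_rank_bound(r, eps), relative_rank_bound(r, eps))
```

Under these assumptions the uniform bound is constant across the range, while the relative bound becomes tighter for ranks close to the chosen end, which is why a relative-error sketch is attractive for tail quantiles.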



##########
docs/source/quantiles/tdigest.rst:
##########
@@ -0,0 +1,50 @@
+t-digest
+--------
+
+.. currentmodule:: datasketches
+
+The implementation in this library is based on the MergingDigest described in
+`Computing Extremely Accurate Quantiles Using t-Digests <https://arxiv.org/abs/1902.04023>`_ by Ted Dunning and Otmar Ertl.
+
+The implementation in this library has a few differences from the reference implementation associated with that paper:
+
+* Merge does not modify the input
+* Serialization similar to other sketches in this library, although reading the reference implementation format is supported
+
+Unlike all other algorithms in the library, t-digest is empirical and has no mathematical basis for estimating its error
+and its results are dependent on the input data. However, for many common data distributions, it can produce excellent results.

Review Comment:
   Suggest appending "with large enough data." to the end of the last sentence.
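
The "Merge does not modify the input" bullet in the hunk above describes a contract that is easy to check in a test. The sketch below uses a hypothetical `ToySummary` class that simply keeps every value sorted; it is a stand-in for illustration, not the t-digest implementation or the library's API.

```python
from bisect import bisect_right, insort


class ToySummary:
    """Hypothetical stand-in for a sketch: stores all values in sorted order."""

    def __init__(self) -> None:
        self._values: list[float] = []

    def update(self, value: float) -> None:
        insort(self._values, value)

    def merge(self, other: "ToySummary") -> None:
        # Only read from `other`; its internal state is never touched.
        for v in other._values:
            insort(self._values, v)

    def values(self) -> list[float]:
        # Return a copy so callers cannot mutate internal state.
        return list(self._values)

    def get_rank(self, value: float) -> float:
        if not self._values:
            raise ValueError("empty summary")
        # Fraction of stored values that are <= value.
        return bisect_right(self._values, value) / len(self._values)


a, b = ToySummary(), ToySummary()
for v in (1.0, 2.0, 3.0):
    a.update(v)
for v in (10.0, 20.0):
    b.update(v)

before = b.values()
a.merge(b)
assert b.values() == before  # the merge input is left unchanged
```

A test of this shape (snapshot the input, merge, compare) is one way to guard such a contract against regressions.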



##########
docs/source/quantiles/index.rst:
##########
@@ -10,17 +10,21 @@ in the stream.
 These sketches may be used to compute approximate histograms, Probability Mass Functions (PMFs), or
 Cumulative Distribution Functions (CDFs).
 
-The library provides three types of quantiles sketches, each of which has generic items as well as versions
-specific to a given numeric type (e.g. integer or floating point values). All three types provide error
-bounds on rank estimation with proven probabilistic error distributions.
+The library provides four types of quantiles sketches, three of which have generic items as well as versions
+specific to a given numeric type (e.g. integer or floating point values). Those three types provide error
+bounds on rank estimation with proven probabilistic error distributions. t-digest is a heuristic-based sketch
+that works only on numeric data, and while the error properties are not guaranteed, the sketch typically
+does a good job with small storage.

Review Comment:
   Suggest appending "and large enough input data." to the end of the last sentence.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@datasketches.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
