This is an automated email from the ASF dual-hosted git repository.
git-site-role pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/datasketches-website.git
The following commit(s) were added to refs/heads/asf-site by this push:
new 40310d60 Automatic Site Publish by Buildbot
40310d60 is described below
commit 40310d605efcb6b5e168741492c7caa4c8a59b43
Author: buildbot <[email protected]>
AuthorDate: Fri Sep 30 18:38:20 2022 +0000
Automatic Site Publish by Buildbot
---
output/docs/KLL/KLLAccuracyAndSize.html | 13 ++++++-------
output/docs/KLL/KLLSketch.html | 23 ++++++++++++++---------
output/docs/Quantiles/QuantilesOverview.html | 2 +-
3 files changed, 21 insertions(+), 17 deletions(-)
diff --git a/output/docs/KLL/KLLAccuracyAndSize.html
b/output/docs/KLL/KLLAccuracyAndSize.html
index b718455f..3fac9a46 100644
--- a/output/docs/KLL/KLLAccuracyAndSize.html
+++ b/output/docs/KLL/KLLAccuracyAndSize.html
@@ -510,17 +510,16 @@
-->
<h1 id="kll-sketch-accuracy-and-size">KLL Sketch Accuracy and Size</h1>
-<p>The accuracy of a quantile sketch is a function of the configured value
<i>K</i>, which also affects
-the overall size of the sketch (default K = 200).</p>
+<p>The accuracy of the KLL quantile sketch is a function of the configured
<i>K</i>, which also affects the overall size of the sketch (default K =
200).</p>
-<p>The accuracy quantiles sketches is specified and measured with respect to
the <em>rank</em> only, not the values.</p>
+<p>The accuracy of quantiles sketches is specified and measured with respect
to the <em>rank</em> only, not the quantiles.</p>
-<p>The KLL Sketch has <em>absolute error</em>. For example, a specified
accuracy of 1% at the median (rank = 0.50) means that the true value (if you
could extract it from the set) should be
-between <em>getQuantile(0.49)</em> and <em>getQuantile(0.51)</em>. This same
1% error applied at a rank of 0.95 means that the true value should be between
<em>getQuantile(0.94)</em> and <em>getQuantile(0.96)</em>. In other words, the
error is a fixed +/- epsilon for the entire range of rank values.</p>
+<p>The KLL Sketch has <em>absolute error</em>. For example, a specified rank
accuracy of 1% at the median (rank = 0.50) means that the true quantile (if you
could extract it from the set) should be between <em>getQuantile(0.49)</em> and
<em>getQuantile(0.51)</em>.
+This same 1% error applied at a rank of 0.95 means that the true quantile
should be between <em>getQuantile(0.94)</em> and <em>getQuantile(0.96)</em>. In
other words, the error is a fixed +/- epsilon for the entire range of ranks.</p>
<p>The approximate rank error values listed in the second row of the header in
the table below can be computed using the function
<i>KLLSketch.getNormalizedRankError(int k, false)</i>. The third row shows the
double-sided error that applies to a portion of the distribution such as an
element of a PMF (a bar in a histogram) that is subject to rank error on both
sides. It can be computed using the function
<i>KLLSketch.getNormalizedRankError(int k, true)</i>.</p>
-<h2
id="kllfloatssketch-java-or-kll_sketchfloat-c-serialized-size-in-bytes-and-rank-error">KllFloatsSketch
(Java) or kll_sketch<float> (C++) serialized size in bytes and rank
error</h2>
+<h2
id="kllfloatssketch-java-or-kll_sketchfloat-c-serialized-size-in-bytes-from-k-or-rank-error--vs-n">KllFloatsSketch
(Java) or kll_sketch<float> (C++) serialized size in bytes from
<em>K</em> or rank error % vs. <em>N</em>.</h2>
<table>
<thead>
@@ -919,7 +918,7 @@ between <em>getQuantile(0.49)</em> and
<em>getQuantile(0.51)</em>. This same 1%
</tbody>
</table>
-<h2
id="klldoublessketch-java-or-kll_sketchdouble-c-serialized-size-in-bytes-and-rank-error">KllDoublesSketch
(Java) or kll_sketch<double> (C++) serialized size in bytes and rank
error</h2>
+<h2
id="klldoublessketch-java-or-kll_sketchdouble-c-serialized-size-in-bytes-from-k-or-rank-error--vs-n">KllDoublesSketch
(Java) or kll_sketch<double> (C++) serialized size in bytes from
<em>K</em> or rank error % vs. <em>N</em>.</h2>
<table>
<thead>
diff --git a/output/docs/KLL/KLLSketch.html b/output/docs/KLL/KLLSketch.html
index c95860ca..dbf5d1f2 100644
--- a/output/docs/KLL/KLLSketch.html
+++ b/output/docs/KLL/KLLSketch.html
@@ -514,9 +514,14 @@
See <a href="https://arxiv.org/abs/1603.05346v2">Optimal Quantile
Approximation in Streams, by Zohar Karnin, Kevin Lang, Edo Liberty</a>.
The name KLL is composed of the initial letters of the last names of the
authors.</p>
-<p>The usage of KllSketch is very similar to DoublesSketch. The key feature of
this sketch is its compactness for a given accuracy. It is implemented with
both float and double values and can be configured for use on-heap or off-heap
(Direct mode).
-The parameter K that affects the accuracy and the size of the sketch is not
restricted to powers of 2.
-The default of 200 was chosen to yield approximately the same normalized rank
error (1.65%) as the original DoublesSketch (K=128, error 1.73%).</p>
+<p>The usage of KllSketch is very similar to the classic quantiles
DoublesSketch.</p>
+
+<ul>
+ <li>The key feature of this sketch is its compactness for a given
accuracy.</li>
+ <li>It is separately implemented for both float and double values and can be
configured for use on-heap or off-heap (Direct mode).</li>
+ <li>The parameter K that affects the accuracy and the size of the sketch is
not restricted to powers of 2.</li>
+ <li>The default of 200 was chosen to yield approximately the same normalized
rank error (1.65%) as the classic quantiles DoublesSketch (K=128, error
1.73%).</li>
+</ul>
<h3 id="java-example">Java example</h3>
@@ -534,18 +539,18 @@ double rankOf1000 = sketch.getRank(1000);
<h3
id="differences-of-kllsketch-from-the-original-quantiles-doublessketch">Differences
of KllSketch from the original quantiles DoublesSketch</h3>
<ul>
- <li>KLL has a smaller size for the same accuracy</li>
- <li>KLL is slightly faster to update</li>
- <li>The KLL parameter K doesn’t have to be power of 2</li>
- <li>KLL operates with either float values or double values</li>
- <li>KLL uses a merge method rather than a union object</li>
+ <li>KLL has a smaller size for the same accuracy.</li>
+ <li>KLL is slightly faster to update.</li>
+ <li>The KLL parameter K doesn’t have to be power of 2.</li>
+ <li>KLL operates with either float values or double values.</li>
+ <li>KLL uses a merge method rather than a union object.</li>
</ul>
<p>The starting point for the comparison is setting K in such a way that rank
error would be approximately the same. As pointed out above, the default K for
both sketches should achieve this. Here is the comparison of the single-sided
normalized rank error (getRank() method) for the default K:</p>
<p><img class="doc-img-full"
src="/docs/img/kll/kll200-vs-ds128-rank-error.png" alt="RankError" /></p>
-<p>DoublesSketch has two forms with different serialized sizes:
UpdateDoublesSketch and CompactDoublesSketch. The KLL sketches makes this
distinction differently. When the KllSketch is serialized using
<em>toByteArray()</em> it is always in a compact form and immutable. When the
KllSketch is on-heap it is always updatable. It can be created off-heap using
the static factory method <em>newDirectInstance(…)</em> method, which is also
updatable. It is possible to move from off-heap (Direct) [...]
+<p>The classic quantiles DoublesSketch has two forms with different serialized
sizes: UpdateDoublesSketch and CompactDoublesSketch. The KLL sketches makes
this distinction differently. When the KllSketch is serialized using
<em>toByteArray()</em> it is always in a compact form and immutable. When the
KllSketch is on-heap it is always updatable. It can be created off-heap using
the static factory method <em>newDirectInstance(…)</em> method, which is also
updatable. It is possible to move [...]
<p>Here is the comparison of serialized sizes:</p>
diff --git a/output/docs/Quantiles/QuantilesOverview.html
b/output/docs/Quantiles/QuantilesOverview.html
index 3524a45f..2b02270f 100644
--- a/output/docs/Quantiles/QuantilesOverview.html
+++ b/output/docs/Quantiles/QuantilesOverview.html
@@ -514,7 +514,7 @@
<p>This is an overview of the three types of quantiles sketches in the
library. Each of these quantile types may have one or more specific
implementations.</p>
-<p>The mathematical error bounds of all the quantile sketches is specified
with respect to rank and not with respect to quantile values. In other words,
the difference between the rank upper bound and the rank lower bound is the
confidence interval and can be expressed as a percent of the overall rank
distribution (which is 1.0) and is the mathematically derived error for a
specific configuration of the sketch.</p>
+<p>The mathematical error bounds of all the quantile sketches is specified
with respect to rank and not with respect to quantiles. In other words, the
difference between the rank upper bound and the rank lower bound is the
confidence interval and can be expressed as a percent of the overall rank
distribution (which is 1.0) and is the mathematically derived error for a
specific configuration of the sketch.</p>
<p>Although the quantile upper bound and quantile lower bounds can be
approximately computed from the rank upper bound and rank lower bound, and the
difference between the quantile bounds is also an approximate confidence
interval, the size of the quantile confidence interval may not be meaningful
and is not constrained by the defined error of the sketch.</p>
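
Note on the rank-error wording in the pages above: the statement that a 1% rank
error at rank 0.50 places the true quantile between getQuantile(0.49) and
getQuantile(0.51) can be illustrated with a small self-contained sketch. The
code below does not use the DataSketches library; an exactly sorted array
stands in for the sketch, and the class RankErrorBounds and its methods
quantileAt and quantileBounds are hypothetical names introduced only for this
illustration.

```java
// Illustration only: an exactly sorted array plays the role of the sketch.
// RankErrorBounds, quantileAt and quantileBounds are hypothetical names and
// are not part of the DataSketches API.
class RankErrorBounds {

    // Returns the value found at a normalized rank in [0, 1] of a sorted array.
    // Ranks outside [0, 1] are clamped to the nearest end.
    static double quantileAt(double[] sorted, double rank) {
        double clamped = Math.max(0.0, Math.min(1.0, rank));
        int index = (int) Math.min(sorted.length - 1,
                                   Math.floor(clamped * sorted.length));
        return sorted[index];
    }

    // Returns {lower, upper}: the quantiles at (rank - eps) and (rank + eps).
    // With an absolute rank error of eps, the true quantile at 'rank' is
    // expected to fall inside this interval.
    static double[] quantileBounds(double[] sorted, double rank, double eps) {
        return new double[] {
            quantileAt(sorted, rank - eps),
            quantileAt(sorted, rank + eps)
        };
    }

    public static void main(String[] args) {
        // Sorted stream of the values 1..1000.
        double[] data = new double[1000];
        for (int i = 0; i < data.length; i++) {
            data[i] = i + 1;
        }

        double eps = 0.01; // the same fixed +/- epsilon at every rank
        double[] median = quantileBounds(data, 0.50, eps);
        double[] p95 = quantileBounds(data, 0.95, eps);

        System.out.println("rank 0.50: [" + median[0] + ", " + median[1] + "]");
        System.out.println("rank 0.95: [" + p95[0] + ", " + p95[1] + "]");
    }
}
```

The interval is defined in rank space, so its width in quantile units depends
on the data: on a uniform stream it is narrow, while on a heavy-tailed stream
the same +/- 1% of rank near 0.95 could span a much larger range of values.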