This is an automated email from the ASF dual-hosted git repository.

leerho pushed a commit to branch master
in repository 
https://gitbox.apache.org/repos/asf/incubator-datasketches-website.git


The following commit(s) were added to refs/heads/master by this push:
     new a7c07ec  Add documentation for ReqSketch
a7c07ec is described below

commit a7c07ec44a17f40e0af2e78fadba214f9c78a4b1
Author: Lee Rhodes <[email protected]>
AuthorDate: Mon Oct 19 12:04:02 2020 -0700

    Add documentation for ReqSketch
---
 docs/REQ/ReqAccuracyAdversarial.md    |  80 ++++++++++++++++++++++++++++++----
 docs/REQ/ReqAccuracyRandomShuffled.md |  33 +++++---------
 docs/img/req/FlipFlopPattern.png      | Bin 0 -> 21053 bytes
 docs/img/req/RandomPattern.png        | Bin 0 -> 21357 bytes
 docs/img/req/ReversedPattern.png      | Bin 0 -> 21011 bytes
 docs/img/req/SortedPattern.png        | Bin 0 -> 20843 bytes
 docs/img/req/SqrtPattern.png          | Bin 0 -> 20958 bytes
 docs/img/req/ZoominPattern.png        | Bin 0 -> 20951 bytes
 docs/img/req/ZoomoutPattern.png       | Bin 0 -> 21077 bytes
 9 files changed, 83 insertions(+), 30 deletions(-)

diff --git a/docs/REQ/ReqAccuracyAdversarial.md 
b/docs/REQ/ReqAccuracyAdversarial.md
index 4dbc335..cd82c18 100644
--- a/docs/REQ/ReqAccuracyAdversarial.md
+++ b/docs/REQ/ReqAccuracyAdversarial.md
@@ -21,20 +21,82 @@ layout: doc_page
 -->
 # ReqSketch Accuracy with Adversarial Streams
 
-<img class="doc-img-full" 
src="{{site.docs_img_dir}}/req/ReqErrEqHraK50SL20T12_LT_Sh.png" 
alt="/req/ReqErrEqHraK50SL20T12_LT_Sh.png" />
+This set of tests characterizes the accuracy (or, more precisely, the rank error) 
of the ReqSketch using specifically selected adversarial streams.  The goal of 
this suite of tests is to understand how the rank error of the sketch behaves 
across all ranks with these specific stream patterns.  All of these tests are 
run with the same configuration except for the choice of the adversarial stream 
pattern.
 
-<img class="doc-img-full" 
src="{{site.docs_img_dir}}/req/ReqErrEqHraK50SL20T12_LT_NoSh.png" 
alt="/req/ReqErrEqHraK50SL20T12_LT_Sh.png" />
+The design of these tests is quite different from the tests for the *Random 
Shuffled Streams*.  Here, each test has one pattern and running multiple trials 
on the same pattern will not produce a nice distribution of error that we can 
easily analyze. We would like to capture the ranks where the pattern creates 
the largest error. These aberrant ranks could occur anywhere in the stream.  
Instead of choosing 100 plot points where the error is exclusively measured, we 
want to measure the erro [...]
 
-<img class="doc-img-full" 
src="{{site.docs_img_dir}}/req/ReqErrEqHraK50SL20T0_LT_Sorted.png" 
alt="/req/ReqErrEqHraK50SL20T12_LT_Sh.png" />
+In this case we collect the statistics of all the errors in 100 contiguous 
intervals of the stream. For a stream length of 2^20, each interval consists of 
about ten thousand values.  The errors from these 10K values are fed into a 
standard quantile sketch as before, and we extract 3 statistical quantile points, 
-3SD, median and +3SD, and plot those 3 values at each of the 100 plot points. 
 
-<img class="doc-img-full" 
src="{{site.docs_img_dir}}/req/ReqErrEqHraK50SL20T0_LT_Reversed.png" 
alt="/req/ReqErrEqHraK50SL20T12_LT_Sh.png" />
+As you can see, some of these patterns challenge our current a priori 
calculation of the error bounds, which means we may need to adjust them 
somewhat. If we do, these plots will be regenerated. 
 
-<img class="doc-img-full" 
src="{{site.docs_img_dir}}/req/ReqErrEqHraK50SL20T0_LT_Random.png" 
alt="/req/ReqErrEqHraK50SL20T12_LT_Sh.png" />
+Those who are interested in the actual code that runs these tests can examine 
the following links.
+ 
+* 
[Code](https://github.com/apache/incubator-datasketches-characterization/blob/master/src/test/java/org/apache/datasketches/characterization/quantiles/ReqSketchAccuracyProfile2.java):
 The code used to generate these characterization studies.
+* 
[Config](https://github.com/apache/incubator-datasketches-characterization/blob/master/src/main/resources/quantiles/ReqSketchAccuracy2Job.conf):
 The human readable and editable configuration file that instructs the above 
code with the specific properties used to run the test. These configuration 
properties are different for each of the following plots and summarized below 
with each plot.
 
-<img class="doc-img-full" 
src="{{site.docs_img_dir}}/req/ReqErrEqHraK50SL20T0_LT_Zoomin.png" 
alt="/req/ReqErrEqHraK50SL20T12_LT_Sh.png" />
+## Test Design
+* Stream Length (SL): 2^20
+* Stream Values: Natural numbers, &#x2115;<sub>1</sub>, from 1 to SL, 
expressed as 32-bit floats.
+* Y-axis: The absolute error of the sketch *getRank(value)* method.
+* X-axis: The normalized rank [0.0, 1.0]
+* Plot Points (PP): 100.  Equally spaced points along the X-axis starting at 
*1.0/SL* and ending at 1.0. 
+* Trial:
+       * The stream is generated according to the chosen adversarial pattern.
+       * At each Plot Point, we compute the rank errors of all ~10K points 
from the preceding PP to the current PP.
+       * These 10K error values are fed into an error quantile sketch 
associated with the current PP.
+       * 3 quantile values (-3SD, Median, +3SD) are extracted from each error 
sketch. These error quantiles correspond to the standard normal distribution at 
the median and at +/- 3SD, where SD stands for Standard Deviation from the mean.
+* Plotting:
+       * The points of each error quantile are connected by lines to form contours 
of the error distribution, where the area between the +/- 3SD contours is the 
99.7% confidence interval.
+       * In addition to the error contours, 6 dashed contours (with colors 
corresponding to the error contours) represent the a priori estimates of the 
error at each of the +/- standard deviations computed from the sketch's 
*getRankLowerBound(double, int)* and *getRankUpperBound(double, int)* methods.
 
-<img class="doc-img-full" 
src="{{site.docs_img_dir}}/req/ReqErrEqHraK50SL20T0_LT_Zoomout.png" 
alt="/req/ReqErrEqHraK50SL20T12_LT_Sh.png" />
+## Specific Configurations
+### Common Configuration for the following plots
+* K=50: the sketch sizing & accuracy parameter
+* HRA: High Rank Accuracy
+* Crit=LT: Comparison criterion: LT = Less-Than
+* SL=2^20: StreamLength
+* Eq Spaced: Equally spaced Plot Points (PP)
+* PP=100: Number of plot points on the x-axis
 
-<img class="doc-img-full" 
src="{{site.docs_img_dir}}/req/ReqErrEqHraK50SL20T0_LT_Sqrt.png" 
alt="/req/ReqErrEqHraK50SL20T12_LT_Sh.png" />
 
-<img class="doc-img-full" 
src="{{site.docs_img_dir}}/req/ReqErrEqHraK50SL20T0_LT_FlipFlop.png" 
alt="/req/ReqErrEqHraK50SL20T12_LT_Sh.png" />
\ No newline at end of file
+### Plot 1 Adversarial Pattern: Sorted
+
+<img class="doc-img-qtr" src="{{site.docs_img_dir}}/req/SortedPattern.png" 
alt="/req/SortedPattern.png" />
+
+<img class="doc-img-full" 
src="{{site.docs_img_dir}}/req/ReqErrEqHraK50SL20T0_LT_Sorted.png" 
alt="/req/ReqErrEqHraK50SL20T0_LT_Sorted.png" />
+
+### Plot 2 Adversarial Pattern: Reversed
+
+<img class="doc-img-qtr" src="{{site.docs_img_dir}}/req/ReversedPattern.png" 
alt="/req/ReversedPattern.png" />
+
+<img class="doc-img-full" 
src="{{site.docs_img_dir}}/req/ReqErrEqHraK50SL20T0_LT_Reversed.png" 
alt="/req/ReqErrEqHraK50SL20T0_LT_Reversed.png" />
+
+### Plot 3 Adversarial Pattern: Random
+
+<img class="doc-img-qtr" src="{{site.docs_img_dir}}/req/RandomPattern.png" 
alt="/req/RandomPattern.png" />
+
+<img class="doc-img-full" 
src="{{site.docs_img_dir}}/req/ReqErrEqHraK50SL20T0_LT_Random.png" 
alt="/req/ReqErrEqHraK50SL20T0_LT_Random.png" />
+
+### Plot 4 Adversarial Pattern: Zoomin
+
+<img class="doc-img-qtr" src="{{site.docs_img_dir}}/req/ZoominPattern.png" 
alt="/req/ZoominPattern.png" />
+
+<img class="doc-img-full" 
src="{{site.docs_img_dir}}/req/ReqErrEqHraK50SL20T0_LT_Zoomin.png" 
alt="/req/ReqErrEqHraK50SL20T0_LT_Zoomin.png" />
+
+### Plot 5 Adversarial Pattern: Zoomout
+
+<img class="doc-img-qtr" src="{{site.docs_img_dir}}/req/ZoomoutPattern.png" 
alt="/req/ZoomoutPattern.png" />
+
+<img class="doc-img-full" 
src="{{site.docs_img_dir}}/req/ReqErrEqHraK50SL20T0_LT_Zoomout.png" 
alt="/req/ReqErrEqHraK50SL20T0_LT_Zoomout.png" />
+
+### Plot 6 Adversarial Pattern: Sqrt
+
+<img class="doc-img-qtr" src="{{site.docs_img_dir}}/req/SqrtPattern.png" 
alt="/req/SqrtPattern.png" />
+
+<img class="doc-img-full" 
src="{{site.docs_img_dir}}/req/ReqErrEqHraK50SL20T0_LT_Sqrt.png" 
alt="/req/ReqErrEqHraK50SL20T0_LT_Sqrt.png" />
+
+### Plot 7 Adversarial Pattern: FlipFlop
+
+<img class="doc-img-qtr" src="{{site.docs_img_dir}}/req/FlipFlopPattern.png" 
alt="/req/FlipFlopPattern.png" />
+
+<img class="doc-img-full" 
src="{{site.docs_img_dir}}/req/ReqErrEqHraK50SL20T0_LT_FlipFlop.png" 
alt="/req/ReqErrEqHraK50SL20T0_LT_FlipFlop.png" />
\ No newline at end of file
diff --git a/docs/REQ/ReqAccuracyRandomShuffled.md 
b/docs/REQ/ReqAccuracyRandomShuffled.md
index 9a5d652..086ae9e 100644
--- a/docs/REQ/ReqAccuracyRandomShuffled.md
+++ b/docs/REQ/ReqAccuracyRandomShuffled.md
@@ -20,7 +20,12 @@ layout: doc_page
     under the License.
 -->
 # ReqSketch Accuracy with Random Shuffled Streams
-This set of tests characterize the accuracy of the ReqSketch using random 
shuffled streams.  
+This set of tests characterizes the accuracy (or, more precisely, the rank error) 
of the ReqSketch using random shuffled streams.  The goal of this suite of 
tests is to understand how the rank error of the sketch behaves across all 
ranks.  All of these tests are run with the same stream length. The two major 
parameters that are varied are the sketch's *K*, which affects size and 
accuracy of the sketch, and the *HRA / LRA* parameter, which selects the region 
of highest accuracy is the high r [...]
+
+These tests also confirm that the a priori predictions of the error bounds are 
reasonable and relatively conservative.  The computation of these bounds is 
based on empirical measurements derived from tests such as these and is 
subject to some tuning as we understand the sketch's error behavior over a 
wider selection of streams.
+
+Those who are interested in the actual code that runs these tests can examine 
the following links.
+ 
 * 
[Code](https://github.com/apache/incubator-datasketches-characterization/blob/master/src/test/java/org/apache/datasketches/characterization/quantiles/ReqSketchAccuracyProfile.java):
 The code used to generate these characterization studies.
 * 
[Config](https://github.com/apache/incubator-datasketches-characterization/blob/master/src/main/resources/quantiles/ReqSketchAccuracyJob.conf):
 The human readable and editable configuration file that instructs the above 
code with the specific properties used to run the test. These configuration 
properties are different for each of the following plots and summarized below 
with each plot.
 
@@ -44,52 +49,38 @@ This set of tests characterize the accuracy of the 
ReqSketch using random shuffl
 
 
 ## Specific Configurations
-
-### Plot 1
-* K=12: the sketch sizing & accuracy parameter
+### Common Configuration for the following plots
 * SL=2^20: StreamLength
-* HRA: High Rank Accuracy
 * Eq Spaced: Equally spaced Plot Points (PP)
 * PP=100: Number of plot points on the x-axis
 * LgT=12: Number of trials = 2^12
-* Crit=LT: Comparison criterion: LT = Less-Than
 * Shuffled: Random shuffle of the input stream for each trial
 
+### Plot 1
+* K=12: the sketch sizing & accuracy parameter
+* HRA: High Rank Accuracy
+* Crit=LT: Comparison criterion: LT = Less-Than
+
 <img class="doc-img-full" 
src="{{site.docs_img_dir}}/req/ReqErrEqHraK12SL20T12_LT_Sh.png" 
alt="/req/ReqErrEqHraK50SL20T12_LT_Sh.png" />
 
 ### Plot 2
 * K=12: the sketch sizing & accuracy parameter
-* SL=2^20: StreamLength
 * LRA: Low Rank Accuracy
-* Eq Spaced: Equally spaced Plot Points (PP)
-* PP=100: Number of plot points on the x-axis
-* LgT=12: Number of trials = 2^12
 * Crit=LE: Comparison criterion: LE = Less-Than or Equal
-* Shuffled: Random shuffle of the input stream for each trial
 
 <img class="doc-img-full" 
src="{{site.docs_img_dir}}/req/ReqErrEqLraK12SL20T12_LE_Sh.png" 
alt="/req/ReqErrEqLraK50SL20T12_LE_Sh.png" />
 
 ### Plot 3
 * K=50: the sketch sizing & accuracy parameter
-* SL=2^20: StreamLength
 * HRA: High Rank Accuracy
-* Eq Spaced: Equally spaced Plot Points (PP)
-* PP=100: Number of plot points on the x-axis
-* LgT=12: Number of trials = 2^12
 * Crit=LT: Comparison criterion: LT = Less-Than
-* Shuffled: Random shuffle of the input stream for each trial
 
 <img class="doc-img-full" 
src="{{site.docs_img_dir}}/req/ReqErrEqHraK50SL20T12_LT_Sh.png" 
alt="/req/ReqErrEqHraK50SL20T12_LT_Sh.png" />
 
 ### Plot 4
 * K=50: the sketch sizing & accuracy parameter
-* SL=2^20: StreamLength
 * LRA: Low Rank Accuracy
-* Eq Spaced: Equally spaced Plot Points (PP)
-* PP=100: Number of plot points on the x-axis
-* LgT=12: Number of trials = 2^12
 * Crit=LE: Comparison criterion: LE = Less-Than or Equal
-* Shuffled: Random shuffle of the input stream for each trial
 
 <img class="doc-img-full" 
src="{{site.docs_img_dir}}/req/ReqErrEqLraK50SL20T12_LE_Sh.png" 
alt="/req/ReqErrEqLraK50SL20T12_LE_Sh.png" />
 
diff --git a/docs/img/req/FlipFlopPattern.png b/docs/img/req/FlipFlopPattern.png
new file mode 100644
index 0000000..21763d6
Binary files /dev/null and b/docs/img/req/FlipFlopPattern.png differ
diff --git a/docs/img/req/RandomPattern.png b/docs/img/req/RandomPattern.png
new file mode 100644
index 0000000..b90b3f3
Binary files /dev/null and b/docs/img/req/RandomPattern.png differ
diff --git a/docs/img/req/ReversedPattern.png b/docs/img/req/ReversedPattern.png
new file mode 100644
index 0000000..163cd03
Binary files /dev/null and b/docs/img/req/ReversedPattern.png differ
diff --git a/docs/img/req/SortedPattern.png b/docs/img/req/SortedPattern.png
new file mode 100644
index 0000000..6cdb872
Binary files /dev/null and b/docs/img/req/SortedPattern.png differ
diff --git a/docs/img/req/SqrtPattern.png b/docs/img/req/SqrtPattern.png
new file mode 100644
index 0000000..a89d5e8
Binary files /dev/null and b/docs/img/req/SqrtPattern.png differ
diff --git a/docs/img/req/ZoominPattern.png b/docs/img/req/ZoominPattern.png
new file mode 100644
index 0000000..f3c40b7
Binary files /dev/null and b/docs/img/req/ZoominPattern.png differ
diff --git a/docs/img/req/ZoomoutPattern.png b/docs/img/req/ZoomoutPattern.png
new file mode 100644
index 0000000..d9cb057
Binary files /dev/null and b/docs/img/req/ZoomoutPattern.png differ
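
Note on the Test Design in docs/REQ/ReqAccuracyAdversarial.md above: the arithmetic behind "about ten thousand values" per interval and the "99.7% confidence interval" can be checked with a few lines of Python. This is only an illustrative sketch and is not taken from the characterization repository. The names `normal_cdf`, `FRACTIONS` and `values_per_interval` are made up here, and the mapping of -3SD / median / +3SD to fractional ranks through the standard normal CDF is an assumption about how the quantile points are chosen.

```python
import math


def normal_cdf(z: float) -> float:
    """Standard normal CDF, computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


STREAM_LENGTH = 1 << 20  # SL = 2^20, as listed in the Test Design
PLOT_POINTS = 100        # PP = 100

# Assumed mapping: the fractional rank at which each error quantile
# would be read from the per-interval error distribution.
FRACTIONS = {
    "-3sd": normal_cdf(-3.0),
    "median": normal_cdf(0.0),
    "+3sd": normal_cdf(3.0),
}

# Number of stream values that fall between two adjacent plot points.
values_per_interval = STREAM_LENGTH / PLOT_POINTS

# Probability mass between the -3SD and +3SD quantiles.
coverage = FRACTIONS["+3sd"] - FRACTIONS["-3sd"]

print(f"values per interval: {values_per_interval:.2f}")
for name, fraction in FRACTIONS.items():
    print(f"{name:>6}: {fraction:.5f}")
print(f"coverage between +/-3SD: {coverage:.4f}")
```

The interval size works out to roughly 10.5K values, and the mass between the two outer quantiles to roughly 99.7%, in line with the figures quoted in the page text.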

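The per-interval error collection described under "Trial" in the adversarial page can be sketched roughly as follows. Several things here are assumptions and not the actual test code: a plain reservoir sample stands in for the ReqSketch (it is not a relative-error sketch and is used only to produce some rank estimate), an exact sort stands in for the error quantile sketch, the stream is shortened to 2^14 to keep the run quick, interval boundaries are taken by rounding, and only the Sorted, Reversed and Random patterns are generated. All function names are hypothetical.

```python
import bisect
import random

STREAM_LENGTH = 1 << 14  # shortened from the 2^20 used in the real tests
PLOT_POINTS = 100
FRACTIONS = (0.00135, 0.5, 0.99865)  # approx. -3SD, median, +3SD


def make_stream(pattern, n, seed=1):
    """Natural numbers 1..n as floats, ordered by the chosen pattern."""
    values = [float(v) for v in range(1, n + 1)]
    if pattern == "reversed":
        values.reverse()
    elif pattern == "random":
        random.Random(seed).shuffle(values)
    elif pattern != "sorted":
        raise ValueError(f"pattern not covered by this sketch: {pattern}")
    return values


def build_reservoir(stream, size, seed=7):
    """Stand-in for the sketch under test: a sorted uniform reservoir sample."""
    rng = random.Random(seed)
    sample = []
    for count, value in enumerate(stream, start=1):
        if len(sample) < size:
            sample.append(value)
        else:
            slot = rng.randrange(count)
            if slot < size:
                sample[slot] = value
    sample.sort()
    return sample


def estimated_rank(sample, value):
    """Normalized rank estimate under the Less-Than (LT) criterion."""
    return bisect.bisect_left(sample, value) / len(sample)


def interval_error_quantiles(stream, sample, plot_points, fractions):
    """For each plot point, return the error quantiles of its interval."""
    n = len(stream)
    sorted_values = sorted(stream)
    rows = []
    start = 0
    for pp in range(1, plot_points + 1):
        end = round(pp * n / plot_points)
        # With distinct values, the true LT rank of sorted_values[i] is i / n.
        errors = sorted(
            estimated_rank(sample, sorted_values[i]) - i / n
            for i in range(start, end)
        )
        rows.append(tuple(
            errors[min(len(errors) - 1, int(f * len(errors)))]
            for f in fractions
        ))
        start = end
    return rows


stream = make_stream("reversed", STREAM_LENGTH)
sample = build_reservoir(stream, size=512)
rows = interval_error_quantiles(stream, sample, PLOT_POINTS, FRACTIONS)
for pp in (1, 50, 100):
    low, median, high = rows[pp - 1]
    print(f"PP {pp:3d}: -3SD={low:+.5f} median={median:+.5f} +3SD={high:+.5f}")
```

Each row holds the three values that would be plotted at one plot point. Connecting the first, second and third entries across rows gives the three error contours described under "Plotting".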
