psteitz 2004/03/02 18:32:25
Modified: math/xdocs/userguide stat.xml
Log:
Filled in missing content in univariate statistics section.
Revision Changes Path
1.10 +92 -11 jakarta-commons/math/xdocs/userguide/stat.xml
Index: stat.xml
===================================================================
RCS file: /home/cvs/jakarta-commons/math/xdocs/userguide/stat.xml,v
retrieving revision 1.9
retrieving revision 1.10
diff -u -r1.9 -r1.10
--- stat.xml 29 Feb 2004 21:25:08 -0000 1.9
+++ stat.xml 3 Mar 2004 02:32:25 -0000 1.10
@@ -57,7 +57,7 @@
all statistics, consists of <code>evaluate()</code> methods that take double[] arrays as arguments and return
the value of the statistic. This interface is extended by
<a href="../apidocs/org/apache/commons/math/stat/univariate/StorelessUnivariateStatistic.html">
- org.apache.commons.math.stat.univariate.StorelessUnivariateStatistic,</a> which adds <code>increment(),</code>
+ StorelessUnivariateStatistic,</a> which adds <code>increment(),</code>
<code>getResult()</code> and associated methods to support "storageless" implementations that
maintain counters, sums or other state information as values are added using the <code>increment()</code>
method.
@@ -65,29 +65,110 @@
<p>
Abstract implementations of the top level interfaces are provided in
<a href="../apidocs/org/apache/commons/math/stat/univariate/AbstractUnivariateStatistic.html">
- org.apache.commons.math.stat.univariate.AbstractUnivariateStatistic</a> and
+ AbstractUnivariateStatistic</a> and
<a href="../apidocs/org/apache/commons/math/stat/univariate/AbstractStorelessUnivariateStatistic.html">
- org.apache.commons.math.stat.univariate.AbstractStorelessUnivariateStatistic</a> respectively.
+ AbstractStorelessUnivariateStatistic</a> respectively.
</p>
<p>
Each statistic is implemented as a separate class, in one of the
subpackages (moment, rank, summary) and
each extends one of the abstract classes above (depending on whether or
not value storage is required to
compute the statistic).
There are several ways to instantiate and use statistics. Statistics can
be instantiated and used directly, but it is
- generally more convenient to access them using the provided aggregates:
+ generally more convenient (and efficient) to access them using the provided aggregates, <a href="../apidocs/org/apache/commons/math/stat/DescriptiveStatistics.html">
+ DescriptiveStatistics</a> and <a href="../apidocs/org/apache/commons/math/stat/SummaryStatistics.html">
+ SummaryStatistics.</a> <code>DescriptiveStatistics</code> maintains the input data in memory and has the capability
+ of producing "rolling" statistics computed from a "window" consisting of the most recently added values. <code>SummaryStatistics</code>
+ does not store the input data values in memory, so the statistics included in this aggregate are limited to those that can be
+ computed in one pass through the data without access to the full array of values.
+ </p>
+ <p>
<table>
- <tr><th>Aggregate</th><th>Statistics Included</th><th>Values stored?</th></tr>
+ <tr><th>Aggregate</th><th>Statistics Included</th><th>Values stored?</th><th>"Rolling" capability?</th></tr>
<tr><td><a href="../apidocs/org/apache/commons/math/stat/DescriptiveStatistics.html">
- org.apache.commons.math.stat.DescriptiveStatistics</a></td><td>All</td><td>Yes</td></tr>
+ DescriptiveStatistics</a></td><td>min, max, mean, geometric mean, n, sum, sum of squares, standard deviation, variance, percentiles, skewness, kurtosis, median</td><td>Yes</td><td>Yes</td></tr>
<tr><td><a href="../apidocs/org/apache/commons/math/stat/SummaryStatistics.html">
- org.apache.commons.math.stat.SummaryStatistics</a></td><td>min, max, mean, geometric mean, n, sum, sum of squares, standard deviation, variance</td><td>No</td></tr>
+ SummaryStatistics</a></td><td>min, max, mean, geometric mean, n, sum, sum of squares, standard deviation, variance</td><td>No</td><td>No</td></tr>
</table>
- TODO: add code sample
+ </p>
+ <p>
There is also a utility class, <a href="../apidocs/org/apache/commons/math/stat/StatUtils.html">
- org.apache.commons.math.stat.StatUtils,</a> that provides static methods for computing statistics
- from double[] arrays.
+ StatUtils,</a> that provides static methods for computing statistics
+ directly from double[] arrays.
</p>
+ <p>
+ Here are some examples showing how to compute univariate statistics.
+ <dl>
+ <dt>Compute summary statistics for a list of double values</dt>
+ <br></br>
+ <dd>Using the <code>DescriptiveStatistics</code> aggregate (values are stored in memory):
+ <source>
+// Get a DescriptiveStatistics instance using factory method
+DescriptiveStatistics stats = DescriptiveStatistics.newInstance();
+
+// Add the data from the array
+for( int i = 0; i < inputArray.length; i++) {
+ stats.addValue(inputArray[i]);
+}
+
+// Compute some statistics
+double mean = stats.getMean();
+double std = stats.getStandardDeviation();
+double median = stats.getMedian();
+ </source>
+ </dd>
+ <dd>Using the <code>SummaryStatistics</code> aggregate (values are <strong>not</strong> stored in memory):
+ <source>
+// Get a SummaryStatistics instance using factory method
+SummaryStatistics stats = SummaryStatistics.newInstance();
+
+// Read data from an input stream, adding values and updating the sums, counters, etc. needed for the statistics
+String line; // assume in is an open BufferedReader
+while ((line = in.readLine()) != null) {
+    stats.addValue(Double.parseDouble(line.trim()));
+}
+in.close();
+
+// Compute the statistics
+double mean = stats.getMean();
+double std = stats.getStandardDeviation();
+//double median = stats.getMedian(); <-- NOT AVAILABLE in SummaryStatistics
+ </source>
+ </dd>
+ <dd>Using the <code>StatUtils</code> utility class:
+ <source>
+// Compute statistics directly from the array -- assume values is a double[] array
+double mean = StatUtils.mean(values);
+double variance = StatUtils.variance(values);
+double median = StatUtils.percentile(values, 50);
+// Compute the mean of the first three values in the array
+mean = StatUtils.mean(values, 0, 3);
+ </source>
+ </dd>
+ <dt>Maintain a "rolling mean" of the most recent 100 values from an input stream</dt>
+ <br></br>
+ <dd>Use a <code>DescriptiveStatistics</code> instance with window size set to 100
+ <source>
+// Create a DescriptiveStatistics instance and set the window size to 100
+DescriptiveStatistics stats = DescriptiveStatistics.newInstance();
+stats.setWindowSize(100);
+// Read data from an input stream (assume in is an open BufferedReader), displaying
+// the mean of the most recent 100 observations after every 100 observations
+long nLines = 0;
+String line;
+while ((line = in.readLine()) != null) {
+    stats.addValue(Double.parseDouble(line.trim()));
+    if (++nLines == 100) {
+        nLines = 0;
+        System.out.println(stats.getMean()); // "rolling" mean of most recent 100 values
+    }
+}
+in.close();
+ </source>
+ </dd>
+ </dl>
+ </p>
</subsection>
+
<subsection name="1.3 Frequency distributions" href="frequency">
<p>This is yet to be written. Any contributions will be gratefully
accepted!</p>
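
Note on the text added above: it distinguishes aggregates that keep the input values in memory (and can therefore report "rolling" statistics over a window) from aggregates that only update counters and sums in one pass. The plain-Java sketch below illustrates that distinction only. It is not the Commons Math implementation: the class names OnePassStats and RollingMean are hypothetical, and the use of Welford's update for the one-pass variance is an assumption made here for illustration.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical one-pass ("storeless") accumulator: keeps counters, never the values. */
final class OnePassStats {
    private long n = 0;
    private double mean = 0.0;
    private double m2 = 0.0; // running sum of squared deviations from the mean

    /** Update the counters with one value (Welford's update). */
    void addValue(double x) {
        n++;
        double delta = x - mean;
        mean += delta / n;
        m2 += delta * (x - mean);
    }

    long getN() { return n; }

    double getMean() { return n == 0 ? Double.NaN : mean; }

    /** Bias-corrected sample variance; returning 0 for a single value is a convention chosen here. */
    double getVariance() {
        if (n == 0) return Double.NaN;
        return n == 1 ? 0.0 : m2 / (n - 1);
    }
}

/** Hypothetical rolling mean: stores only the most recent windowSize values. */
final class RollingMean {
    private final int windowSize;
    private final Deque<Double> window = new ArrayDeque<>();
    private double sum = 0.0;

    RollingMean(int windowSize) {
        if (windowSize < 1) {
            throw new IllegalArgumentException("windowSize must be positive");
        }
        this.windowSize = windowSize;
    }

    /** Add a value, evicting the oldest one once the window is full. */
    void addValue(double x) {
        if (window.size() == windowSize) {
            sum -= window.removeFirst();
        }
        window.addLast(x);
        sum += x;
    }

    double getMean() { return window.isEmpty() ? Double.NaN : sum / window.size(); }
}
```

The one-pass accumulator cannot report a median or percentile because it never sees the full array, which matches the limitation described for SummaryStatistics. A windowed aggregate has to hold at least the window's worth of values.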