[
https://issues.apache.org/jira/browse/STATISTICS-71?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17774946#comment-17774946
]
Alex Herbert commented on STATISTICS-71:
----------------------------------------
Updated the tests to use a common base test in commit:
c8d6e07107bc1643a059a8a45b7c7bfc785bd954
This reuses the DoubleTolerance interface from the test suite in the
distribution package. It allows a tolerance for equality to be specified using
ULP, relative error or absolute error, or a combination of them.
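
To illustrate the idea of combinable tolerances, here is a minimal sketch. It is not the DoubleTolerance interface from the test suite; the Tolerance type and all of its method names are hypothetical and chosen only for this example.

```java
/** Hypothetical sketch of a combinable tolerance; not the actual DoubleTolerance API. */
interface Tolerance {
    /** Returns true if the two values are considered equal. */
    boolean test(double a, double b);

    /** Passes if either this tolerance or the other passes. */
    default Tolerance or(Tolerance other) {
        return (a, b) -> test(a, b) || other.test(a, b);
    }

    /** Passes only if both this tolerance and the other pass. */
    default Tolerance and(Tolerance other) {
        return (a, b) -> test(a, b) && other.test(a, b);
    }

    /** Absolute error: |a - b| <= eps. */
    static Tolerance absolute(double eps) {
        return (a, b) -> Math.abs(a - b) <= eps;
    }

    /** Relative error: |a - b| <= eps * max(|a|, |b|). */
    static Tolerance relative(double eps) {
        return (a, b) -> Math.abs(a - b) <= eps * Math.max(Math.abs(a), Math.abs(b));
    }

    /** ULP error: the values are at most maxUlps representable doubles apart. */
    static Tolerance ulps(long maxUlps) {
        return (a, b) -> {
            if (Double.isNaN(a) || Double.isNaN(b)) {
                return false;
            }
            long x = ordered(a);
            long y = ordered(b);
            // hi - lo wraps to a negative value if the true distance exceeds Long.MAX_VALUE
            long d = Math.max(x, y) - Math.min(x, y);
            return d >= 0 && d <= maxUlps;
        };
    }

    /** Maps the bits of a double to a long that is ordered like the double (-0.0 maps to 0). */
    static long ordered(double x) {
        long bits = Double.doubleToLongBits(x);
        return bits < 0 ? Long.MIN_VALUE - bits : bits;
    }
}
```

A test could then build a combined tolerance such as Tolerance.ulps(4).or(Tolerance.absolute(1e-300)) so that results near zero are not rejected by the ULP check alone.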
There are 4 main tests given an array of values:
# Compute the statistic from single values in an updating algorithm (e.g. via
a DoubleStream).
# Compute the statistic from an array of values (allows multi-pass algorithms).
# Divide the values into groups; compute a statistic instance for each
group using an updating algorithm; and then combine the instances (e.g. via a
parallel DoubleStream).
# Divide the values into groups; compute a statistic instance for each
group using an array of values; and then combine the instances (e.g. via a
parallel Stream<double[]>).
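
The four modes listed above can be sketched as follows. This is only an illustration: SimpleMean is a hypothetical stand-in for a statistic under test, and StatisticModes is not the base test class from the commit. The stream calls (DoubleStream.collect, Stream.collect, Arrays.stream) are standard JDK methods.

```java
import java.util.Arrays;
import java.util.stream.DoubleStream;

/** Hypothetical storeless mean used as a stand-in for a statistic under test. */
final class SimpleMean {
    private long n;
    private double mean;

    /** Array construction: a two-pass mean with a correction term. */
    static SimpleMean of(double... values) {
        SimpleMean m = new SimpleMean();
        if (values.length == 0) {
            return m;
        }
        double sum = 0;
        for (double v : values) {
            sum += v;
        }
        double xbar = sum / values.length;
        double correction = 0;
        for (double v : values) {
            correction += v - xbar;
        }
        m.n = values.length;
        m.mean = xbar + correction / values.length;
        return m;
    }

    /** Updating algorithm: adds a single value. */
    void accept(double v) {
        n++;
        mean += (v - mean) / n;
    }

    /** Merges the state of another instance into this one. */
    SimpleMean combine(SimpleMean other) {
        if (other.n == 0) {
            return this;
        }
        long total = n + other.n;
        mean += (other.mean - mean) * other.n / total;
        n = total;
        return this;
    }

    double getAsDouble() {
        return n == 0 ? Double.NaN : mean;
    }
}

/** The four computation modes, each returning the statistic value. */
final class StatisticModes {
    /** 1. Single values via a sequential DoubleStream. */
    static double fromValues(double[] values) {
        return DoubleStream.of(values)
            .collect(SimpleMean::new, SimpleMean::accept, SimpleMean::combine)
            .getAsDouble();
    }

    /** 2. The whole array at once. */
    static double fromArray(double[] values) {
        return SimpleMean.of(values).getAsDouble();
    }

    /** 3. Single values via a parallel DoubleStream, combining partial instances. */
    static double fromValuesCombined(double[] values) {
        return DoubleStream.of(values).parallel()
            .collect(SimpleMean::new, SimpleMean::accept, SimpleMean::combine)
            .getAsDouble();
    }

    /** 4. Arrays via a parallel Stream<double[]>, combining per-group instances. */
    static double fromArraysCombined(double[][] groups) {
        return Arrays.stream(groups).parallel()
            .collect(SimpleMean::new, (m, g) -> m.combine(SimpleMean.of(g)), SimpleMean::combine)
            .getAsDouble();
    }
}
```

The four results are expected to agree only within a tolerance, because each mode performs the floating-point operations in a different order.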
Added reference data from other implementations in:
c7fc6d6a991a4b330b59f367f31d331f5d53c15b
Note that the tests use a random order for the input data values. This requires
setting the tolerance to be more lenient than would be required for a fixed
input. The current tolerances have been stable over 1 week of testing, but they
may require updating if sporadic failures are observed.
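
A minimal sketch of why the input order matters: floating-point addition is not associative, so the same values in a different order can produce a slightly different result. The ShuffleCheck class below is hypothetical, and java.util.Random is used as a stand-in for whatever random source the test suite actually uses.

```java
import java.util.Random;

/** Hypothetical helper that measures how much a sum varies with input order. */
final class ShuffleCheck {
    /** Returns a shuffled copy (Fisher-Yates); the input array is not modified. */
    static double[] shuffled(double[] values, long seed) {
        double[] copy = values.clone();
        Random rng = new Random(seed);
        for (int i = copy.length - 1; i > 0; i--) {
            int j = rng.nextInt(i + 1);
            double tmp = copy[i];
            copy[i] = copy[j];
            copy[j] = tmp;
        }
        return copy;
    }

    /** Naive left-to-right sum; the result depends on the order of the values. */
    static double sum(double[] values) {
        double s = 0;
        for (double v : values) {
            s += v;
        }
        return s;
    }

    /** Largest relative deviation from the reference over a number of shuffles. */
    static double maxRelativeError(double[] values, double reference, int trials, long seed) {
        double max = 0;
        for (int t = 0; t < trials; t++) {
            double s = sum(shuffled(values, seed + t));
            max = Math.max(max, Math.abs(s - reference) / Math.abs(reference));
        }
        return max;
    }
}
```

Running maxRelativeError over many shuffles gives an empirical lower bound on the tolerance needed for a randomly ordered input; a fixed order would only ever exercise one of these deviations.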
> Implementation of Univariate Statistics
> ---------------------------------------
>
> Key: STATISTICS-71
> URL: https://issues.apache.org/jira/browse/STATISTICS-71
> Project: Commons Statistics
> Issue Type: Task
> Components: descriptive
> Reporter: Anirudh Joshi
> Assignee: Anirudh Joshi
> Priority: Minor
> Labels: gsoc, gsoc2023
>
> Jira ticket to track the implementation of the Univariate statistics required
> for the updated SummaryStatistics API.
> The implementation would be "storeless". It should be used for calculating
> statistics that can be computed in one pass through the data without storing
> the sample values.
> Currently I have defined the API as follows (this might evolve as I continue
> working):
> {code:java}
> import java.util.function.DoubleSupplier;
>
> public interface DoubleStorelessUnivariateStatistic extends DoubleSupplier {
>     DoubleStorelessUnivariateStatistic add(double v);
>     long getCount();
>     void combine(DoubleStorelessUnivariateStatistic other);
> }
> {code}
>
--
This message was sent by Atlassian Jira
(v8.20.10#820010)