[jira] [Commented] (STATISTICS-71) Implementation of Univariate Statistics

Alex Herbert (Jira) Fri, 13 Oct 2023 08:15:04 -0700


    [ 
https://issues.apache.org/jira/browse/STATISTICS-71?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17774946#comment-17774946
 ]


Alex Herbert commented on STATISTICS-71:
----------------------------------------

Updated the tests to use a common base test in commit:

c8d6e07107bc1643a059a8a45b7c7bfc785bd954

This reuses the DoubleTolerance interface from the test suite in the 
distribution package, which allows specifying a tolerance for equality using 
ULP, relative error, or absolute error, or a combination of them.
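
As a rough illustration of the idea, a tolerance of this kind could be sketched 
as below. This is not the DoubleTolerance interface from the test suite: the 
name ToleranceSketch and all of its methods are hypothetical, and the ULP check 
makes the simplifying assumption that values of opposite sign never match.

```java
/** Hypothetical tolerance predicate; the name and methods are illustrative only. */
@FunctionalInterface
interface ToleranceSketch {
    /** Returns true if {@code actual} is acceptably close to {@code expected}. */
    boolean test(double expected, double actual);

    /** Combined tolerance that passes if either tolerance passes. */
    default ToleranceSketch or(ToleranceSketch other) {
        return (e, a) -> test(e, a) || other.test(e, a);
    }

    /** Combined tolerance that passes only if both tolerances pass. */
    default ToleranceSketch and(ToleranceSketch other) {
        return (e, a) -> test(e, a) && other.test(e, a);
    }

    /** Absolute error: |e - a| <= eps. */
    static ToleranceSketch absolute(double eps) {
        return (e, a) -> Math.abs(e - a) <= eps;
    }

    /** Relative error: |e - a| <= eps * max(|e|, |a|). */
    static ToleranceSketch relative(double eps) {
        return (e, a) -> Math.abs(e - a) <= eps * Math.max(Math.abs(e), Math.abs(a));
    }

    /** ULP distance between two finite values of the same sign. */
    static ToleranceSketch ulps(long maxUlps) {
        return (e, a) -> {
            if (e == a) {
                return true; // also covers +0.0 == -0.0 and equal infinities
            }
            if (Double.isNaN(e) || Double.isNaN(a)
                || Double.isInfinite(e) || Double.isInfinite(a)) {
                return false;
            }
            long be = Double.doubleToLongBits(e);
            long ba = Double.doubleToLongBits(a);
            if ((be ^ ba) < 0) {
                return false; // sign bits differ (simplifying assumption)
            }
            // For same-sign finite doubles the bit patterns are ordered,
            // so their difference counts the representable values between them.
            return Math.abs(be - ba) <= maxUlps;
        };
    }
}

final class ToleranceDemo {
    public static void main(String[] args) {
        // Accept a result within 2 ULP, or within 1e-12 absolute error near zero.
        ToleranceSketch tol = ToleranceSketch.ulps(2).or(ToleranceSketch.absolute(1e-12));
        System.out.println(tol.test(1.0, Math.nextUp(1.0))); // 1 ULP apart
        System.out.println(tol.test(0.0, 1e-13));            // passes the absolute check only
        System.out.println(tol.test(1.0, 1.001));            // fails both checks
    }
}
```

Combining an ULP or relative check with an absolute check is a common way to 
handle expected values at or near zero, where a relative tolerance alone is 
too strict.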

There are 4 main tests given an array of values:
 # Compute the statistic from single values using an updating algorithm (e.g. 
via a DoubleStream).
 # Compute the statistic from an array of values (allows multi-pass algorithms).
 # Divide the values into groups; compute a statistic instance for each group 
using an updating algorithm; then combine the instances (e.g. via a parallel 
DoubleStream).
 # Divide the values into groups; compute a statistic instance for each group 
from an array of values; then combine the instances (e.g. via a parallel 
Stream<double[]>).
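
The four modes above can be sketched as follows. This is not the base test 
from the commit: MeanSketch is a hypothetical stand-in statistic that only 
loosely follows the add/combine shape quoted in the issue description, and 
FourWaysSketch and its compute method are likewise assumptions made for 
illustration.

```java
import java.util.Arrays;
import java.util.stream.DoubleStream;
import java.util.stream.Stream;

/** Hypothetical storeless mean; stands in for any statistic under test. */
final class MeanSketch {
    private long n;
    private double sum;

    /** Creates an instance from an array of values. */
    static MeanSketch of(double... values) {
        MeanSketch m = new MeanSketch();
        for (double v : values) {
            m.add(v);
        }
        return m;
    }

    MeanSketch add(double v) {
        n++;
        sum += v;
        return this;
    }

    MeanSketch combine(MeanSketch other) {
        n += other.n;
        sum += other.sum;
        return this;
    }

    double getAsDouble() {
        return n == 0 ? Double.NaN : sum / n;
    }
}

final class FourWaysSketch {
    /** Returns the statistic computed in each of the four modes. */
    static double[] compute(double[] values, int split) {
        double[][] groups = {
            Arrays.copyOfRange(values, 0, split),
            Arrays.copyOfRange(values, split, values.length)
        };
        // 1. Single values with an updating algorithm.
        double single = DoubleStream.of(values)
            .collect(MeanSketch::new, MeanSketch::add, MeanSketch::combine)
            .getAsDouble();
        // 2. The whole array at once.
        double array = MeanSketch.of(values).getAsDouble();
        // 3. Groups of single values, then combine (parallel DoubleStream).
        double parallelSingle = DoubleStream.of(values).parallel()
            .collect(MeanSketch::new, MeanSketch::add, MeanSketch::combine)
            .getAsDouble();
        // 4. One instance per array group, then combine (parallel Stream<double[]>).
        double parallelArray = Stream.of(groups).parallel()
            .collect(MeanSketch::new, (m, g) -> m.combine(MeanSketch.of(g)), MeanSketch::combine)
            .getAsDouble();
        return new double[] {single, array, parallelSingle, parallelArray};
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(compute(new double[] {1, 2, 3, 4, 5, 6}, 2)));
    }
}
```

With small integer inputs every partial sum is exact, so all four modes agree 
exactly. With general data they can differ in the low-order bits, which is 
where a tolerance is needed.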

Added reference data from other implementations in:

c7fc6d6a991a4b330b59f367f31d331f5d53c15b

Note that the tests randomize the order of the input data values, so the 
tolerances must be more lenient than a fixed input order would require. The 
current tolerances have been stable over 1 week of testing, but may require 
updating if sporadic failures are observed.
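
To illustrate why input order matters, the sketch below sums the same values 
in two different orders. It is not taken from the test suite: the class and 
method names are hypothetical, and the values are deliberately extreme so that 
the rounding difference is obvious. With typical data the results would differ 
only in the last few bits.

```java
import java.util.Random;

final class SummationOrderSketch {
    /** Naive left-to-right summation; the result depends on the input order. */
    static double sum(double[] values) {
        double s = 0;
        for (double v : values) {
            s += v;
        }
        return s;
    }

    /** Returns a shuffled copy (Fisher-Yates) using a seeded generator. */
    static double[] shuffledCopy(double[] values, long seed) {
        double[] copy = values.clone();
        Random rng = new Random(seed);
        for (int i = copy.length - 1; i > 0; i--) {
            int j = rng.nextInt(i + 1);
            double t = copy[i];
            copy[i] = copy[j];
            copy[j] = t;
        }
        return copy;
    }

    public static void main(String[] args) {
        // 1e17 + 1.0 rounds back to 1e17, so the 1.0 is lost in the first order.
        System.out.println(sum(new double[] {1e17, 1.0, -1e17})); // 0.0
        System.out.println(sum(new double[] {1e17, -1e17, 1.0})); // 1.0
        // A seeded shuffle gives a reproducible random order for a test run.
        System.out.println(sum(shuffledCopy(new double[] {1e17, 1.0, -1e17}, 42L)));
    }
}
```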

> Implementation of Univariate Statistics
> ---------------------------------------
>
>                 Key: STATISTICS-71
>                 URL: https://issues.apache.org/jira/browse/STATISTICS-71
>             Project: Commons Statistics
>          Issue Type: Task
>          Components: descriptive
>            Reporter: Anirudh Joshi
>            Assignee: Anirudh Joshi
>            Priority: Minor
>              Labels: gsoc, gsoc2023
>
> Jira ticket to track the implementation of the Univariate statistics required 
> for the updated SummaryStatistics API. 
> The implementation would be "storeless". It should be used for calculating 
> statistics that can be computed in one pass through the data without storing 
> the sample values.
> Currently I have the definition of API as (this might evolve as I continue 
> working)
> {code:java}
> public interface DoubleStorelessUnivariateStatistic extends DoubleSupplier {
>     DoubleStorelessUnivariateStatistic add(double v);
>     long getCount();
>     void combine(DoubleStorelessUnivariateStatistic other);
> } {code}
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
