[jira] [Commented] (STATISTICS-71) Implementation of Univariate Statistics

Anirudh Joshi (Jira) Sun, 02 Jul 2023 09:18:05 -0700


    [ 
https://issues.apache.org/jira/browse/STATISTICS-71?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17739388#comment-17739388
 ]


Anirudh Joshi commented on STATISTICS-71:
-----------------------------------------

{quote}[...] What about {{Count}} being a {{DoubleStorelessStatistics}}, on 
which other(s) could depend?
{quote}
I was thinking the same. We could implement count as a standalone statistic and 
use composition, so that redundant computations are avoided when several 
statistics are computed together.
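
A minimal sketch of what such composition could look like. The names 
{{Count}}, {{SharedCountMean}} and {{Summary}}, and the way the dependent 
statistic reads the shared count, are assumptions for illustration only and 
not a settled design:

```java
import java.util.function.DoubleSupplier;

/** Hypothetical standalone count; the value passed to add() is ignored. */
final class Count implements DoubleSupplier {
    private long n;

    Count add(double v) {
        n++;
        return this;
    }

    long getCount() {
        return n;
    }

    @Override
    public double getAsDouble() {
        return n;
    }
}

/** Hypothetical mean that reads a shared Count instead of keeping its own. */
final class SharedCountMean implements DoubleSupplier {
    private final Count count;
    private double mean;

    SharedCountMean(Count count) {
        this.count = count;
    }

    /** Call only after the shared count has been incremented for v. */
    void update(double v) {
        mean += (v - mean) / count.getCount();
    }

    @Override
    public double getAsDouble() {
        return count.getCount() == 0 ? Double.NaN : mean;
    }
}

/** Hypothetical composite: one Count feeds every dependent statistic. */
final class Summary {
    private final Count count = new Count();
    private final SharedCountMean mean = new SharedCountMean(count);

    Summary add(double v) {
        count.add(v);    // counted exactly once per value
        mean.update(v);  // reuses the shared count
        return this;
    }

    long getCount() {
        return count.getCount();
    }

    double getMean() {
        return mean.getAsDouble();
    }
}
```

The cost of sharing is an ordering constraint: the count has to be updated 
before any statistic that depends on it, which is why the composite owns the 
call order here.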


DoubleStorelessUnivariateStatistic add(double d);
{quote}[...] What's the intended usage?
{quote}
The reason I included this signature is to possibly support chaining of the 
add calls, for example:
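{code:java}
Mean m = new Mean();
double mean = m.add(1).add(2).add(3).getAsDouble();

// Stream variant (java.util.stream.DoubleStream): feed every value into one
// accumulator, then read the result.
Mean m2 = new Mean();
DoubleStream.of(1.0, 2.0, 3.0).forEach(m2::add);
double mean2 = m2.getAsDouble();{code}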
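
For context, here is a rough sketch of how a {{Mean}} could support this 
chaining. The interface is the one from the issue description; the body of 
{{Mean}} (a running-mean update and a weighted merge in {{combine}}) is only an 
assumption about one possible implementation, and {{combine}} assumes the other 
instance is also a mean:

```java
import java.util.function.DoubleSupplier;

interface DoubleStorelessUnivariateStatistic extends DoubleSupplier {
    DoubleStorelessUnivariateStatistic add(double v);
    long getCount();
    void combine(DoubleStorelessUnivariateStatistic other);
}

/** Hypothetical storeless mean; not the actual implementation. */
final class Mean implements DoubleStorelessUnivariateStatistic {
    private long n;
    private double mean;

    /** Returns this (covariant return type) so that calls can be chained. */
    @Override
    public Mean add(double v) {
        n++;
        mean += (v - mean) / n;
        return this;
    }

    @Override
    public long getCount() {
        return n;
    }

    /** Assumes other is also a mean, so other.getAsDouble() is its mean. */
    @Override
    public void combine(DoubleStorelessUnivariateStatistic other) {
        long m = other.getCount();
        if (m == 0) {
            return; // nothing to merge
        }
        long total = n + m;
        mean += (other.getAsDouble() - mean) * m / total;
        n = total;
    }

    @Override
    public double getAsDouble() {
        return n == 0 ? Double.NaN : mean;
    }
}
```

With this shape, {{DoubleStream.of(1, 2, 3).collect(Mean::new, Mean::add, 
Mean::combine)}} should also work, because a method reference to a 
value-returning {{add}} can still be used where a consumer is expected.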
{quote}
[...] E.g. is "Storeless" a required part of the name? Or is it an 
"implementation detail"?
{quote}
I changed the interface name to DoubleStorelessUnivariateStatistic since I feel 
we might actually need 3 interfaces: this one for double data, plus 
IntStorelessSummaryStatistics for integer data and LongSummaryStatistics for 
long data, similar to the JDK SummaryStatistics classes. I feel it's better to 
have Storeless as part of the interface name to make it clear that it is a 
storeless implementation, so that users are aware that they cannot do certain 
things, such as computing rolling statistics. But I do not have a strong 
opinion on this and am curious to hear arguments against the naming (e.g. if 
the name is too verbose).


{quote}
[...] be more restrictive in order to forbid meaningless combinations?
{quote}

I am not sure I understand the requirement correctly. What kinds of 
combinations do we want to restrict here?
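
One possible reading (this is only my guess at what is meant): with the current 
signature, {{combine}} accepts any other statistic, so a mean could be combined 
with, say, a maximum. A self-typed generic parameter would let the compiler 
reject that. The names {{SelfTypedStatistic}} and {{Max}} below are 
hypothetical and only illustrate the idea:

```java
import java.util.function.DoubleSupplier;

/** Hypothetical variant: T is the implementing statistic itself. */
interface SelfTypedStatistic<T extends SelfTypedStatistic<T>> extends DoubleSupplier {
    T add(double v);
    long getCount();
    void combine(T other); // only an instance of the same statistic is accepted
}

/** Hypothetical maximum, used only to show the type restriction. */
final class Max implements SelfTypedStatistic<Max> {
    private long n;
    private double max = Double.NEGATIVE_INFINITY;

    @Override
    public Max add(double v) {
        n++;
        max = Math.max(max, v);
        return this;
    }

    @Override
    public long getCount() {
        return n;
    }

    @Override
    public void combine(Max other) {
        n += other.n;
        max = Math.max(max, other.max);
    }

    @Override
    public double getAsDouble() {
        return n == 0 ? Double.NaN : max;
    }
}
```

With this variant, passing a different statistic type to {{Max.combine}} would 
be a compile-time error rather than a silently meaningless result, at the price 
of a more verbose interface declaration.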

> Implementation of Univariate Statistics
> ---------------------------------------
>
>                 Key: STATISTICS-71
>                 URL: https://issues.apache.org/jira/browse/STATISTICS-71
>             Project: Commons Statistics
>          Issue Type: Task
>          Components: descriptive
>            Reporter: Anirudh Joshi
>            Priority: Minor
>              Labels: gsoc, gsoc2023
>
> Jira ticket to track the implementation of the Univariate statistics required 
> for the updated SummaryStatistics API. 
> The implementation would be "storeless". It should be used for calculating 
> statistics that can be computed in one pass through the data without storing 
> the sample values.
> Currently the API is defined as follows (this might evolve as I continue 
> working):
> {code:java}
> public interface DoubleStorelessUnivariateStatistic extends DoubleSupplier {
>     DoubleStorelessUnivariateStatistic add(double v);
>     long getCount();
>     void combine(DoubleStorelessUnivariateStatistic other);
> } {code}
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
