> On 28 May 2019, at 18:09, Eric Barnhill <[email protected]> wrote:
>
> The previous commons-math interface for descriptive statistics used a
> paradigm of constructing classes for various statistical functions and
> calling evaluate(). Example
>
> Mean mean = new Mean();
> double mn = mean.evaluate(double[])
>
> I wrote this type of code all through grad school and always found it
> unnecessarily bulky. To me these summary statistics are classic use cases
> for static methods:
>
> double mean = Mean.evaluate(double[])
>
> I don't have any particular problem with the evaluate() syntax.
>
> I looked over the old Math 4 API to see if there were any benefits to the
> previous class-oriented approach that we might not want to lose. But I
> don't think there were; the functionality outside of evaluate() is minimal.

A quick check shows that evaluate() comes from UnivariateStatistic. This has a
few more methods that add little to an instance view of the computation:

    double evaluate(double[] values) throws MathIllegalArgumentException;
    double evaluate(double[] values, int begin, int length)
        throws MathIllegalArgumentException;
    UnivariateStatistic copy();

However, it is extended by StorelessUnivariateStatistic, which adds methods to
update the statistic:

    void increment(double d);
    void incrementAll(double[] values) throws MathIllegalArgumentException;
    void incrementAll(double[] values, int start, int length)
        throws MathIllegalArgumentException;
    double getResult();
    long getN();
    void clear();
    StorelessUnivariateStatistic copy();

This type of functionality would be lost by static methods.
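
To make the difference concrete, here is a minimal sketch of an updatable
statistic. RunningMean is a hypothetical class written for this example, not
the Commons Math implementation; only the method names increment(),
getResult(), getN() and clear() are borrowed from the interface above.
Returning NaN for an empty instance is an assumption of the sketch.

```java
// Hypothetical sketch only; not the Commons Math Mean class.
final class RunningMean {
    private long n;       // number of values seen so far
    private double mean;  // running mean of those values

    /** Adds one value and updates the mean in place. */
    void increment(double d) {
        n++;
        mean += (d - mean) / n;
    }

    /** Current mean; NaN when no values have been added (an assumption). */
    double getResult() {
        return n == 0 ? Double.NaN : mean;
    }

    long getN() {
        return n;
    }

    void clear() {
        n = 0;
        mean = 0;
    }

    /** Static one-shot form: needs the complete array up front. */
    static double evaluate(double[] values) {
        RunningMean m = new RunningMean();
        for (double v : values) {
            m.increment(v);
        }
        return m.getResult();
    }
}

final class RunningMeanDemo {
    public static void main(String[] args) {
        RunningMean m = new RunningMean();
        m.increment(1.0);
        m.increment(2.0);
        System.out.println(m.getResult()); // 1.5
        // More data arrives later; the instance is updated without
        // revisiting the earlier values.
        m.increment(6.0);
        System.out.println(m.getResult() + " over " + m.getN() + " values");
    }
}
```

A static evaluate(double[]) can be layered on top of such an instance, as
above, but the reverse does not hold: a static-only API has nowhere to keep
the count and running value between calls.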

If you are moving to a functional interface type pattern for each statistic
then you will lose the other functionality possible with an instance state,
namely updating with more values or combining instances.

So this is a question of whether updating a statistic is required after the
first computation.

Will there be an alternative in the library for a map-reduce type operation,
using instances that can be combined via DoubleStream.collect?

    <R> R collect(Supplier<R> supplier,
                  ObjDoubleConsumer<R> accumulator,
                  BiConsumer<R, R> combiner);

Here <R> would be Mean:

    double mean = Arrays.stream(new double[1000])
                        .collect(Mean::new, Mean::add, Mean::add)
                        .getMean();

with:

    void add(double);
    void add(Mean);
    double getMean();

(Untested code)
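
A self-contained sketch of how that could look. The Mean class here is
hypothetical and has only the three methods listed above; keeping a plain sum
and count is an assumption made for brevity, and a library version may well
want a more numerically careful update.

```java
import java.util.Arrays;

// Hypothetical combinable statistic; not an existing Commons class.
final class Mean {
    private long n;      // number of values accumulated
    private double sum;  // plain sum of those values (assumption: no compensation)

    /** Accumulator: adds a single value. */
    void add(double value) {
        n++;
        sum += value;
    }

    /** Combiner: merges the state of another instance into this one. */
    void add(Mean other) {
        n += other.n;
        sum += other.sum;
    }

    /** Mean of everything added so far; NaN when empty (an assumption). */
    double getMean() {
        return n == 0 ? Double.NaN : sum / n;
    }
}

final class MeanCollectDemo {
    /** Sequential reduction over a DoubleStream. */
    static double mean(double[] values) {
        return Arrays.stream(values)
                     .collect(Mean::new, Mean::add, Mean::add)
                     .getMean();
    }

    /** Same reduction in parallel; partial results are merged via add(Mean). */
    static double parallelMean(double[] values) {
        return Arrays.stream(values)
                     .parallel()
                     .collect(Mean::new, Mean::add, Mean::add)
                     .getMean();
    }

    public static void main(String[] args) {
        double[] data = {1, 2, 3, 4};
        System.out.println(mean(data));
        System.out.println(parallelMean(data));
    }
}
```

The two Mean::add references resolve to different overloads: the accumulator
position picks add(double) and the combiner position picks add(Mean). The
combiner is what a static-only API could not provide.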
>
> Finally we should consider whether we really need a separate class for each
> statistic at all. Do we want to call:
>
> Mean.evaluate()
>
> or
>
> SummaryStats.mean()
>
> or maybe
>
> Stats.mean() ?
>
> The last being nice and compact.
>
> Let's make a decision so our esteemed mentee Virendra knows in what
> direction to take his work this summer. :)