Mean.evaluate() should use a two-pass algorithm
-----------------------------------------------

                 Key: MATH-174
                 URL: https://issues.apache.org/jira/browse/MATH-174
             Project: Commons Math
          Issue Type: Bug
    Affects Versions: 1.1, 1.0
            Reporter: Phil Steitz
            Priority: Minor
             Fix For: 1.2


Since it has access to the full array of stored data, Mean.evaluate(double[]) 
can improve its accuracy by executing a two-pass algorithm: first computing an 
initial estimate using the definitional formula, then correcting that estimate 
by the mean of the deviations of the data from it.  The attached patch makes the 
correction and includes an algorithm reference.  It also re-activates and 
increases sensitivity in some of the certified data tests.  The Michelson data 
test fails with the current implementation, showing a difference in the 13th 
significant digit.
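
For illustration, the two-pass correction described above can be sketched as 
follows.  This is a minimal sketch, not the attached patch: the class name 
TwoPassMean and the choice to return NaN for null or empty input are 
assumptions made for this example.

```java
// Hypothetical illustration class; not the Commons Math implementation.
public final class TwoPassMean {

    private TwoPassMean() {
    }

    /**
     * Two-pass mean: compute the definitional mean, then add the mean of
     * the deviations from that estimate to recover lost rounding error.
     * Returning NaN for null or empty input is an assumption of this sketch.
     */
    public static double evaluate(double[] values) {
        if (values == null || values.length == 0) {
            return Double.NaN;
        }
        int n = values.length;

        // Pass 1: initial estimate from the definitional formula sum(x) / n.
        double sum = 0.0;
        for (double v : values) {
            sum += v;
        }
        double xbar = sum / n;

        // Pass 2: accumulate the deviations from the initial estimate.
        // In exact arithmetic this sum is zero; in floating point it
        // captures part of the rounding error made in pass 1.
        double correction = 0.0;
        for (double v : values) {
            correction += v - xbar;
        }
        return xbar + correction / n;
    }
}
```

A call such as TwoPassMean.evaluate(new double[] {1.0, 2.0, 3.0, 4.0}) returns 
2.5.  The second loop is the source of the extra arithmetic: it adds one 
subtraction and one addition per element on top of the first pass.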

This change will improve accuracy in DescriptiveStatistics.getMean() and also in 
Variance.evaluate() and DescriptiveStatistics.getVariance().  The change has a 
cost: it roughly doubles the arithmetic operations required to compute the 
mean.  Since it applies only to the "stored array" implementation, and the 
implementation will be pluggable in 1.2, my preference is to move ahead with 
this change.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
