[ https://issues.apache.org/jira/browse/MATH-1146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14100581#comment-14100581 ]
david cogen commented on MATH-1146:
-----------------------------------
Phil Steitz says "The updating formulas and setup are designed for fast,
accurate computations and I want to make sure we do not add a performance hit
for users with standard (non NaN, non-Inf) data so we can return meaningful
results in the presence of INFs."
I fail to see how the current implementation of Mean could be faster than
computing the mean from its definition: maintaining a sum and dividing by the
number of items.
The current implementation maintains a running average rather than computing
the average from the sum when requested, so the number of divide operations is
proportional to N instead of being exactly 1, as it is when computing from the
definition.
(This assumes the normal use case of processing a batch of values and then
getting the mean at the end. The current implementation is only an
optimization if somebody wants to know the mean often, for example after
processing each input.)
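To make the divide-count argument concrete, here is a minimal sketch of both approaches with a counter on each division. The class names (SumBasedMean, RunningMean, DivisionCountDemo) and the divisions field are hypothetical and exist only for illustration. This is not the Commons Math code, only the two formulas discussed above.

```java
// Hypothetical illustration classes; not the Commons Math implementation.

/** Mean computed from its definition: keep a sum, divide once on request. */
final class SumBasedMean {
    private double sum = 0.0;
    private long n = 0;
    long divisions = 0; // counts divide operations, for illustration only

    void increment(double x) {
        sum += x;
        n++;
    }

    double getResult() {
        if (n == 0) {
            return Double.NaN;
        }
        divisions++;
        return sum / n;
    }
}

/** Running mean: m = m + (x - m) / n, one division per added value. */
final class RunningMean {
    private double m = 0.0;
    private long n = 0;
    long divisions = 0; // counts divide operations, for illustration only

    void increment(double x) {
        n++;
        divisions++;
        m += (x - m) / n;
    }

    double getResult() {
        return n == 0 ? Double.NaN : m;
    }
}

final class DivisionCountDemo {
    public static void main(String[] args) {
        SumBasedMean bySum = new SumBasedMean();
        RunningMean running = new RunningMean();
        for (int i = 1; i <= 1000; i++) {
            bySum.increment(i);
            running.increment(i);
        }
        System.out.println(bySum.getResult() + " using " + bySum.divisions + " division(s)");
        System.out.println(running.getResult() + " using " + running.divisions + " division(s)");
    }
}
```

With a single getResult() call at the end, the sum-based version divides once and the running version divides once per value. The sketch says nothing about numerical accuracy, which is a separate consideration from the operation count.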
> class Mean returns incorrect result after processing an Infinity value
> ----------------------------------------------------------------------
>
> Key: MATH-1146
> URL: https://issues.apache.org/jira/browse/MATH-1146
> Project: Commons Math
> Issue Type: Bug
> Affects Versions: 3.3
> Reporter: david cogen
> Attachments: MATH-1146.patch
>
>
> 1. Create a Mean object.
> 2. Call increment() with Double.POSITIVE_INFINITY.
> 3. Call getResult(). Result is INFINITY as expected.
> 4. Call increment() with 0.
> 5. Call getResult(). Result is NaN; not INFINITY as expected.
> This is apparently due to the "optimization" for calculating mean described
> in the javadoc. Rather than accumulating a sum, it maintains a running mean
> value using the formula "m = m + (new value - m) / (number of observations)",
> which, unlike the "definition way", fails after an infinity.
> I was using Mean within a SummaryStatistics. Other statistics also seem to be
> affected; for example, the standard deviation also incorrectly gives NaN
> rather than Infinity. I don't know if that's due to the error in Mean or if
> the other stats classes have similar bugs.
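The behaviour in the quoted report can be reproduced without Commons Math. The sketch below applies the update formula quoted from the javadoc, "m = m + (new value - m) / (number of observations)", next to the sum-based definition. The class and method names (InfinityMeanDemo, runningMean, sumMean) are hypothetical, and the code is a standalone illustration of the arithmetic, not the library's source.

```java
// Hypothetical standalone illustration; not the Commons Math source.
final class InfinityMeanDemo {

    /** Running-mean update as quoted in the report: m = m + (x - m) / n. */
    static double runningMean(double... values) {
        double m = 0.0;
        long n = 0;
        for (double x : values) {
            n++;
            m = m + (x - m) / n;
        }
        return n == 0 ? Double.NaN : m;
    }

    /** Mean from its definition: sum of the values divided by their count. */
    static double sumMean(double... values) {
        double sum = 0.0;
        for (double x : values) {
            sum += x;
        }
        return values.length == 0 ? Double.NaN : sum / values.length;
    }

    public static void main(String[] args) {
        double inf = Double.POSITIVE_INFINITY;
        System.out.println(runningMean(inf));      // prints Infinity
        System.out.println(runningMean(inf, 0.0)); // prints NaN
        System.out.println(sumMean(inf, 0.0));     // prints Infinity
    }
}
```

The NaN arises in the second update: (0 - Infinity) / 2 is -Infinity, and Infinity + (-Infinity) is NaN under IEEE 754 arithmetic. The sum-based version never subtracts the current mean, so the infinity survives.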