Re: [Numpy-discussion] numpy.mean still broken for large float32 arrays
True, I suppose there is no harm in accumulating with max precision and storing the result in the original dtype unless otherwise specified, although I wonder if the current nditer supports such behavior.

-----Original Message-----
From: Alan G Isaac alan.is...@gmail.com
Sent: 24-7-2014 18:09
To: Discussion of Numerical Python numpy-discussion@scipy.org
Subject: Re: [Numpy-discussion] numpy.mean still broken for large float32 arrays

On 7/24/2014 5:59 AM, Eelco Hoogendoorn wrote to Thomas:
> np.mean isn't broken; your understanding of floating point numbers is.

This comment seems to conflate separate issues: the desirable return type, and the computational algorithm. It is certainly possible to compute a mean of float32 by doing the reduction in float64 and still return a float32. There is nothing implicit in the name `mean` that says we have to just add everything up and divide by the count.

My own view is that `mean` would behave enough better if computed as a running mean to justify the speed loss. Naturally, similar issues arise for `var` and `std`, etc. See http://www.johndcook.com/standard_deviation.html for some discussion and references.

Alan Isaac

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion
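Alan's suggestion (accumulate in float64, return float32) is already reachable today through the `dtype` argument of `mean`. A minimal sketch; note that how badly the plain float32 reduction drifts depends on the NumPy version and its internal summation strategy, so no exact error is asserted for it here:

```python
import numpy as np

# 10 million copies of the float32 nearest to 0.1 (exact value ~0.10000000149).
x = np.full(10**7, 0.1, dtype=np.float32)

# Plain float32 reduction: accuracy depends on NumPy's internal strategy
# (e.g. pairwise summation in newer versions), so the error here may vary.
m32 = x.mean()

# Accumulate in double precision as Alan proposes...
m64 = x.mean(dtype=np.float64)

# ...and cast the result back down to the original dtype if desired.
m64_as_f32 = np.float32(m64)

# The float64 accumulation recovers the true mean of the stored values
# to well within float32 precision.
assert abs(float(m64) - float(np.float32(0.1))) < 1e-6
```
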
Re: [Numpy-discussion] numpy.mean still broken for large float32 arrays
Inaccurate and utterly wrong are subjective. If you want to be sufficiently strict, floating point calculations are almost always 'utterly wrong'. Granted, it would be nice if the docs specified the algorithm used, but numpy does not produce anything different from what a standard C loop or a C++ standard library function would. This isn't a bug report, but rather a feature request. That said, support for fancy reduction algorithms would certainly be nice, if implementing it in numpy in a coherent manner is feasible.

-----Original Message-----
From: Joseph Martinot-Lagarde joseph.martinot-laga...@m4x.org
Sent: 24-7-2014 20:04
To: numpy-discussion@scipy.org
Subject: Re: [Numpy-discussion] numpy.mean still broken for large float32 arrays

On 24/07/2014 12:55, Thomas Unterthiner wrote:
> I don't agree. The problem is that I expect `mean` to do something reasonable. The documentation mentions that the results can be inaccurate, which is a huge understatement: the results can be utterly wrong. That is not reasonable. At the very least, a warning should be issued in cases where the dtype might not be appropriate.

Maybe the problem is the documentation, then. If this is a common error, it could be explicitly documented in the function documentation.
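For reference, the running-mean approach pointed to earlier in the thread (the johndcook.com link) is Welford's single-pass algorithm, one concrete instance of the "fancy reduction algorithms" mentioned above. A minimal pure-Python sketch; the function name is illustrative, not an existing numpy API:

```python
# Welford's single-pass running mean/variance: the accumulator is updated
# incrementally, so no large intermediate sum is ever formed.
def welford_mean_var(values):
    mean = 0.0   # running mean, carried in double precision
    m2 = 0.0     # running sum of squared deviations from the current mean
    n = 0
    for v in values:
        n += 1
        delta = float(v) - mean
        mean += delta / n
        m2 += delta * (float(v) - mean)
    return mean, (m2 / n if n else 0.0)  # mean and population variance

mean, var = welford_mean_var([1.0, 2.0, 3.0, 4.0])
# mean == 2.5, var == 1.25 (all steps are exact in binary floating point here)
```

The per-element update makes each intermediate `mean` the true mean of the elements seen so far, which is what keeps the accumulator well-scaled even for float32 inputs.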
Re: [Numpy-discussion] numpy.mean still broken for large float32 arrays
On 7/24/2014 4:42 PM, Eelco Hoogendoorn wrote:
> This isn't a bug report, but rather a feature request.

I'm not sure this statement is correct. The mean of a float32 array can certainly be computed as a float32. Currently this is not necessarily what happens, not even approximately. That feels a lot like a bug, even if we can readily understand how the algorithm currently used produces it.

To say whether it is a bug or not, don't we have to ask about the intent of `mean`? If the intent is to sum and divide, then it is not a bug. If the intent is to produce the mean, then it is a bug.

Alan Isaac