Re: [Numpy-discussion] numpy.mean still broken for large float32arrays

2014-07-24 Thread Eelco Hoogendoorn
True, I suppose there is no harm in accumulating with maximum precision and 
storing the result in the original dtype unless otherwise specified, although 
I wonder whether the current nditer supports such behavior.

-Original Message-
From: Alan G Isaac alan.is...@gmail.com
Sent: 24-7-2014 18:09
To: Discussion of Numerical Python numpy-discussion@scipy.org
Subject: Re: [Numpy-discussion] numpy.mean still broken for large float32arrays

On 7/24/2014 5:59 AM, Eelco Hoogendoorn wrote to Thomas:
> np.mean isn't broken; your understanding of floating point numbers is.


This comment seems to conflate separate issues:
the desirable return type, and the computational algorithm.
It is certainly possible to compute a mean of float32
doing reduction in float64 and still return a float32.
There is nothing implicit in the name `mean` that says
we have to just add everything up and divide by the count.
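
As a minimal sketch of that separation (the helper name `mean_keep_dtype` is hypothetical and not part of NumPy, and it assumes a floating-point input), the reduction can be requested in float64 through the existing `dtype` argument of `np.mean`, with the result cast back to the input dtype afterwards:

```python
import numpy as np

def mean_keep_dtype(a):
    """Mean accumulated in float64, returned in the input's own dtype.

    Hypothetical helper for illustration; assumes a floating-point input.
    """
    a = np.asarray(a)
    # dtype=np.float64 asks NumPy to carry out the reduction in double precision.
    wide = np.mean(a, dtype=np.float64)
    # Cast the float64 result back to the dtype of the input array.
    return wide.astype(a.dtype)

x = np.full(1_000_000, 0.1, dtype=np.float32)
m = mean_keep_dtype(x)
print(m.dtype, m)
```

The return type and the accumulation precision are chosen independently here, which is the distinction drawn above.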

My own view is that `mean` would behave sufficiently better
if computed as a running mean to justify the speed loss.
Naturally similar issues arise for `var` and `std`, etc.
See http://www.johndcook.com/standard_deviation.html
for some discussion and references.
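
For illustration, a one-pass running mean and variance in the style of Welford's method might look roughly like the sketch below. The function name `running_mean_var` is hypothetical, and accumulating in Python floats (double precision) is an assumption of this sketch, not something proposed in the thread:

```python
import numpy as np

def running_mean_var(values):
    """One-pass running mean and sample variance (Welford-style update)."""
    n = 0
    mean = 0.0  # Python floats are double precision
    m2 = 0.0    # running sum of squared deviations from the current mean
    for x in values:
        x = float(x)
        n += 1
        delta = x - mean
        mean += delta / n          # update the mean incrementally
        m2 += delta * (x - mean)   # uses the old and the new mean
    var = m2 / (n - 1) if n > 1 else float("nan")
    return mean, var

data = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0], dtype=np.float32)
mean, var = running_mean_var(data)
print(mean, var, np.var(data, ddof=1, dtype=np.float64))
```

The pure-Python loop makes the speed cost obvious; a practical version would need to be written in compiled code.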

Alan Isaac
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] numpy.mean still broken for large float32arrays

2014-07-24 Thread Eelco Hoogendoorn
"Inaccurate" and "utterly wrong" are subjective. If you want to be sufficiently 
strict, floating point calculations are almost always 'utterly wrong'.

Granted, it would be nice if the docs specified the algorithm used. But numpy 
does not produce anything different from what a standard C loop or a C++ 
standard library function would. This isn't a bug report, but rather a feature 
request. That said, support for fancy reduction algorithms would certainly be 
nice, if implementing it in numpy in a coherent manner is feasible.
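
To make the "standard C loop" comparison concrete, here is a small sketch of left-to-right accumulation in a float32 accumulator. The helper `naive_mean_float32` is hypothetical and only stands in for such a loop; it is not NumPy's actual implementation, and `np.mean` itself may sum in a different order:

```python
import numpy as np

def naive_mean_float32(a):
    """Sum left to right in a float32 accumulator, then divide by the count."""
    total = np.float32(0.0)
    for x in a:
        # Every partial sum is rounded back to float32, as in a plain C float loop.
        total = np.float32(total + x)
    return total / np.float32(len(a))

a = np.ones(1001, dtype=np.float32)
a[0] = 2.0 ** 24  # at this magnitude float32 values are 2 apart, so adding 1 is lost

naive = naive_mean_float32(a)         # the 1000 ones never reach the accumulator
wide = np.mean(a, dtype=np.float64)   # same data, reduced in double precision
print(naive, wide)
```

The two results differ by roughly 1.0, because once the accumulator reaches 2**24 each added 1.0 is rounded away.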

-Original Message-
From: Joseph Martinot-Lagarde joseph.martinot-laga...@m4x.org
Sent: 24-7-2014 20:04
To: numpy-discussion@scipy.org numpy-discussion@scipy.org
Subject: Re: [Numpy-discussion] numpy.mean still broken for large float32arrays

On 24/07/2014 12:55, Thomas Unterthiner wrote:
> I don't agree. The problem is that I expect `mean` to do something
> reasonable. The documentation mentions that the results can be
> inaccurate, which is a huge understatement: the results can be utterly
> wrong. That is not reasonable. At the very least, a warning should be
> issued in cases where the dtype might not be appropriate.

Maybe the problem is the documentation, then. If this is a common error, 
it could be explicitly documented in the function documentation.



Re: [Numpy-discussion] numpy.mean still broken for large float32arrays

2014-07-24 Thread Alan G Isaac
On 7/24/2014 4:42 PM, Eelco Hoogendoorn wrote:
> This isn't a bug report, but rather a feature request.

I'm not sure this statement is correct.  The mean of a float32 array
can certainly be computed as a float32.  Currently this is
not necessarily what happens, not even approximately.
That feels a lot like a bug, even if we can readily understand
how the algorithm currently used produces it.  To say whether
it is a bug or not, don't we have to ask about the intent of `mean`?
If the intent is to sum and divide, then it is not a bug.
If the intent is to produce the mean, then it is a bug.

Alan Isaac