On Wed, Apr 28, 2010 at 11:56 AM, T J <[email protected]> wrote:

> On Mon, Apr 26, 2010 at 10:03 AM, Charles R Harris
> <[email protected]> wrote:
> >
> > On Mon, Apr 26, 2010 at 10:55 AM, Charles R Harris
> > <[email protected]> wrote:
> >>
> >> Hi All,
> >>
> >> We need to make a decision for ticket #1123 regarding what nansum should
> >> return when all values are nan. At some earlier point it was zero, but
> >> currently it is nan, in fact it is nan whatever the operation is. That is
> >> consistent, simple and serves to mark the array or axis as containing all
> >> nans. I would like to close the ticket and am a bit inclined to go with the
> >> current behaviour although there is an argument to be made for returning 0
> >> for the nansum case. Thoughts?
> >>
> >
> > To add a bit of context, one could argue that the results should be
> > consistent with the equivalent operations on empty arrays and always be
> > non-nan.
> >
> > In [1]: nansum([])
> > Out[1]: nan
> >
> > In [2]: sum([])
> > Out[2]: 0.0
> >
>
> This seems like an obvious one to me. What is the spirit of nansum?
>
> """
>    Return the sum of array elements over a given axis treating
>    Not a Numbers (NaNs) as zero.
> """
>
> Okay. So NaNs in an array are treated as zeros and the sum is
> performed as one normally would perform it starting with an initial
> sum of zero. So if all values are NaN, then we add nothing to our
> original sum and still return 0.
>
> I'm not sure I understand the argument that it should return NaN. It
> is counter to the *purpose* of nansum. Also, if one wants to
> determine if all values in an array are NaN, isn't there another way?
> Let's keep (or make) those distinct operations, as they are definitely
> distinct concepts.
>
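
For concreteness, the two candidate behaviours for the all-nan case can be sketched in plain Python. This is only an illustration under stated assumptions: the names nansum_zero and nansum_flag are hypothetical helpers, not NumPy functions, and they work on flat Python sequences rather than arrays with an axis argument.

```python
import math

nan = float("nan")


def nansum_zero(values):
    """Hypothetical helper: treat NaNs as zero, starting from a sum of 0.0."""
    total = 0.0
    for v in values:
        if not math.isnan(v):
            total += v
    return total


def nansum_flag(values):
    """Hypothetical helper: skip NaNs, but return NaN when no values remain."""
    kept = [v for v in values if not math.isnan(v)]
    if not kept:
        # Nothing left to add: flag the input as empty or all-NaN.
        return nan
    return sum(kept, 0.0)


print(nansum_zero([nan, nan]))   # zero-returning behaviour
print(nansum_flag([nan, nan]))   # missing-data-flag behaviour
print(nansum_zero([1.0, nan]), nansum_flag([1.0, nan]))
```

The two helpers agree whenever at least one non-nan value is present; they differ only for empty or all-nan input, which is exactly the case the ticket is about.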

It looks like the consensus is that zero should be returned. This is a change
from current behaviour and that bothers me a bit. Here are some other oddities:

In [6]: nanmax([nan])
Out[6]: nan

In [7]: nanargmax([nan])
Out[7]: nan

In [8]: nanargmax([1])
Out[8]: 0

So it looks like the current behaviour is very much tilted towards nans as
missing data flags. I think we should just leave that as is, with perhaps a
note in the docs to that effect. The decision here should probably accommodate
the current users of these functions, of which I am not one. If we leave the
current behaviour as is, then I think the rest of the nan functions need fixes
to return nan for empty sequences, as nansum is the only one that currently
does that.

Chuck
_______________________________________________
NumPy-Discussion mailing list
[email protected]
http://mail.scipy.org/mailman/listinfo/numpy-discussion
