On Mon, 2013-07-15 at 08:47 -0600, Charles R Harris wrote:
> On Mon, Jul 15, 2013 at 8:34 AM, Sebastian Berg
> <[email protected]> wrote:
> > On Mon, 2013-07-15 at 07:52 -0600, Charles R Harris wrote:
> > > On Sun, Jul 14, 2013 at 3:35 PM, Charles R Harris
> > > <[email protected]> wrote:
> > > > <snip>
> > > > > For nansum, I would expect 0 even in the case of all nans.
> > > > > The point of these functions is to simply ignore nans,
> > > > > correct? So I would aim for this behaviour: nanfunc(x)
> > > > > behaves the same as func(x[~isnan(x)])
> > > >
> > > > Agreed, although that changes current behavior. What about
> > > > the other cases?
> > >
> > > Looks like there isn't much interest in the topic, so I'll just
> > > go ahead with the following choices:
> > >
> > > Non-NaN case
> > >
> > > 1) Empty array -> ValueError
> > >
> > > The current behavior with stats is an accident, i.e., the nan
> > > arises from 0/0. I like to think that in this case the result is
> > > any number, rather than not a number, so *the* value is simply
> > > not defined. So in this case raise a ValueError for empty array.
> >
> > To be honest, I don't mind the current behaviour much: sum([]) = 0,
> > len([]) = 0, so it is in a way well defined. At least I am not sure
> > if I would prefer always an error. I am a bit worried that just
> > changing it might break code out there, such as plotting code where
> > it makes perfect sense to plot a NaN (i.e. nothing), but if that is
> > the case it would probably be visible fast.
>
> I'm talking about mean, var, and std as statistics, sum isn't part of
> that. If there is agreement that nansum of empty arrays/columns should
> be zero I will do that. Note the sums of empty arrays may or may not
> be empty.
>
> In [1]: ones((0, 3)).sum(axis=0)
> Out[1]: array([ 0., 0., 0.])
>
> In [2]: ones((3, 0)).sum(axis=0)
> Out[2]: array([], dtype=float64)
>
> Which, sort of, makes sense.
>

I think we can agree that the behaviour for reductions with an identity
should default to returning the identity, including for the nanfuncs,
i.e. sum([]) is 0, product([]) is 1...
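
As a minimal sketch of the identity rule above, the following checks the
empty-input results with plain NumPy calls. The variable names are
illustrative only. The all-NaN case is computed by masking, following the
nanfunc(x) == func(x[~isnan(x)]) rule quoted earlier, rather than by
calling np.nansum, whose result for all-NaN input has differed between
NumPy versions.

```python
import numpy as np

# Reductions that have an identity return it for empty input.
empty = np.array([], dtype=float)
sum_of_empty = np.add.reduce(empty)        # additive identity
prod_of_empty = np.multiply.reduce(empty)  # multiplicative identity

# "Ignore NaNs" rule from the thread: nanfunc(x) == func(x[~isnan(x)]).
all_nan = np.array([np.nan, np.nan])
sum_ignoring_nan = all_nan[~np.isnan(all_nan)].sum()

# Summing over a zero-length axis gives one identity per column;
# summing an array that has no columns gives an empty result.
col_sums = np.ones((0, 3)).sum(axis=0)
no_cols = np.ones((3, 0)).sum(axis=0)

print(sum_of_empty, prod_of_empty, sum_ignoring_nan)
print(col_sums.shape, no_cols.shape)
```

Under this rule an all-NaN input to a sum behaves exactly like an empty
input, which is why 0 is the natural result there.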
Since mean = sum/length is a sensible definition, having 0/0 as a result
doesn't seem too bad to me, to be honest. It might be accidental, but it
is not a special case in the code ;). Though I don't mind an error as
long as it doesn't break matplotlib or so.

I agree that the nanfuncs raising an error would probably be more of a
problem than for a usual ufunc, but I am still a bit hesitant about
saying that it is ok too. I could imagine adding a very general
"identity" argument (though I would not call it identity, because it is
not the same as `np.add.identity`, just used in a place where that would
be used otherwise):

np.add.reduce([], identity=123) -> [123]
np.add.reduce([1], identity=123) -> [1]
np.nanmean([np.nan], identity=None) -> Error
np.nanmean([np.nan], identity=np.nan) -> np.nan

It doesn't really make sense, but:

np.subtract.reduce([]) -> Error, since np.subtract.identity is None
np.subtract.reduce([], identity=0) -> 0, suppressing the error.

I am not sure if I am convinced myself, but especially for the nanfuncs
it could maybe provide a way to circumvent the problem somewhat,
including for functions such as np.nanargmin, whose result type does not
even support NaN. Plus it gives an argument allowing for warnings about
changing behaviour.

- Sebastian

> > > 2) ddof >= n -> ValueError
> > >
> > > If the number of elements, n, is not zero and ddof >= n, raise a
> > > ValueError for the ddof value.
> >
> > Makes sense to me, especially for ddof > n. Just returning nan in
> > all cases for backward compatibility would be fine with me too.
> >
> > > NaN case
> > >
> > > 1) Empty array -> ValueError
> > > 2) Empty slice -> NaN
> > > 3) For slice ddof >= n -> NaN
> >
> > Personally I would somewhat prefer if 1) and 2) would at least
> > default to the same thing. But I don't use the nanfuncs anyway. I
> > was wondering about adding the option for the user to pick what the
> > fill is (and i.e. if it is None (maybe default) -> ValueError).
> > We could also allow this for normal reductions without an identity,
> > but I am not sure if it is useful there.
>
> Chuck

_______________________________________________
NumPy-Discussion mailing list
[email protected]
http://mail.scipy.org/mailman/listinfo/numpy-discussion
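
The user-supplied fill value floated in this message can be sketched
roughly as below. This is only an illustration under stated assumptions:
`reduce_with_fill` and `nanmean_with_fill` are hypothetical helper names,
not NumPy functions, the `fill` parameter stands in for the "identity"
argument described in the message, and only 1-D input is handled.

```python
import numpy as np


def reduce_with_fill(ufunc, values, fill=None):
    """Hypothetical helper: reduce 1-D `values` with `ufunc`.

    For empty input, return `fill` if given, else the ufunc's own
    identity, else raise ValueError.
    """
    arr = np.asarray(values)
    if arr.size > 0:
        # Non-empty input: the fill value is never used.
        return ufunc.reduce(arr)
    if fill is not None:
        return fill
    if ufunc.identity is None:
        raise ValueError("empty input and %s has no identity" % ufunc.__name__)
    return ufunc.identity


def nanmean_with_fill(values, fill=None):
    """Hypothetical helper: mean of the non-NaN entries of `values`.

    If nothing is left after dropping NaNs, return `fill`, or raise
    ValueError when `fill` is None.
    """
    arr = np.asarray(values, dtype=float)
    kept = arr[~np.isnan(arr)]
    if kept.size == 0:
        if fill is None:
            raise ValueError("no non-NaN values and no fill given")
        return fill
    return kept.mean()


print(reduce_with_fill(np.add, [], fill=123))
print(reduce_with_fill(np.subtract, [], fill=0))
print(nanmean_with_fill([np.nan], fill=np.nan))
```

With `fill=None` the caller gets an error wherever no sensible value
exists, which matches the "None (maybe default) -> ValueError" idea
quoted above. Passing `fill=np.nan` opts back into the NaN result.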
