On Thu, Nov 22, 2012 at 7:14 AM, Sebastian Berg <[email protected]> wrote:
> On Wed, 2012-11-21 at 22:58 -0500, [email protected] wrote:
>> On Wed, Nov 21, 2012 at 10:35 PM, Charles R Harris
>> <[email protected]> wrote:
>> >
>> > On Wed, Nov 21, 2012 at 7:45 PM, <[email protected]> wrote:
>> >>
>> >> On Wed, Nov 21, 2012 at 9:22 PM, Olivier Delalleau <[email protected]> wrote:
>> >> > Current behavior looks sensible to me. I personally would prefer no warning
>> >> > but I think it makes sense to have one as it can be helpful to detect issues
>> >> > faster.
>> >>
>> >> I agree that nan should be the correct answer.
>> >> (I gave up trying to define a default for 0/0 in scipy.stats ttests.)
>> >>
>> >> some funnier cases
>> >>
>> >> >>> np.var([1], ddof=1)
>> >> 0.0
>> >
>> > This one is a nan in development.
>> >
>> >>
>> >> >>> np.var([1], ddof=5)
>> >> -0
>> >> >>> np.var([1,2], ddof=5)
>> >> -0.16666666666666666
>> >> >>> np.std([1,2], ddof=5)
>> >> nan
>> >>
>> >
>> > These still do this. Also
>> >
>> > In [10]: var([], ddof=1)
>> > Out[10]: -0
>> >
>> > Which suggests that the nan is pretty much an accidental byproduct of
>> > division by zero. I think it might make sense to have a definite policy for
>> > these corner cases.
>>
>> It would also be consistent with the usual pattern to raise a
>> ValueError on this. ddof too large, size too small.
>> It wouldn't be the case that for some columns or rows we get valid
>> answers in this case, as long as we don't allow for missing values.
>>
>
> It seems to me that nan is the reasonable result for these operations
> (reduce like operations that do not have an identity). Though actually
> reduce operations without an identity throw a ValueError (ie.
> `np.minimum.reduce([])`), but then mean/std/var seem special enough to
> be different from other reduce operations (for example their result is
> always floating point). As for usability I think for example when
> plotting errorbars using std, it would be rather annoying to get a
> ValueError, so if anything the reduce machinery could give more special
> results for empty floating point reductions.
>
> In any case the warning should be clearer and for too large ddof's I
> would say it should return nan+Warning as well.
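
A minimal sketch of the nan+Warning policy suggested above, assuming a
flat 1-D input and no axis argument. `var_nan_policy` is a hypothetical
helper written only for illustration, not NumPy's implementation: it
returns nan and emits a RuntimeWarning whenever the divisor size - ddof
is not positive, instead of dividing by zero or by a negative number.

```python
import warnings

import numpy as np


def var_nan_policy(a, ddof=0):
    """Hypothetical helper: variance with an explicit corner-case policy.

    Returns nan and warns when size - ddof <= 0 (empty input or ddof too
    large); otherwise computes the usual sum of squared deviations
    divided by size - ddof.
    """
    a = np.asarray(a, dtype=float).ravel()
    n = a.size
    if n - ddof <= 0:
        warnings.warn(
            "degrees of freedom <= 0 (size=%d, ddof=%d)" % (n, ddof),
            RuntimeWarning,
            stacklevel=2,
        )
        return float("nan")
    dev = a - a.mean()
    return float(np.dot(dev, dev) / (n - ddof))
```

Under this policy `var_nan_policy([1, 2], ddof=5)` and
`var_nan_policy([], ddof=1)` both give nan with a warning, rather than
the negative values in the quoted sessions, and nothing raises.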

Why don't operations on empty arrays return empty arrays?

but this looks ok

>>> (np.array([]) - np.array([]).mean()) / np.array([]).std()
array([], dtype=float64)
>>> (np.array([]) - np.array([]).mean()) / np.array([]).std(0)
array([], dtype=float64)
>>> (np.array([]) - np.array([]).mean(0)) / np.array([]).std(0)
array([], dtype=float64)
>>> (np.array([]) - np.array([]).mean(0)) / np.array([])
array([], dtype=float64)

>>> np.array([[]]) - np.expand_dims(np.array([[]]).mean(1),1)
array([], shape=(1, 0), dtype=float64)
>>> np.array([[]]) - np.expand_dims(np.array([]),1)
array([], shape=(0, 0), dtype=float64)
>>> np.array([]) - np.expand_dims(np.array([]),0)
array([], shape=(1, 0), dtype=float64)

(But I doubt I will rely in many cases on correct "calculations" with
empty arrays.)

Josef

>
> Sebastian
>
>>
>> quick check with np.ma
>>
>> looks correct except when delegating to numpy ?
>>
>> >>> s = np.ma.var(np.ma.masked_invalid([[1.,2],[1,np.nan]]), ddof=5, axis=0)
>> >>> s
>> masked_array(data = [-- --],
>>              mask = [ True True],
>>        fill_value = 1e+20)
>>
>> >>> s = np.ma.var(np.ma.masked_invalid([[1.,2],[1,np.nan]]), ddof=1, axis=0)
>> >>> s
>> masked_array(data = [0.0 --],
>>              mask = [False True],
>>        fill_value = 1e+20)
>>
>> >>> s = np.ma.std([1,2], ddof=5)
>> >>> s
>> masked
>> >>> type(s)
>> <class 'numpy.ma.core.MaskedConstant'>
>>
>> >>> np.ma.var([1,2], ddof=5)
>> -0.16666666666666666
>>
>>
>> Josef
>>
>> >
>> > <snip>
>> >
>> > Chuck
>> >
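
The empty-array examples in the reply above can be checked by shape
alone, since there are no values to compare. The sketch below repeats
four of those expressions and records only the result shapes; the
variable names and the `shapes` dict are introduced here for
illustration. The warnings filter is there because mean/std of an empty
array warn and return nan.

```python
import warnings

import numpy as np

empty_1d = np.array([])    # shape (0,)
empty_2d = np.array([[]])  # shape (1, 0)

with warnings.catch_warnings():
    # mean() and std() of an empty array emit RuntimeWarnings and
    # return nan; silence them so only the shapes are of interest.
    warnings.simplefilter("ignore")
    standardized = (empty_1d - empty_1d.mean()) / empty_1d.std()
    row_centered = empty_2d - np.expand_dims(empty_2d.mean(1), 1)

shapes = {
    "standardized": standardized.shape,
    "row_centered": row_centered.shape,
    # (1, 0) broadcast against (0, 1)
    "2d_minus_column": (empty_2d - np.expand_dims(empty_1d, 1)).shape,
    # (0,) broadcast against (1, 0)
    "1d_minus_row": (empty_1d - np.expand_dims(empty_1d, 0)).shape,
}
```

The scalar nan from mean()/std() broadcasts against a length-0 array,
so the elementwise result is still empty; only the reduction itself
yields nan.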
_______________________________________________ NumPy-Discussion mailing list [email protected] http://mail.scipy.org/mailman/listinfo/numpy-discussion
