On Fri, Apr 4, 2014 at 8:50 AM, Daπid <davidmen...@gmail.com> wrote:
>
> On 2 April 2014 16:06, Sturla Molden <sturla.mol...@gmail.com> wrote:
>>
>> <josef.p...@gmail.com> wrote:
>>
>> > pandas came later and thought ddof=1 is worth more than consistency.
>>
>> Pandas is a data analysis package. NumPy is a numerical array package.
>>
>> I think ddof=1 is justified for Pandas, for consistency with statistical
>> software (SPSS et al.)
>>
>> For NumPy, there are many computational tasks where the Bessel correction
>> is not wanted, so providing a uncorrected result is the correct thing to
>> do. NumPy should be a low-level array library that does very little magic.
>
>
> All this discussion reminds me of the book "Numerical Recipes":
>
> "if the difference between N and N − 1 ever matters to you, then you
> are probably up to no good anyway — e.g., trying to substantiate a
> questionable hypothesis with marginal data."
>
> For any reasonably sized data set, it is a correction in the second
> significant figure.
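
To put a number on that remark, here is a minimal sketch using the `ddof` argument of `np.var`. The sample values are made up for illustration; the point is only that the two estimates differ by the factor N / (N - 1), so the relative gap shrinks as 1 / (N - 1).

```python
import numpy as np

# Hypothetical sample values, chosen only for illustration.
x = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
n = x.size

var_pop = np.var(x)             # NumPy default, ddof=0: divide by N
var_sample = np.var(x, ddof=1)  # Bessel-corrected: divide by N - 1

# The two estimates differ exactly by the factor N / (N - 1).
ratio = var_sample / var_pop
print(var_pop, var_sample, ratio)
```

With N = 8 the corrected variance is about 14% larger; at N = 100 the gap would be about 1%.
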

I fully agree, but sometimes you don't have much choice.

`big data` == `statistics with negative degrees of freedom` ?

or maybe

`machine learning` == `statistics with negative degrees of freedom` ?

Josef

>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion