<josef.p...@gmail.com> wrote:
> pandas came later and thought ddof=1 is worth more than consistency.

Pandas is a data analysis package. NumPy is a numerical array package.

I think ddof=1 is justified for Pandas, for consistency with statistical software (SPSS et al.).

For NumPy, there are many computational tasks where the Bessel correction is not wanted, so providing an uncorrected result is the correct thing to do. NumPy should be a low-level array library that does very little magic.

Those who need the Bessel correction can multiply the standard deviation by sqrt(n/float(n-1)) or specify ddof. But that belongs in the docs.

Sturla

P.S. Personally I am not convinced "unbiased" is ever a valid argument, as the biased estimator has smaller error. This is from experience in marksmanship: I'd rather shoot a tight series with a small systematic error than scatter my bullets wildly but "unbiased" on the target. It is the total error that counts. The series with the smallest total error gets the best score. It is better to shoot two series and calibrate the sight in between than to use a calibration-free sight that doesn't allow us to aim.

That's why I think classical statistics got this one wrong. Unbiased is never a virtue, but the smallest error is. Thus, if we are to repeat an experiment, we should calibrate our estimator just like a marksman calibrates his sight. But the aim should always be calibrated to give the smallest error, not an unbiased scatter. No one in their right mind would claim a shotgun is more precise than a rifle because it has smaller bias. But that is what applying the Bessel correction implies.
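
To make the ddof point concrete, here is a minimal sketch. The first part checks that rescaling NumPy's default (uncorrected) standard deviation by sqrt(n/float(n-1)) gives the same number as passing ddof=1. The second part is a small simulation of the "total error" argument for the variance estimator. The sample data, the helper name variance_mse, the sample size, the trial count and the seed are all made up for illustration, and the simulation assumes normally distributed data with true variance 1, so it should not be read as a general result.

```python
import numpy as np

# Illustrative data only; any 1-D float array would do.
x = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
n = x.size

s_uncorrected = np.std(x)          # ddof=0 is NumPy's default
s_corrected = np.std(x, ddof=1)    # Bessel-corrected
s_rescaled = s_uncorrected * np.sqrt(n / float(n - 1))

# The manual rescaling and ddof=1 agree up to rounding.
assert np.isclose(s_corrected, s_rescaled)


def variance_mse(ddof, n=10, trials=200_000, seed=0):
    """Hypothetical helper: simulated mean squared error of the
    variance estimate for standard normal samples (true variance 1)."""
    rng = np.random.default_rng(seed)
    samples = rng.standard_normal((trials, n))
    estimates = samples.var(axis=1, ddof=ddof)
    return np.mean((estimates - 1.0) ** 2)


# Same seed, so both estimators see exactly the same samples.
mse_biased = variance_mse(ddof=0)
mse_unbiased = variance_mse(ddof=1)
print("MSE with ddof=0:", mse_biased)
print("MSE with ddof=1:", mse_unbiased)
```

Under these assumptions the ddof=0 estimate should come out with the smaller mean squared error, even though it is biased. For other distributions or sample sizes the gap will differ.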