On Tue, Apr 1, 2014 at 9:02 PM, Sturla Molden <[email protected]> wrote:
> Haslwanter Thomas <[email protected]> wrote:
>
>> Personally I cannot think of many applications where it would be desired
>> to calculate the standard deviation with ddof=0. In addition, I feel that
>> there should be consistency between standard modules such as numpy, scipy,
>> and pandas.
>
> ddof=0 is the maximum likelihood estimate. It is also needed in Bayesian
> estimation.

It's true, but the counter-arguments are also strong. And regardless
of whether ddof=1 or ddof=0 is better, surely the same one is better
for both numpy and scipy.

> If you are not estimating from a sample, but rather calculating for the
> whole population, you always want ddof=0.
>
> What does Matlab do by default? (Yes, it is a rhetorical question.)

R (which is probably a more relevant comparison) does do ddof=1 by default.

>> I am wondering if there is a good reason to stick to "ddof=0" as the
>> default for "std", or if others would agree with my suggestion to change
>> the default to "ddof=1"?
>
> It is a bad idea to suddenly break everyone's code.

It would be a disruptive transition, but OTOH having inconsistencies
like this guarantees the ongoing creation of new broken code.

-n

--
Nathaniel J. Smith
Postdoctoral researcher - Informatics - University of Edinburgh
http://vorpus.org
_______________________________________________
NumPy-Discussion mailing list
[email protected]
http://mail.scipy.org/mailman/listinfo/numpy-discussion
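
For anyone following the thread who wants to see what the two conventions actually compute, here is a minimal sketch. The sample values and the variable names (x, ss, std_pop, std_sample) are made up for illustration; the only library behaviour it relies on is that np.std defaults to ddof=0 and accepts a ddof keyword.

```python
import numpy as np

# Illustrative data only; not taken from the thread.
x = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
n = x.size

# Sum of squared deviations from the sample mean.
ss = np.sum((x - x.mean()) ** 2)

# numpy's default, ddof=0: divide by n (population / maximum likelihood form).
std_pop = np.std(x)

# ddof=1: divide by n - 1 (the usual sample estimate, R's default for sd()).
std_sample = np.std(x, ddof=1)

print(std_pop, np.sqrt(ss / n))
print(std_sample, np.sqrt(ss / (n - 1)))
```

The two results differ by a factor of sqrt(n / (n - 1)), so the gap is noticeable for small samples and shrinks as n grows. Passing ddof explicitly in calling code avoids depending on either default.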
