On Tue, Apr 1, 2014 at 9:02 PM, Sturla Molden <[email protected]> wrote:
> Haslwanter Thomas <[email protected]> wrote:
>
>> Personally I cannot think of many applications where it would be desired
>> to calculate the standard deviation with ddof=0. In addition, I feel that
>> there should be consistency between standard modules such as numpy, scipy, 
>> and pandas.
>
> ddof=0 is the maximum likelihood estimate. It is also needed in Bayesian
> estimation.

It's true, but the counter-arguments are also strong. And regardless
of whether ddof=1 or ddof=0 is better, surely the same one is better
for both numpy and scipy.

> If you are not estimating from a sample, but rather calculating for the
> whole population, you always want ddof=0.
>
> What does Matlab do by default? (Yes, it is a rhetorical question.)

R (which is probably a more relevant comparison) does do ddof=1 by default.
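
For anyone following along, here is a minimal sketch of what the two
conventions compute. It uses only the Python standard library rather than
numpy itself; `std_with_ddof` is a hypothetical helper written for this
illustration, not a numpy or scipy function, and the data values are made up.
The only difference between the two results is the divisor, N - ddof.

```python
import math
import statistics


def std_with_ddof(values, ddof=0):
    """Standard deviation using the divisor N - ddof.

    Hypothetical helper for illustration only; not a numpy/scipy API.
    """
    n = len(values)
    if n - ddof <= 0:
        raise ValueError("need more than ddof data points")
    mean = sum(values) / n
    sum_sq = sum((x - mean) ** 2 for x in values)
    return math.sqrt(sum_sq / (n - ddof))


# Made-up example data (mean is 5, sum of squared deviations is 32).
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

# ddof=0: divide by N (population / maximum likelihood form).
population_std = std_with_ddof(data, ddof=0)

# ddof=1: divide by N - 1 (the usual sample estimate).
sample_std = std_with_ddof(data, ddof=1)

# The standard library exposes the same two conventions under
# separate names, which sidesteps the question of a default.
assert math.isclose(population_std, statistics.pstdev(data))
assert math.isclose(sample_std, statistics.stdev(data))
```

The gap between the two shrinks as N grows, so the choice of default
matters most for small samples.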

>> I am wondering if there is a good reason to stick to "ddof=0" as the
>> default for "std", or if others would agree with my suggestion to change
>> the default to "ddof=1"?
>
> It is a bad idea to suddenly break everyone's code.

It would be a disruptive transition, but OTOH having inconsistencies
like this guarantees the ongoing creation of new broken code.

-n

-- 
Nathaniel J. Smith
Postdoctoral researcher - Informatics - University of Edinburgh
http://vorpus.org
_______________________________________________
NumPy-Discussion mailing list
[email protected]
http://mail.scipy.org/mailman/listinfo/numpy-discussion
