Hi Srinivas,

Nice catch. I agree that it would be better if variance were defined in terms of n-1 rather than n. This seems like an easy fix to get started with SymPy, if you'd like to try. There is a wiki page with tips on the Development Workflow <https://github.com/sympy/sympy/wiki/Development-workflow> if you're not already familiar with git and such.
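
For reference, here is a rough sketch of the two estimators side by side. It is plain Python, not the actual Sample class: the function names are made up for illustration, and fractions.Fraction stands in for SymPy's exact rationals. It assumes the sample holds integers or Fractions.

```python
from fractions import Fraction


def biased_variance(sample):
    """Divide by n: the form the quoted Sample code uses."""
    n = len(sample)
    mean = Fraction(sum(sample), n)
    return sum((x - mean) ** 2 for x in sample) / n


def unbiased_variance(sample):
    """Divide by n - 1 (Bessel's correction): the proposed change."""
    n = len(sample)
    if n < 2:
        # With a single observation the n - 1 denominator is zero.
        raise ValueError("need at least two observations")
    mean = Fraction(sum(sample), n)
    return sum((x - mean) ** 2 for x in sample) / (n - 1)


sample = [1, 2, 3, 4]
print(biased_variance(sample))    # 5/4
print(unbiased_variance(sample))  # 5/3
```

One thing to decide with the n-1 form is what a one-element sample should do; the sketch raises an error, but that is only one option.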

If you're interested in improving the statistics functionality in SymPy let me know. This has been a project of mine.

Best,
-Matt

On Tue, Sep 13, 2011 at 10:06 PM, Srinivas <[email protected]> wrote:
> Hi,
> I wanted to join as a new developer of sympy, so I am looking
> through the code to get familiar with it. For /sympy/sympy/statistics/
> distributions.py, the Sample class defines the variance to be:
> s.variance = sum([(x-mean)**2 for x in s]) / Integer(len(s))
>
> But, this would be the biased estimator. My question is would/should
> this class use the unbiased estimator (replacing Integer(len(s)) with
> Integer(len(s)-1))?
>
>
> Thanks
>
> --
> You received this message because you are subscribed to the Google Groups
> "sympy" group.
> To post to this group, send email to [email protected].
> To unsubscribe from this group, send email to
> [email protected].
> For more options, visit this group at
> http://groups.google.com/group/sympy?hl=en.
