Actually, according to the numpy docs, passing ddof=1 to np.std doesn't
make it unbiased: ddof=1 makes the variance estimate unbiased, but taking
the square root reintroduces a bias. There's a whole Wikipedia article on
unbiased estimation of the standard deviation. The correction depends on
the distribution (for the normal distribution it involves the gamma
function), and the advice from the wiki is not to worry about it.

http://en.wikipedia.org/wiki/Unbiased_estimation_of_standard_deviation
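
As a rough numerical illustration of the point above (a sketch only: the
sample size, trial count, and seed are arbitrary choices, and the c4 factor
is the normal-distribution correction described in that article), one can
average both estimators over many small normal samples:

```python
import math
import numpy as np

rng = np.random.default_rng(0)   # arbitrary seed, for repeatability only
n, trials = 5, 200_000           # many small samples from N(0, 1)
samples = rng.normal(loc=0.0, scale=1.0, size=(trials, n))

# Average of the ddof=1 variance estimates: should be close to the true 1.0.
mean_var = samples.var(axis=1, ddof=1).mean()

# Average of the ddof=1 std estimates: systematically below the true 1.0.
mean_std = samples.std(axis=1, ddof=1).mean()

# Expected value of the ddof=1 std for normal data with sigma = 1
# (the gamma-function factor usually called c4).
c4 = math.sqrt(2.0 / (n - 1)) * math.gamma(n / 2) / math.gamma((n - 1) / 2)

print(f"mean var (ddof=1): {mean_var:.4f}")
print(f"mean std (ddof=1): {mean_std:.4f}  expected c4: {c4:.4f}")
```

The gap between the averaged std and 1.0 shrinks as n grows, which fits the
advice not to worry about it for reasonably sized samples.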

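However, it seems that some people define standardization as having zero
mean and unit _variance_, and the variance is something numpy does have an
unbiased estimate of for iid samples (np.var with ddof=1). So maybe
defining the scaling in terms of the variance and giving the flags
with_var='population', 'sample', or None is the better solution. To end up
with unit variance you still divide by the square root of whichever
variance estimate is chosen; dividing by the variance itself would leave a
variance of 1/var.

Wikipedia's article on feature scaling defines it as zero mean and unit
variance and then gives the advice to divide by the standard deviation.
That is consistent with the definition: the std there is just the square
root of the variance estimate, so the only real choice is between the
population and the sample variance.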

http://en.wikipedia.org/wiki/Feature_scaling
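
To make the flag idea concrete, and to check which divisor actually gives
unit variance, here is a minimal sketch. The function name standardize and
its with_var parameter are hypothetical (they only mirror the flags
discussed in this thread, not an existing scikit-learn API), and the sketch
assumes no column has zero variance:

```python
import numpy as np

# Hypothetical mapping from the proposed with_var flag to numpy's ddof.
_DDOF = {"population": 0, "sample": 1}

def standardize(X, with_var="population"):
    """Center the columns of X; unless with_var is None, scale to unit variance."""
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)
    if with_var is None:
        return Xc                      # centering only, no scaling
    if with_var not in _DDOF:
        raise ValueError("with_var must be 'population', 'sample', or None")
    var = X.var(axis=0, ddof=_DDOF[with_var])
    # Divide by sqrt(var): this is what leaves each column with unit variance.
    # Assumes var > 0 for every column (constant columns are not handled).
    return Xc / np.sqrt(var)

rng = np.random.default_rng(0)         # arbitrary seed and toy data
X = rng.normal(loc=3.0, scale=2.0, size=(50, 3))

Z_sample = standardize(X, with_var="sample")
Z_pop = standardize(X, with_var="population")

# For comparison: dividing by the variance itself gives variance 1/var.
Z_by_var = (X - X.mean(axis=0)) / X.var(axis=0, ddof=1)

print(Z_sample.var(axis=0, ddof=1))    # unit sample variance per column
print(Z_pop.var(axis=0, ddof=0))       # unit population variance per column
print(Z_by_var.var(axis=0, ddof=1) * X.var(axis=0, ddof=1))
```

With this reading the two non-None flags differ only in the ddof passed to
the variance estimate, which is a factor of sqrt(n / (n - 1)) in the scaled
values.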

Doug




On Tue, Nov 6, 2012 at 7:11 AM, Lars Buitinck <l.j.buiti...@uva.nl> wrote:

> 2012/11/6 Olivier Grisel <olivier.gri...@ensta.org>:
> > None, False: no stdev
> > True, "pop": population stdev
> > "sample": sample stdev
> >
> > +1 but with "population" instead of "pop".
>
> Alright :)
>
> --
> Lars Buitinck
> Scientific programmer, ILPS
> University of Amsterdam
>
>
> ------------------------------------------------------------------------------
> LogMeIn Central: Instant, anywhere, Remote PC access and management.
> Stay in control, update software, and manage PCs from one command center
> Diagnose problems and improve visibility into emerging IT issues
> Automate, monitor and manage. Do more in less time with Central
> http://p.sf.net/sfu/logmein12331_d2d
> _______________________________________________
> Scikit-learn-general mailing list
> Scikit-learn-general@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
>