Olivier, the histogram plotting and data transformation tips are great,
valuable practical advice that would be nice to have in the docs. I think
they would fit nicely into a tutorial. What do you think?
Vlad
On Fri, Jan 11, 2013 at 10:38 AM, Olivier Grisel
<olivier.gri...@ensta.org> wrote:
> 2013/1/11 <paul.czodrow...@merckgroup.com>:
> >
> > BTW: When doing a RandomizedPCA, the explained variance of the first
> > component increases to 78%.
> > * Turning whiten on or off has more or less no influence on the explained
> > variance.
> >
> > * However, plotting with class labels on => again no clear
> > differentiation between the two classes :(
>
> It just means that your data is not linearly separable when you project
> it onto the first 2 dimensions of PCA.
>
> This is no big deal though. Not all problems are as easy as iris
> classification :)
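
A rough sketch of the point above, not code from this thread: a first
component can explain most of the variance while the classes still
overlap in the 2D projection. It uses a plain NumPy SVD as a stand-in
for RandomizedPCA; the data is synthetic, and `pca_project` and the
variable names are made up for illustration.

```python
import numpy as np


def pca_project(X, n_components=2):
    """Center X, then project it onto its leading principal components.

    Plain SVD-based PCA, used here as a stand-in for an estimator class.
    Returns the projected data and the explained variance ratios.
    """
    X_centered = X - X.mean(axis=0)
    _, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
    explained = S ** 2 / np.sum(S ** 2)
    return X_centered @ Vt[:n_components].T, explained[:n_components]


rng = np.random.default_rng(0)

# Synthetic stand-in data: feature 0 dominates the variance, but the
# class label depends only on the low-variance feature 4.
X = rng.normal(size=(200, 5)) * np.array([10.0, 3.0, 1.0, 1.0, 1.0])
y = (X[:, 4] > 0).astype(int)

X_2d, explained = pca_project(X)

# Gap between the class means along each projected axis, in units of
# that axis' standard deviation. Small values mean overlapping classes.
gap = np.abs(X_2d[y == 1].mean(axis=0) - X_2d[y == 0].mean(axis=0))
gap = gap / X_2d.std(axis=0)

print("explained variance ratio:", explained)
print("standardized class gap:", gap)
```

The explained variance only describes how spread out the data is along
each component. It says nothing about where the class boundary lies.
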
>
> What you can also try is plotting a histogram for each feature. For
> features that are highly non-Gaussian (e.g. with a long tail), you
> should try a sublinear scaling of them: `np.sign(x_i) *
> np.log1p(np.abs(x_i))` instead of `x_i`, or alternatively
> `np.sign(x_i) * np.sqrt(np.abs(x_i))`. If the histogram shows a
> multimodal profile, then percentile binning might help too.
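
The transformations above could be sketched roughly as follows. The
function names (`signed_log1p`, `signed_sqrt`, `percentile_bin`) are
hypothetical, the data is synthetic, and taking the absolute value
before the log or square root is an assumption about the intent so that
negative values stay defined. `np.histogram` stands in for an actual
plot.

```python
import numpy as np


def signed_log1p(x):
    """Sublinear scaling that keeps the sign: sign(x) * log(1 + |x|)."""
    x = np.asarray(x, dtype=float)
    return np.sign(x) * np.log1p(np.abs(x))


def signed_sqrt(x):
    """Sublinear scaling that keeps the sign: sign(x) * sqrt(|x|)."""
    x = np.asarray(x, dtype=float)
    return np.sign(x) * np.sqrt(np.abs(x))


def percentile_bin(x, n_bins=4):
    """Replace each value by the index of its percentile bin (0..n_bins-1)."""
    x = np.asarray(x, dtype=float)
    # Interior cut points only, e.g. the 25th/50th/75th percentiles.
    cuts = np.percentile(x, np.linspace(0, 100, n_bins + 1)[1:-1])
    return np.digitize(x, cuts)


rng = np.random.default_rng(0)

# Synthetic long-tailed feature (log-normal), purely for illustration.
feature = rng.lognormal(mean=0.0, sigma=2.0, size=1000)

# Histogram counts before and after the sublinear scaling.
raw_counts, _ = np.histogram(feature, bins=20)
log_counts, _ = np.histogram(signed_log1p(feature), bins=20)

print("raw histogram:   ", raw_counts)
print("log1p histogram: ", log_counts)
print("bin sizes:       ", np.bincount(percentile_bin(feature)))
```

On the raw feature almost all samples land in the first histogram bin,
and the scaled version spreads them out. Percentile binning gives
roughly equal-sized bins whatever the shape of the distribution.
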
>
> --
> Olivier
> http://twitter.com/ogrisel - http://github.com/ogrisel
>
> _______________________________________________
> Scikit-learn-general mailing list
> Scikit-learn-general@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
>