Re: [scikit-learn] 1. Re: unclear help file for sklearn.decomposition.pca

2017-10-16 Thread Roman Yurchak
It might be useful to have some of these comments in the docs. Currently the PCA docstring only states that PCA is computed with SVD and then goes on to discuss randomized SVD solvers. The user guide is no more helpful on this subject. Ismael opened a documentation PR on it in https:/

Re: [scikit-learn] 1. Re: unclear help file for sklearn.decomposition.pca

2017-10-16 Thread Sebastian Raschka
Oh, never mind my previous email, because while the components should be the same, the projection of the data points onto those components would still be affected by centering vs non-centering I guess. Best, Sebastian > On Oct 16, 2017, at 3:25 PM, Sebastian Raschka wrote: > > Hi, > > if you
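
The correction above can be checked numerically. The following is a minimal NumPy sketch, not taken from the thread: the toy data, the seed, and the variable names are assumptions made for illustration. It takes the components from the covariance matrix and projects both the raw and the centered data onto them.

```python
import numpy as np

# Toy data with a clearly non-zero mean (illustrative values only).
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2)) * np.array([3.0, 1.0]) + np.array([10.0, -4.0])
mean = X.mean(axis=0)

# Components from the eigendecomposition of the covariance matrix.
# np.cov subtracts the mean internally, so the components do not depend
# on whether X was centered beforehand.
cov = np.cov(X, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(cov)   # eigenvalues in ascending order
components = eigvecs[:, ::-1]            # columns sorted by decreasing variance

# Projections (scores) with and without centering the data first.
scores_centered = (X - mean) @ components
scores_raw = X @ components

# Every row of the difference is the same vector: the projected mean.
offset = scores_raw - scores_centered
```

In this sketch the two sets of scores differ by the constant vector `mean @ components`, so the components agree while the projected coordinates are shifted.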

Re: [scikit-learn] 1. Re: unclear help file for sklearn.decomposition.pca

2017-10-16 Thread Sebastian Raschka
Hi, if you compute the principal components (i.e., eigendecomposition) from the covariance matrix, it shouldn't matter whether the data is centered or not, since the covariance matrix is computed as CovMat = \frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x}) (x_i - \bar{x})^T where \bar{x} = vector o
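
A small NumPy sketch of this argument follows. It is an illustration under stated assumptions, not code from the thread: the helper `covariance`, the toy data, and the shift vector are hypothetical, and the 1/n normalization follows the formula in the message (`np.cov` uses 1/(n-1) by default).

```python
import numpy as np

def covariance(X):
    """Covariance with 1/n normalization, subtracting the column means."""
    D = X - X.mean(axis=0)
    return D.T @ D / X.shape[0]

# Toy data: correlated features with a non-zero mean (illustrative values).
rng = np.random.default_rng(0)
mixing = np.array([[2.0, 0.5, 0.0],
                   [0.0, 1.0, 0.3],
                   [0.0, 0.0, 0.2]])
X = rng.normal(size=(200, 3)) @ mixing + np.array([5.0, -3.0, 10.0])

# Shifting every sample by a constant leaves the covariance unchanged,
# because the mean is subtracted inside the formula.
shift = np.array([100.0, -50.0, 7.0])
same_cov = np.allclose(covariance(X), covariance(X + shift))

# The uncentered second-moment matrix X^T X / n is a different object:
# it equals the covariance plus the outer product of the mean with itself.
second_moment = X.T @ X / X.shape[0]
differs = not np.allclose(second_moment, covariance(X))
```

The centering question therefore only arises when the decomposition is applied to the data matrix directly (or to `X.T @ X`), rather than to a covariance matrix computed with the mean removed.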

Re: [scikit-learn] 1. Re: unclear help file for sklearn.decomposition.pca

2017-10-16 Thread Oliver Tomic
Dear Ismael, PCA should always involve at least centering, or, if the variables are to contribute equally, scaling. Here is a reference from the scientific area named "chemometrics". In chemometrics, PCA is used not only for dimensionality reduction, but also for interpretation of variance by
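
The difference between centering only and centering plus scaling can be sketched as follows. This is a hypothetical NumPy example, not from the thread: the data, the helper `leading_component`, and the choice of unit-variance scaling are assumptions for illustration.

```python
import numpy as np

def leading_component(X):
    """Eigenvector of the covariance matrix with the largest eigenvalue."""
    cov = np.cov(X, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigenvalues in ascending order
    return eigvecs[:, -1]

# Two correlated features measured on very different scales (toy values).
rng = np.random.default_rng(0)
x1 = rng.normal(0.0, 100.0, size=500)
x2 = 0.01 * x1 + rng.normal(0.0, 1.0, size=500)
X = np.column_stack([x1, x2])

centered = X - X.mean(axis=0)            # centering only
standardized = centered / X.std(axis=0)  # centering plus unit-variance scaling

pc_centered = leading_component(centered)
pc_standardized = leading_component(standardized)
```

With centering only, the leading component points almost entirely along the large-scale feature. After scaling to unit variance, both features carry equal weight in it, which is the "contribute equally" case described above.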

Re: [scikit-learn] 1. Re: unclear help file for sklearn.decomposition.pca

2017-10-16 Thread Andreas Mueller
On 10/16/2017 02:27 PM, Ismael Lemhadri wrote: > @Andreas Muller: My references do not assume centering, e.g. http://ufldl.stanford.edu/wiki/index.php/PCA any reference? It kinda does but is not very clear about it: > This data has already been pre-processed so that each of the features\texts

Re: [scikit-learn] 1. Re: unclear help file for sklearn.decomposition.pca

2017-10-16 Thread Michael Eickenberg
Your document says: > This data has already been pre-processed so that each of the features x_1 and x_2 have about the same mean (zero) and variance. This means that you do this before doing the eigendecomposition. Check the Wikipedia article https://en.wikipedia.org/wiki/Principal_component_analysis

[scikit-learn] 1. Re: unclear help file for sklearn.decomposition.pca

2017-10-16 Thread Ismael Lemhadri
@Andreas Muller: My references do not assume centering, e.g. http://ufldl.stanford.edu/wiki/index.php/PCA Any reference?