It might be useful to have some of these comments in the docs.
Currently the PCA docstring only states that PCA is computed with SVD and
then goes on to discuss randomized SVD solvers. The user guide is no more
helpful on this subject either.
Ismael opened a documentation PR about it at
https:/
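
(In the meantime, a minimal sketch of what the docs could spell out, using
made-up data: scikit-learn's PCA centers the data before the SVD, which you
can verify by redoing the SVD on the manually centered matrix.)

import numpy as np
from sklearn.decomposition import PCA

rng = np.random.RandomState(0)
X = rng.rand(100, 3) + 5.0  # data deliberately far from zero mean

pca = PCA(n_components=2).fit(X)

# PCA stores the per-feature mean it subtracted before the SVD.
Xc = X - pca.mean_
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)

# Up to a sign flip per component, the SVD of the centered data
# reproduces PCA's components_.
for k in range(2):
    assert np.allclose(np.abs(Vt[k]), np.abs(pca.components_[k]))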
Oh, never mind my previous email: while the components should be the same,
the projection of the data points onto those components would still be
affected by centering vs. non-centering, I guess.
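
(A quick NumPy sketch of that point, with made-up data: projecting onto the
same fixed component, the scores differ by a constant offset when the
centering is skipped.)

import numpy as np

rng = np.random.RandomState(0)
X = rng.rand(50, 2) + 10.0   # data far from the origin
w = np.array([0.6, 0.8])     # some fixed unit-norm component

scores_centered = (X - X.mean(axis=0)) @ w
scores_raw = X @ w

# Same component, but every score is shifted by a constant offset,
# so distances between projected points are unchanged while the
# scores themselves are not.
offset = X.mean(axis=0) @ w
assert np.allclose(scores_raw, scores_centered + offset)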
Best,
Sebastian
> On Oct 16, 2017, at 3:25 PM, Sebastian Raschka wrote:
Hi,
if you compute the principal components (i.e., eigendecomposition) from the
covariance matrix, it shouldn't matter whether the data is centered or not,
since the covariance matrix is computed as
CovMat = \frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x}) (x_i - \bar{x})^T
where \bar{x} = the vector of feature means, so the mean is subtracted
either way.
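
(A quick check of that claim, with made-up data: the covariance matrix is
shift-invariant, so the eigenvectors come out the same either way.)

import numpy as np

rng = np.random.RandomState(0)
X = rng.rand(200, 3) + 7.0   # clearly uncentered data

# np.cov subtracts the mean internally, so a constant shift of X
# changes nothing.
assert np.allclose(np.cov(X, rowvar=False),
                   np.cov(X - X.mean(axis=0), rowvar=False))

# Hence the eigenvectors (the principal axes) are identical too.
evals, evecs = np.linalg.eigh(np.cov(X, rowvar=False))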
Dear Ismael,
PCA should always involve at least centering or, if the variables are to
contribute equally, scaling. Here is a reference from the scientific area
named "chemometrics". In chemometrics, PCA is used not only for
dimensionality reduction, but also for interpretation of variance by
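
(On the scaling point above, a minimal scikit-learn sketch with made-up
data: centering plus unit-variance scaling before PCA, so each variable
contributes equally.)

import numpy as np
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.RandomState(0)
# Two variables on wildly different scales.
X = np.c_[rng.rand(100) * 1000.0, rng.rand(100)]

# Center and scale to unit variance before PCA, so both variables
# contribute equally (equivalent to PCA on the correlation matrix).
scaled_pca = make_pipeline(StandardScaler(), PCA(n_components=2)).fit(X)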
On 10/16/2017 02:27 PM, Ismael Lemhadri wrote:
> @Andreas Muller:
> My references do not assume centering, e.g.
> http://ufldl.stanford.edu/wiki/index.php/PCA
> any reference?
It kinda does but is not very clear about it:
> This data has already been pre-processed so that each of the features x_1
> and x_2 have about the same mean (zero) and variance.
Your document says:
> This data has already been pre-processed so that each of the features x_1
> and x_2 have about the same mean (zero) and variance.
This means that the centering is done before the eigendecomposition.
Check the wikipedia article
https://en.wikipedia.org/wiki/Principal_component_analysis
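
(To make that concrete, a toy sketch with made-up data: compare the
eigenvectors of the properly centered covariance matrix with those of the
uncentered second-moment matrix X^T X / n.)

import numpy as np

rng = np.random.RandomState(0)
X = rng.rand(100, 2) + 3.0          # mean far from zero

Xc = X - X.mean(axis=0)
cov = Xc.T @ Xc / len(X)            # covariance (mean removed first)
second_moment = X.T @ X / len(X)    # "covariance" without centering

_, v_cov = np.linalg.eigh(cov)
_, v_raw = np.linalg.eigh(second_moment)

# The leading axes generally disagree when the mean is not zero:
print(v_cov[:, -1])   # top principal axis of the centered data
print(v_raw[:, -1])   # dominated by the direction of the mean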