Re: [scikit-learn] unclear help file for sklearn.decomposition.pca

2017-10-16 Thread Andreas Mueller
   5. Re: Question about LDA's coef_ attribute (Serafeim Loukas) -- Message: 1 Date: Sun, 15 Oct 2017 18:42:56 -0700 From: Ismael Lemhadri <lemha...@stanford.edu> To: scikit-learn@python.org Subject: [scikit-learn] unclear help

Re: [scikit-learn] unclear help file for sklearn.decomposition.pca

2017-10-16 Thread Roman Yurchak
On 16/10/17 17:16, Ismael Lemhadri wrote: My concern is actually not about not mentioning the scaling but about not mentioning the centering. That is, the sklearn PCA removes the mean but it does not mention it in the help file. I think it's currently assumed given the definition of the PCA, bu
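The centering behaviour described in this message can be checked directly. The following is a minimal sketch, not taken from the thread: the synthetic data, the seed and the variable names are assumptions made for illustration, while PCA, mean_, components_ and transform are the scikit-learn names.

```python
import numpy as np
from sklearn.decomposition import PCA

# Synthetic data with a non-zero mean and very different feature scales
# (illustrative values only, not from the thread).
rng = np.random.RandomState(0)
X = rng.normal(loc=5.0, scale=[1.0, 10.0, 100.0], size=(200, 3))

pca = PCA(n_components=2).fit(X)

# The fitted estimator stores the per-feature mean it removed.
mean_matches = np.allclose(pca.mean_, X.mean(axis=0))

# With whiten=False, transform is a projection of the mean-centered data.
manual = (X - pca.mean_) @ pca.components_.T
transform_matches = np.allclose(pca.transform(X), manual)

# Shifting every feature by a constant should not change the components,
# because the shift is removed by the centering step.
shifted = PCA(n_components=2).fit(X + 1000.0)
shift_invariant = np.allclose(np.abs(shifted.components_),
                              np.abs(pca.components_))

print(mean_matches, transform_matches, shift_invariant)
```

If all three checks hold, the input was centered before the decomposition even though the docstring under discussion did not say so.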

Re: [scikit-learn] unclear help file for sklearn.decomposition.pca

2017-10-16 Thread Ismael Lemhadri
Message: 1 Date: Sun, 15 Oct 2017 18:42:56 -0700 From: Ismael Lemhadri To: scikit-learn@python.org Subject: [scikit-learn] unclear help file for sklearn.decomposition.pca Message-ID

Re: [scikit-learn] unclear help file for sklearn.decomposition.pca

2017-10-16 Thread Roman Yurchak
Ismael, as far as I saw the sklearn.decomposition.PCA doesn't mention scaling at all (except for the whiten parameter which is post-transformation scaling). So since it doesn't mention it, it makes sense that it doesn't do any scaling of the input. Same as np.linalg.svd. You can verify that
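One possible way to run such a check is sketched below. It is an illustration under stated assumptions, not the verification from the original message, which is cut off in this archive: the data, the seed and the variable names are invented, and svd_solver="full" is chosen here only to make the comparison with np.linalg.svd exact.

```python
import numpy as np
from sklearn.decomposition import PCA

# Three features with standard deviations of roughly 1, 10 and 100
# (illustrative values only).
rng = np.random.RandomState(0)
X = rng.normal(size=(300, 3)) * np.array([1.0, 10.0, 100.0])

# Singular values of the centered data, and of the centered-and-scaled data.
Xc = X - X.mean(axis=0)
s_centered = np.linalg.svd(Xc, compute_uv=False)
s_scaled = np.linalg.svd(Xc / Xc.std(axis=0), compute_uv=False)

pca = PCA(n_components=3, svd_solver="full").fit(X)

# PCA should agree with the SVD of the centered data, not the scaled data.
matches_centered = np.allclose(pca.singular_values_, s_centered)
matches_scaled = np.allclose(pca.singular_values_, s_scaled)

# whiten=True rescales the *output* components to unit variance;
# it does not standardize the input features.
whitened = PCA(n_components=3, whiten=True,
               svd_solver="full").fit_transform(X)
unit_variance = np.allclose(whitened.var(axis=0, ddof=1), 1.0)

print(matches_centered, matches_scaled, unit_variance)
```

To get PCA on standardized features, the input would have to be scaled explicitly beforehand, for example with sklearn.preprocessing.StandardScaler.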

[scikit-learn] unclear help file for sklearn.decomposition.pca

2017-10-15 Thread Ismael Lemhadri
Dear all, The help file for the PCA class is unclear about the preprocessing performed on the data. You can check on line 410 here: https://github.com/scikit-learn/scikit-learn/blob/ef5cb84a/sklearn/decomposition/pca.py#L410 that the matrix is centered but NOT scaled, before performing the singula