On Oct 16, 2017, at 2:27 PM, Ismael Lemhadri <lemha...@stanford.edu> wrote:
@Andreas Mueller:
My references do not assume centering, e.g.
http://ufldl.stanford.edu/wiki/index.php/PCA
Do you have a reference for a definition that includes it?
On Mon, Oct 16, 2017 at 10:20 AM, <scikit-learn-requ...@python.org> wrote:
Send scikit-learn mailing list submissions to
scikit-learn@python.org
To subscribe or unsubscribe via the World Wide Web, visit
https://mail.python.org/mailman/listinfo/scikit-learn
or, via email, send a message with subject or body 'help' to
scikit-learn-requ...@python.org
You can reach the person managing the list at
scikit-learn-ow...@python.org
When replying, please edit your Subject line so it is more specific
than "Re: Contents of scikit-learn digest..."
Today's Topics:
1. Re: unclear help file for sklearn.decomposition.pca
(Andreas Mueller)
----------------------------------------------------------------------
Message: 1
Date: Mon, 16 Oct 2017 13:19:57 -0400
From: Andreas Mueller <t3k...@gmail.com>
To: scikit-learn@python.org
Subject: Re: [scikit-learn] unclear help file for sklearn.decomposition.pca
Message-ID: <04fc445c-d8f3-a3a9-4ab2-0535826a2...@gmail.com>
Content-Type: text/plain; charset="utf-8"; Format="flowed"
The definition of PCA has a centering step, but no scaling step.
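
For what it's worth, here is a minimal sketch of how one could check the "centering but no scaling" behavior, assuming a reasonably recent scikit-learn and NumPy. The random data, the constant shift of 5.0 and the factor of 100 are illustrative choices of mine, not anything taken from the library documentation.

```python
import numpy as np
from sklearn.decomposition import PCA

X = np.random.RandomState(42).rand(10, 4)

pca = PCA(n_components=2, svd_solver="full").fit(X)

# Centering: the per-feature mean that PCA subtracts is stored in `mean_`.
mean_matches = np.allclose(pca.mean_, X.mean(axis=0))

# Adding a constant to every sample leaves the explained variance unchanged,
# because the shift is removed again by the centering step.
pca_shifted = PCA(n_components=2, svd_solver="full").fit(X + 5.0)
shift_invariant = np.allclose(pca.explained_variance_,
                              pca_shifted.explained_variance_)

# No scaling: multiplying one feature by 100 changes the explained variance,
# which would not happen if each feature were standardized first.
X_scaled = X.copy()
X_scaled[:, 0] *= 100.0
pca_scaled = PCA(n_components=2, svd_solver="full").fit(X_scaled)
scale_invariant = np.allclose(pca.explained_variance_,
                              pca_scaled.explained_variance_)

print(mean_matches, shift_invariant, scale_invariant)
```

If the features should also be put on a common scale, that has to be done as a separate preprocessing step before calling PCA.
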
On 10/16/2017 11:16 AM, Ismael Lemhadri wrote:
> Dear Roman,
> My concern is actually not that the scaling goes unmentioned, but that
> the centering does.
> That is, the sklearn PCA removes the mean, but the help file does not
> mention it.
> This was quite messy for me to debug, as I expected it to either 1/ center
> and scale simultaneously, or 2/ neither scale nor center.
> In my opinion, it would be beneficial to make this behavior explicit in
> the help file.
> Ismael
>
> On Mon, Oct 16, 2017 at 8:02 AM, <scikit-learn-requ...@python.org> wrote:
>
> Today's Topics:
>
>    1. unclear help file for sklearn.decomposition.pca (Ismael Lemhadri)
>    2. Re: unclear help file for sklearn.decomposition.pca (Roman Yurchak)
>    3. Question about LDA's coef_ attribute (Serafeim Loukas)
>    4. Re: Question about LDA's coef_ attribute (Alexandre Gramfort)
>    5. Re: Question about LDA's coef_ attribute (Serafeim Loukas)
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Sun, 15 Oct 2017 18:42:56 -0700
> From: Ismael Lemhadri <lemha...@stanford.edu>
> To: scikit-learn@python.org
> Subject: [scikit-learn] unclear help file for sklearn.decomposition.pca
> Message-ID: <CANpSPFTgv+Oz7f97dandmrBBayqf_o9w=18okhcfn0u5dnz...@mail.gmail.com>
> Content-Type: text/plain; charset="utf-8"
>
> Dear all,
> The help file for the PCA class is unclear about the preprocessing
> performed on the data.
> You can check on line 410 here:
> https://github.com/scikit-learn/scikit-learn/blob/ef5cb84a/sklearn/decomposition/pca.py#L410
> that the matrix is centered but NOT scaled before performing the singular
> value decomposition.
> However, the help files do not make any mention of it.
> This is unclear for someone who, like me, just wanted to check that PCA
> and np.linalg.svd give the same results. In academic settings, students
> are often asked to compare different methods and to check that they yield
> the same results. I expect that many students have run into this problem
> before...
> Best,
> Ismael Lemhadri
>
> ------------------------------
>
> Message: 2
> Date: Mon, 16 Oct 2017 15:16:45 +0200
> From: Roman Yurchak <rth.yurc...@gmail.com>
> To: Scikit-learn mailing list <scikit-learn@python.org>
> Subject: Re: [scikit-learn] unclear help file for sklearn.decomposition.pca
> Message-ID: <b2abdcfd-4736-929e-6304-b93832932...@gmail.com>
> Content-Type: text/plain; charset=utf-8; format=flowed
>
> Ismael,
>
> as far as I can see, the sklearn.decomposition.PCA documentation doesn't
> mention scaling at all (except for the whiten parameter, which is
> post-transformation scaling).
>
> Since it doesn't mention it, it makes sense that it doesn't do any
> scaling of the input. Same as np.linalg.svd.
>
> You can verify that PCA and np.linalg.svd yield the same results (up to
> the sign of each component), with
>
> ```
> >>> import numpy as np
> >>> from sklearn.decomposition import PCA
> >>> import numpy.linalg
> >>> X = np.random.RandomState(42).rand(10, 4)
> >>> n_components = 2
> >>> PCA(n_components, svd_solver='full').fit_transform(X)
> ```
>
> and
>
> ```
> >>> U, s, V = np.linalg.svd(X - X.mean(axis=0), full_matrices=False)
> >>> (X - X.mean(axis=0)).dot(V[:n_components].T)
> ```
>
> --
> Roman
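
One caveat when running the comparison above: singular vectors are only defined up to sign, and scikit-learn applies its own sign convention, so the two outputs may differ by a factor of -1 per component. Here is a hedged sketch that compares them up to sign and also shows that skipping the centering step breaks the match. The variable names and the absolute-value comparison are my own choices for illustration.

```python
import numpy as np
from sklearn.decomposition import PCA

X = np.random.RandomState(42).rand(10, 4)
n_components = 2

X_pca = PCA(n_components=n_components, svd_solver="full").fit_transform(X)

# Manual PCA: center, take the SVD, project onto the leading
# right singular vectors.
Xc = X - X.mean(axis=0)
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)
X_svd = Xc.dot(Vt[:n_components].T)

# Compare entry by entry, ignoring the sign of each component.
same_up_to_sign = np.allclose(np.abs(X_pca), np.abs(X_svd))

# Without centering, the SVD projection no longer matches PCA.
_, _, Vt_raw = np.linalg.svd(X, full_matrices=False)
X_raw = X.dot(Vt_raw[:n_components].T)
uncentered_matches = np.allclose(np.abs(X_pca), np.abs(X_raw))

print(same_up_to_sign, uncentered_matches)
```

Comparing absolute values is a blunt way to ignore the sign; aligning the sign of each column explicitly would be a stricter check.
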
>
> > _______________________________________________
> > scikit-learn mailing list
> > scikit-learn@python.org
> > https://mail.python.org/mailman/listinfo/scikit-learn
> >
>
>
>
> ------------------------------
>
> Message: 3
> Date: Mon, 16 Oct 2017 15:27:48 +0200
> From: Serafeim Loukas <seral...@gmail.com>
> To: scikit-learn@python.org
> Subject: [scikit-learn] Question about LDA's coef_ attribute
> Message-ID: <58c6d0da-9de5-4ef5-97c1-48159831f...@gmail.com>
> Content-Type: text/plain; charset="us-ascii"
>
> Dear Scikit-learn community,
>
> Since the documentation of the LDA
> (http://scikit-learn.org/stable/modules/generated/sklearn.discriminant_analysis.LinearDiscriminantAnalysis.html)
> is not so clear, I would like to ask whether the lda.coef_ attribute
> stores the eigenvectors from the SVD decomposition.
>
> Thank you in advance,
> Serafeim
>
> ------------------------------
>
> Message: 4
> Date: Mon, 16 Oct 2017 16:57:52 +0200
> From: Alexandre Gramfort <alexandre.gramf...@inria.fr>
> To: Scikit-learn mailing list <scikit-learn@python.org>
> Subject: Re: [scikit-learn] Question about LDA's coef_ attribute
> Message-ID: <cadeotzricoqhuhjmmw2z14cqffeqyndyoxn-ogkavtmq7v0...@mail.gmail.com>
> Content-Type: text/plain; charset="UTF-8"
>
> No, it stores the direction of the decision function, to match the API
> of linear models.
>
> HTH
> Alex
>
> ------------------------------
>
> Message: 5
> Date: Mon, 16 Oct 2017 17:02:46 +0200
> From: Serafeim Loukas <seral...@gmail.com>
> To: Scikit-learn mailing list <scikit-learn@python.org>
> Subject: Re: [scikit-learn] Question about LDA's coef_ attribute
> Message-ID: <413210d2-56ae-41a4-873f-d171bb365...@gmail.com>
> Content-Type: text/plain; charset="us-ascii"
>
> Dear Alex,
>
> Thank you for the prompt response.
>
> Are the eigenvectors stored in some attribute?
> Does the lda.scalings_ attribute contain the eigenvectors?
>
> Best,
> Serafeim
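
The digest leaves the scalings_ question unanswered. Below is a sketch, assuming the default solver="svd" and using the iris dataset purely for illustration, of how the two attributes are used in practice: coef_ and intercept_ give the linear decision function, while scalings_ (together with the overall mean xbar_) gives the projection returned by transform(). It only shows what the attributes do. It does not settle whether scalings_ should be called "the eigenvectors".

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

X, y = load_iris(return_X_y=True)
lda = LinearDiscriminantAnalysis(solver="svd").fit(X, y)

# coef_ has one row per class: together with intercept_ it defines the
# linear decision function, as for other linear models.
scores = X @ lda.coef_.T + lda.intercept_
coef_gives_decision = np.allclose(scores, lda.decision_function(X))

# scalings_ maps the centered data into the discriminant space
# that transform() returns.
Z = lda.transform(X)
projected = ((X - lda.xbar_) @ lda.scalings_)[:, : Z.shape[1]]
scalings_give_transform = np.allclose(projected, Z)

print(lda.coef_.shape, lda.scalings_.shape)
print(coef_gives_decision, scalings_give_transform)
```

So for dimensionality reduction it is scalings_ (via transform) that matters, while coef_ is only relevant for classification.
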
>
>
> ------------------------------
>
>
> End of scikit-learn Digest, Vol 19, Issue 25
> ********************************************
>
>
>
>
------------------------------
End of scikit-learn Digest, Vol 19, Issue 28
********************************************