Indeed, it sounds interesting, but I'd still be curious how it compares against elastic net.

On 07/29/2015 05:41 PM, Stylianos Kampakis wrote:
Hi Andreas,

Sure. Actually, the purpose of the model is both regularization and dimensionality reduction for problems where the number of features can be larger than the number of instances (or, in any case, where there is a large number of features). It is particularly effective when there are many attributes that are highly correlated with each other.

L1 regularization breaks down in the presence of many correlated features: it tends to pick one feature from each correlated group more or less arbitrarily and discard the rest. L2 handles correlation better, but it ignores the cluster structure of highly correlated attributes. Supervised PCA is particularly well suited to these kinds of problems, and in the experiments in the paper it seems to outperform partial least squares.
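
To make it concrete, the core of the procedure is easy to sketch in numpy/scikit-learn terms. This is a rough illustration only, not the implementation I'd contribute; the fixed threshold below is a stand-in for one chosen by cross-validation:

import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression


def supervised_pca(X, y, threshold=2.0, n_components=1):
    # Step 1: score each feature by its univariate association with the
    # response (the standardized regression coefficient of Bair et al.).
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    scores = np.dot(Xc.T, yc) / np.sqrt((Xc ** 2).sum(axis=0))

    # Step 2: keep only the features whose absolute score exceeds the
    # threshold (in practice the threshold is picked by cross-validation).
    selected = np.abs(scores) > threshold

    # Step 3: ordinary PCA on the retained features, then regress the
    # response on the leading "supervised" components.
    pca = PCA(n_components=n_components)
    components = pca.fit_transform(Xc[:, selected])
    model = LinearRegression().fit(components, y)
    return model, pca, selected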

I actually came upon this algorithm while trying to find a way to analyze GPS data gathered from the training sessions of a professional football team. Ridge logistic regression didn't give good results, and neither did LASSO, but supervised PCA worked well. It can also be used purely for dimensionality reduction, producing components that correlate with the response.
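
As for the comparison to elastic net: I haven't run a systematic benchmark yet, but a harness along these lines would make one straightforward (the synthetic data below is just a stand-in for an n << p problem with correlated features):

from sklearn.datasets import make_regression
from sklearn.linear_model import ElasticNetCV, LassoCV, RidgeCV
from sklearn.model_selection import cross_val_score

# Low effective rank gives clusters of correlated features, with many
# more features than samples -- roughly the regime described above.
X, y = make_regression(n_samples=60, n_features=500, n_informative=20,
                       effective_rank=15, noise=5.0, random_state=0)

for name, est in [("elastic net", ElasticNetCV()),
                  ("lasso", LassoCV()),
                  ("ridge", RidgeCV())]:
    score = cross_val_score(est, X, y, cv=5).mean()
    print("%s: mean CV R^2 = %.3f" % (name, score))

Once supervised PCA is wrapped in a scikit-learn estimator with fit/predict, it could be dropped into the same loop.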

The work was presented at MathSport International 2015 (http://www.mathsportinternational2015.com/uploads/2/2/2/4/22242920/mathsport2015proceedings.pdf).

I am not sure how popular the method is in general, but for me it is going to be one of the standard tools in the presence of a large number of variables.

Best regards,
Stelios

2015-07-28 19:16 GMT+01:00 Andreas Mueller <t3k...@gmail.com>:

    Hi Stylianos.

    Can you give a bit more background on the model?
    It seems fairly well-cited but I haven't really seen it in practice.
    Is it still state of the art?
    The main purpose seems to be a particular type of regularization,
    right, not supervised dimensionality reduction?
    How does this compare against elastic net? There seems to be some
    comparison to PLS and lasso in the paper.

    It would be good to see that this is a widely useful method before
    adding it to sklearn.

    Cheers,
    Andy



    On 07/24/2015 06:40 AM, Stylianos Kampakis wrote:
    Dear all,

    I am thinking of contributing a new model to the library:
    supervised principal components analysis by Bair et al. (2006).

    I wanted to get in touch before contributing to make sure no-one
    else is working on that algorithm, since this is what the site
    recommends.

    Cheers,
    S. Kampakis


    