Indeed, it sounds interesting, but I'd still be curious how it
compares against elastic net.
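One way such a comparison could be set up is sketched below. It is only an illustration under stated assumptions: the data are synthetic (one hidden factor driving a cluster of correlated features), supervised PCA is approximated by a pipeline of existing scikit-learn pieces (univariate screening, PCA, linear regression) rather than a dedicated estimator, and `k=15` and `n_components=1` are fixed by hand where they would normally be tuned by cross-validation. The scores it prints say nothing about real data.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.feature_selection import SelectKBest, f_regression
from sklearn.linear_model import ElasticNetCV, LinearRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline

# Synthetic data (an assumption for illustration): more features than
# samples, with the first 15 features driven by one hidden factor u.
rng = np.random.default_rng(0)
n_samples, n_features = 60, 300
u = rng.normal(size=n_samples)
X = rng.normal(size=(n_samples, n_features))
X[:, :15] = u[:, None] + 0.3 * rng.normal(size=(n_samples, 15))
y = u + 0.1 * rng.normal(size=n_samples)

# Stand-in for supervised PCA: screen features by a univariate statistic,
# take the leading principal component, then regress on it.
spca_like = make_pipeline(
    SelectKBest(f_regression, k=15),
    PCA(n_components=1),
    LinearRegression(),
)
# Elastic net with its penalty strength chosen by internal cross-validation.
enet = ElasticNetCV(l1_ratio=0.5, cv=3)

results = {}
for name, estimator in [("supervised PCA (sketch)", spca_like),
                        ("elastic net", enet)]:
    scores = cross_val_score(estimator, X, y, cv=5, scoring="r2")
    results[name] = scores.mean()
    print(f"{name}: mean R^2 = {results[name]:.3f}")
```

Because the screening step sits inside the pipeline, it is refit on each training fold, so the held-out fold never influences which features are kept.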
On 07/29/2015 05:41 PM, Stylianos Kampakis wrote:
Hi Andreas,
Sure. Actually, the purpose of the model is both regularization and
dimensionality reduction for problems where the number of features can
be larger than the number of instances (or in any case when there is a
large number of features). It is particularly effective when many
attributes are highly correlated with each other.
L1 regularization breaks down when many features are correlated.
L2 deals better with this problem, but ignores the presence of
clusters of highly correlated attributes. Supervised PCA is
particularly well suited to these kinds of problems. The algorithm
seems to outperform partial least squares.
I actually came across this algorithm while trying to find a way to
analyze GPS data gathered from the training of a professional football
team. Ridge logistic regression didn't provide good results, and
neither did LASSO, but supervised PCA worked well. It is also possible
to use it to reduce the dimensionality in such a way that the
components correlate with the response.
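For readers unfamiliar with the method, the following is a minimal NumPy sketch of the procedure as I understand it from Bair et al. (2006): score each feature by a univariate statistic against the response, keep the features whose score exceeds a threshold, take principal components of the reduced matrix, and regress the response on the leading components. The function names, the synthetic data, and the threshold used in the demo are assumptions made for illustration. In practice the threshold and the number of components would be chosen by cross-validation.

```python
import numpy as np


def fit_supervised_pca(X, y, threshold, n_components=1):
    """Hypothetical helper: fit a supervised principal components regression."""
    x_mean = X.mean(axis=0)
    y_mean = y.mean()
    Xc = X - x_mean
    yc = y - y_mean

    # Step 1: univariate score per feature, s_j = |x_j . y| / ||x_j||.
    norms = np.linalg.norm(Xc, axis=0)
    norms[norms == 0.0] = 1.0  # constant columns end up with score 0
    scores = np.abs(Xc.T @ yc) / norms

    # Step 2: keep only the features whose score exceeds the threshold.
    keep = np.flatnonzero(scores > threshold)
    if keep.size == 0:
        raise ValueError("threshold removed every feature")

    # Step 3: principal components of the reduced matrix, via SVD.
    _, _, Vt = np.linalg.svd(Xc[:, keep], full_matrices=False)
    components = Vt[:n_components]

    # Step 4: least-squares fit of the response on the component scores.
    Z = Xc[:, keep] @ components.T
    coef, *_ = np.linalg.lstsq(Z, yc, rcond=None)
    return {"x_mean": x_mean, "y_mean": y_mean, "keep": keep,
            "components": components, "coef": coef}


def predict_supervised_pca(model, X):
    """Hypothetical helper: project new data and apply the fitted regression."""
    Z = (X - model["x_mean"])[:, model["keep"]] @ model["components"].T
    return model["y_mean"] + Z @ model["coef"]


# Synthetic demo (assumed data): 200 features, 50 samples, and a cluster of
# 10 correlated features that share one hidden factor with the response.
rng = np.random.default_rng(0)


def make_data(n):
    u = rng.normal(size=n)
    X = rng.normal(size=(n, 200))
    X[:, :10] = u[:, None] + 0.3 * rng.normal(size=(n, 10))
    y = u + 0.1 * rng.normal(size=n)
    return X, y


X_train, y_train = make_data(50)
X_test, y_test = make_data(50)

# With centered data, score / ||y|| is the absolute correlation, so this
# threshold keeps features with |correlation| above 0.7.
threshold = 0.7 * np.linalg.norm(y_train - y_train.mean())
model = fit_supervised_pca(X_train, y_train, threshold)
pred = predict_supervised_pca(model, X_test)
print("features kept:", len(model["keep"]))
print("test correlation:", np.corrcoef(pred, y_test)[0, 1])
```

The screening step is what distinguishes this from ordinary principal components regression: the components are computed only from features that are individually associated with the response, which is why they tend to correlate with it.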
The work was presented at MathSport International 2015
(http://www.mathsportinternational2015.com/uploads/2/2/2/4/22242920/mathsport2015proceedings.pdf).
I am not sure how popular this method is in general, but for me it is
going to be one of the standard methods to use when there are many
variables.
Best regards,
Stelios
2015-07-28 19:16 GMT+01:00 Andreas Mueller <t3k...@gmail.com>:
Hi Stylianos.
Can you give a bit more background on the model?
It seems fairly well-cited but I haven't really seen it in practice.
Is it still state of the art?
The main purpose seems to be a particular type of regularization,
right, not supervised dimensionality reduction?
How does this compare against elastic net? There seems to be some
comparison to PLS and lasso in the paper.
It would be good to see that this is a widely useful method before
adding it to sklearn.
Cheers,
Andy
On 07/24/2015 06:40 AM, Stylianos Kampakis wrote:
Dear all,
I am thinking of contributing a new model to the library: the
supervised principal components analysis of Bair et al. (2006).
I wanted to get in touch before contributing to make sure no one
else is working on that algorithm, since this is what the site
recommends.
Cheers,
S. Kampakis
------------------------------------------------------------------------------
_______________________________________________
Scikit-learn-general mailing list
Scikit-learn-general@lists.sourceforge.net
<mailto:Scikit-learn-general@lists.sourceforge.net>
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general