Hi,
Here are preliminary results on the classification performance of KPLS,
using 20-fold cross-validation with random splits of 0.5 train and 0.5 test
on the digits dataset, compared against SVC and linearSVC. For SVC and KPLS
I used the same kernel parameters (rbf, gamma=0.001) as in this example:
http://scikit-learn.org/0.11/auto_examples/plot_digits_classification.html
For both SVC and linearSVC I used the default 'C' parameter.
+------+----------------+-----------+----------------+----------+----------------+------------------+
| Idx  | KPLS_acc       | KPLS_time | SVC_acc        | SVC_time | linearSVC_acc  | linearSVC_time   |
+------+----------------+-----------+----------------+----------+----------------+------------------+
| 0.0  | 0.988876529477 | 0.25      | 0.987764182425 | 0.17     | 0.926585094549 | 0.0              |
| 1.0  | 0.988876529477 | 0.24      | 0.986651835373 | 0.18     | 0.927697441602 | 0.0              |
| 2.0  | 0.986651835373 | 0.25      | 0.989988876529 | 0.19     | 0.929922135706 | 0.0              |
| 3.0  | 0.989988876529 | 0.25      | 0.98553948832  | 0.05     | 0.939933259177 | 0.0              |
| 4.0  | 0.988876529477 | 0.24      | 0.988876529477 | 0.04     | 0.931034482759 | 0.00999999999999 |
| 5.0  | 0.992213570634 | 0.32      | 0.991101223582 | 0.2      | 0.923248053393 | 0.0              |
| 6.0  | 0.991101223582 | 0.26      | 0.988876529477 | 0.04     | 0.949944382647 | 0.0              |
| 7.0  | 0.994438264739 | 0.26      | 0.988876529477 | 0.04     | 0.937708565072 | 0.0              |
| 8.0  | 0.986651835373 | 0.25      | 0.984427141268 | 0.18     | 0.943270300334 | 0.01             |
| 9.0  | 0.988876529477 | 0.24      | 0.987764182425 | 0.05     | 0.925472747497 | 0.0              |
| 10.0 | 0.992213570634 | 0.23      | 0.993325917686 | 0.18     | 0.933259176863 | 0.0              |
| 11.0 | 0.994438264739 | 0.23      | 0.991101223582 | 0.18     | 0.928809788654 | 0.0              |
| 12.0 | 0.987764182425 | 0.24      | 0.978865406007 | 0.18     | 0.923248053393 | 0.0              |
| 13.0 | 0.98553948832  | 0.25      | 0.981090100111 | 0.19     | 0.929922135706 | 0.0              |
| 14.0 | 0.994438264739 | 0.25      | 0.989988876529 | 0.05     | 0.943270300334 | 0.0              |
| 15.0 | 0.986651835373 | 0.25      | 0.987764182425 | 0.18     | 0.927697441602 | 0.00999999999999 |
| 16.0 | 0.986651835373 | 0.34      | 0.986651835373 | 0.22     | 0.943270300334 | 0.0              |
| 17.0 | 0.989988876529 | 0.25      | 0.987764182425 | 0.17     | 0.941045606229 | 0.0              |
| 18.0 | 0.991101223582 | 0.21      | 0.988876529477 | 0.17     | 0.946607341491 | 0.0              |
| 19.0 | 0.992213570634 | 0.25      | 0.98553948832  | 0.17     | 0.929922135706 | 0.0              |
| mean | 0.9898         | 0.253     | 0.9875         | 0.1415   | 0.9340         | 0.0015           |
+------+----------------+-----------+----------------+----------+----------------+------------------+
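For reference, the SVC and linearSVC part of the protocol described above
could be reproduced roughly as follows. This is only a minimal sketch, not
the script that produced the table: it assumes a recent scikit-learn
release (ShuffleSplit imported from sklearn.model_selection), it times only
the fit call, and it leaves KPLS out because that code is not public yet.
The function name run_benchmark and the random seed are illustrative.

```python
import time

from sklearn.datasets import load_digits
from sklearn.model_selection import ShuffleSplit
from sklearn.svm import SVC, LinearSVC


def run_benchmark(n_splits=20, seed=0):
    """Score SVC and LinearSVC on random 50/50 splits of the digits data."""
    digits = load_digits()
    X, y = digits.data, digits.target

    # Random splits with half of the samples held out for testing.
    splitter = ShuffleSplit(n_splits=n_splits, test_size=0.5, random_state=seed)

    # Factories so that every split gets a freshly initialised model.
    # C is left at its default for both classifiers.
    models = {
        "SVC": lambda: SVC(kernel="rbf", gamma=0.001),
        "linearSVC": lambda: LinearSVC(),
    }

    results = {name: {"acc": [], "time": []} for name in models}
    for train_idx, test_idx in splitter.split(X):
        for name, make_model in models.items():
            clf = make_model()
            start = time.perf_counter()
            clf.fit(X[train_idx], y[train_idx])
            elapsed = time.perf_counter() - start  # assumption: fit time only
            results[name]["acc"].append(clf.score(X[test_idx], y[test_idx]))
            results[name]["time"].append(elapsed)
    return results


if __name__ == "__main__":
    for name, r in run_benchmark().items():
        mean_acc = sum(r["acc"]) / len(r["acc"])
        mean_time = sum(r["time"]) / len(r["time"])
        print(f"{name}: mean acc={mean_acc:.4f}, mean fit time={mean_time:.4f}s")
```

The splits, library version and hardware differ from the run reported in
the table, so the numbers from this sketch will not match it exactly.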
I am currently cleaning up the code to put it in a public gist. I will let
you know when it is there.
Regards,
Abdalrahman Eweiwi
On Mon, Dec 2, 2013 at 3:21 PM, Olivier Grisel <olivier.gri...@ensta.org> wrote:
> 2013/12/2 abdalrahman eweiwi <abdalrahman.ewe...@gmail.com>:
> > Hi,
> >
> > You are right. In fact, I spent almost a month reviewing the code base
> > of the PLS and CCA implementations in sklearn. In my opinion the (old)
> > code base should be refactored into a simpler shape; I remember having
> > some difficulty analyzing that code. Also, the CCA results from sklearn
> > were not right in a couple of applications I tested it with. Anyway, I
> > sat down and wrote my own code for PLS, CCA and KPLS, which I use
> > frequently in my applications, and it works fine. I think I should now
> > evaluate it on a couple of datasets, as Olivier has suggested, and show
> > you the results. If you have any advice on how to deal with the current
> > codebase in order to integrate my code, I would be glad to listen.
>
> Please feel free to send a link to your current implementation if it's
> already online (e.g. on http://gist.github.com ) so that Nelle and
> other interested developers can have a look at it to decide how to
> best fix / refactor / replace the existing codebase.
>
> Writing a benchmark script that compares the two implementations would
> be helpful, along with cases that highlight incorrect results from the
> sklearn implementation.
>
> If you do so, please make sure to run an updated master branch of sklearn.
>
> If you open issues to report bugs for the current implementation,
> please mention @NelleV in the description or in the comment so that
> she will receive a notification as AFAIK she is the dev who worked the
> most recently on this part of the code base.
>
> --
> Olivier
> http://twitter.com/ogrisel - http://github.com/ogrisel
>
>
> _______________________________________________
> Scikit-learn-general mailing list
> Scikit-learn-general@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
>