On 11/11/2011 04:45 PM, Andreas Müller wrote:
> On 11/11/2011 04:38 PM, Olivier Grisel wrote:
>> 2011/11/11 Gael Varoquaux <[email protected]>:
>>> On Fri, Nov 11, 2011 at 04:11:46PM +0100, Andreas Müller wrote:
>>>>> If you find that it does work/is useful on real problems, yes!
>>>> I just started working on it. At the moment I can get 3% error on
>>>> MNIST using sklearn's SGD.
>>> Does sound good.
>> Doesn't a grid-searched Gaussian SVM yield 2% or 1.5% test error on
>> MNIST? If the CPU efficiency is much better than the SVM's, then it's
>> still good to have, even if it's less accurate than a kernel SVM.
>>
> Yes, 1.2% if you're really picky, I think.
> I did some grid search on MNIST but couldn't get near that with the
> approximate kernel. I went up to 5000 random features. Maybe other
> parameters or more random features yield better results.
> I think speed is one of the main selling points, as well as handling
> datasets on which kernel methods are infeasible.
>
>>> I find that one of the values of the scikit, and in particular its
>>> mailing list, is the empirical knowledge that comes from coding and
>>> trying many methods. I am definitely excited about the random
>>> features method, as well as the chi2 one from your colleagues.
>> +1
>>
> Here is the paper describing the chi2 kernel:
> http://sminchisescu.ins.uni-bonn.de/papers/lis_dagm10.pdf
> It's pretty technical. There is Matlab code, though, that also
> implements the RBF kernel:
> http://sminchisescu.ins.uni-bonn.de/code/randfeat/
> Before you try that, I suggest you use mine, which I hope is more
> readable. I'll make a branch later on.
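For anyone who wants to try the approximate kernel in the meantime, the
basic recipe looks roughly like this (a sketch only: RBFSampler is the
transformer name from my pull request and may still change, the digits
dataset stands in for MNIST, and gamma / n_components are illustrative,
not grid-searched values):

# Random Fourier features (Rahimi & Recht) feeding a linear SGD
# classifier. Sketch: names and parameters as described above.
from sklearn.datasets import load_digits
from sklearn.kernel_approximation import RBFSampler
from sklearn.linear_model import SGDClassifier
from sklearn.pipeline import Pipeline

digits = load_digits()
X = digits.data / 16.0          # scale pixel values to [0, 1]
y = digits.target
X_train, X_test = X[:1000], X[1000:]
y_train, y_test = y[:1000], y[1000:]

clf = Pipeline([
    # map inputs to a randomized feature space approximating the RBF kernel
    ("features", RBFSampler(gamma=0.2, n_components=1000, random_state=0)),
    # then train a plain linear classifier on the mapped features
    ("sgd", SGDClassifier(random_state=0)),
])
clf.fit(X_train, y_train)
print("test accuracy:", clf.score(X_test, y_test))

The point of the pipeline is that training is linear in the number of
samples, so it keeps working on datasets where the exact kernel SVM
becomes infeasible.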
If you're still interested in MNIST results: using gamma=0.03 and C=1, I
get 0.9845 accuracy with SVC in 12 minutes (20GB kernel cache; I don't
know how much of it was actually used). With the same parameters on
LinearSVC and 5000 sampled features, I get 0.9783 in ~3 minutes. Going up
to 20000 sampled features gives 0.9822 in ~10 minutes, though that took
~20GB of RAM. I guess using SGD instead of LinearSVC could speed things
up further.

I also included a small example on the digits dataset in my pull request.

All in all, I think this is fun to play with and has potentially broad
applicability.

Cheers,
Andy
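P.S. In code, the comparison above looks roughly like the following
(a sketch: the fetch_openml loader and the standard 60000/10000 split are
my assumptions here, runtimes and memory use depend on the machine, and
gamma, C, and n_components match the numbers quoted above):

# Sketch of the comparison: exact-kernel SVC vs. LinearSVC on sampled
# features. Loader and split are assumptions, parameters as above.
from sklearn.datasets import fetch_openml
from sklearn.kernel_approximation import RBFSampler
from sklearn.pipeline import Pipeline
from sklearn.svm import SVC, LinearSVC

X, y = fetch_openml("mnist_784", version=1, return_X_y=True, as_frame=False)
X = X / 255.0                   # scale pixel values to [0, 1]
X_train, X_test = X[:60000], X[60000:]
y_train, y_test = y[:60000], y[60000:]

# Exact RBF kernel: slow but most accurate (cache_size is in MB).
svc = SVC(kernel="rbf", gamma=0.03, C=1, cache_size=20000)
svc.fit(X_train, y_train)
print("SVC:", svc.score(X_test, y_test))

# Approximate kernel: 5000 sampled features + a linear SVM.
approx = Pipeline([
    ("features", RBFSampler(gamma=0.03, n_components=5000, random_state=0)),
    ("linsvm", LinearSVC(C=1)),
])
approx.fit(X_train, y_train)
print("LinearSVC + sampled features:", approx.score(X_test, y_test))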
