On 11/11/2011 04:38 PM, Olivier Grisel wrote:
> 2011/11/11 Gael Varoquaux <[email protected]>:
>> On Fri, Nov 11, 2011 at 04:11:46PM +0100, Andreas Müller wrote:
>>>> If you find that it does work/is useful on a real problem, yes!
>>> I just started working on it. At the moment I can get 3% error on
>>> MNIST using sklearn's SGD.
>> Does sound good.
> Doesn't a grid-searched Gaussian SVM yield 2% or 1.5% test error on
> MNIST? If the CPU efficiency is much better than SVM then it's still
> good to have, even if less accurate than kernel SVM.

Yes, 1.2% if you're really picky, I think. I did some grid search on
MNIST but couldn't get near that with the approximate kernel; I went up
to 5000 random features. Maybe other parameters or more random features
would yield better results. I think speed is one of the main selling
points, as well as handling datasets on which exact kernel methods are
infeasible.
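For reference, here is a minimal sketch of the pipeline I am running
(the RBFSampler name is from my work-in-progress branch; I use the
small digits dataset as a stand-in for MNIST, and the gamma /
n_components values are illustrative, not the grid-searched ones):

    import numpy as np
    from sklearn.datasets import load_digits
    from sklearn.kernel_approximation import RBFSampler
    from sklearn.linear_model import SGDClassifier

    digits = load_digits()
    X, y = digits.data / 16.0, digits.target  # scale pixels to [0, 1]

    # Explicit Monte Carlo feature map approximating the RBF kernel.
    rbf_features = RBFSampler(gamma=0.2, n_components=500, random_state=0)
    X_feat = rbf_features.fit_transform(X)

    # Linear SGD classifier on the mapped features; training cost is
    # linear in n_samples, unlike a kernelized SVM.
    clf = SGDClassifier()
    clf.fit(X_feat[:1000], y[:1000])
    print("test accuracy: %.3f" % clf.score(X_feat[1000:], y[1000:]))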
>> I find that one of the values of the scikit, and in particular its
>> mailing list, is the empirical knowledge that comes from coding and
>> trying many methods. I am definitely excited about the random
>> features methods, as well as the chi2 one of your colleagues.
> +1

Here is the paper describing the chi2 kernel:
http://sminchisescu.ins.uni-bonn.de/papers/lis_dagm10.pdf
It's pretty technical. There is Matlab code, though, which also
implements the RBF kernel:
http://sminchisescu.ins.uni-bonn.de/code/randfeat/
Before you try that, I suggest using mine, which I hope is more
readable. I'll make a branch later on.

Cheers,
Andy
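P.S. For the skewed chi2 features from the paper, the usage in my
branch should look roughly like this (a sketch only: the class name
SkewedChi2Sampler and its parameters are placeholders for whatever I
end up committing, and the data is fake histogram-like input just to
show the shape of the API):

    import numpy as np
    from sklearn.kernel_approximation import SkewedChi2Sampler
    from sklearn.linear_model import SGDClassifier

    # Fake non-negative histogram-style data, standing in for e.g.
    # bag-of-visual-words features.
    rng = np.random.RandomState(0)
    X = rng.uniform(size=(200, 50))
    y = rng.randint(2, size=200)

    # Monte Carlo feature map for the skewed chi2 kernel
    # (Li, Ionescu & Sminchisescu, DAGM 2010).
    chi2_features = SkewedChi2Sampler(skewedness=0.5, n_components=300,
                                      random_state=0)
    X_feat = chi2_features.fit_transform(X)

    # A linear classifier on the mapped features approximates a
    # chi2-kernel SVM at a fraction of the cost.
    clf = SGDClassifier()
    clf.fit(X_feat, y)
    print("train accuracy: %.3f" % clf.score(X_feat, y))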
