On 11/11/2011 04:38 PM, Olivier Grisel wrote:
> 2011/11/11 Gael Varoquaux <[email protected]>:
>> On Fri, Nov 11, 2011 at 04:11:46PM +0100, Andreas Müller wrote:
>>>> If you find that it does work/is useful on real problem, yes!
>>> I just started working on it. At the moment I can get 3% error on MNIST
>>> using sklearn's SGD.
>> Does sound good.
> Doesn't a grid-searched Gaussian SVM yield 2% or 1.5% test error on
> MNIST? If the CPU efficiency is much better than the SVM's, then it's
> still good to have, even if less accurate than a kernel SVM.
>
Yes, 1.2% if you're really picky, I think.
I did some grid search on MNIST but couldn't get near that with the
approximate kernel, even with up to 5000 random features. Maybe other
parameter settings or even more random features would yield better
results.
I think speed is one of the main selling points, along with scaling to
datasets on which exact kernel methods are infeasible.
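To make the setup concrete, here is a minimal sketch of the random-features
idea (Rahimi & Recht style) feeding sklearn's SGD. It is not the exact
experiment from this thread: I use the small digits dataset as a stand-in
for MNIST, and gamma / n_components are illustrative values, not tuned ones.

    import numpy as np
    from sklearn.datasets import load_digits
    from sklearn.linear_model import SGDClassifier
    from sklearn.model_selection import train_test_split

    rng = np.random.RandomState(0)
    X, y = load_digits(return_X_y=True)
    X = X / 16.0  # scale pixel values to [0, 1]
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    gamma, n_components = 0.02, 500  # illustrative hyperparameters

    # Sample w ~ N(0, 2 * gamma * I): the Fourier transform of the RBF kernel
    # exp(-gamma * ||x - y||^2) is a Gaussian with that variance.
    W = rng.normal(scale=np.sqrt(2 * gamma), size=(X.shape[1], n_components))
    b = rng.uniform(0, 2 * np.pi, size=n_components)

    def features(X):
        # z(x) = sqrt(2 / D) * cos(x W + b), so E[z(x) . z(y)] ~ k(x, y)
        return np.sqrt(2.0 / n_components) * np.cos(X @ W + b)

    # Linear SGD on the randomized features approximates the kernel machine.
    clf = SGDClassifier(max_iter=1000, tol=1e-3)
    clf.fit(features(X_train), y_train)
    print("test accuracy:", clf.score(features(X_test), y_test))

Cranking up n_components trades compute for a better kernel approximation,
which is exactly the knob I was grid-searching over above.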


>> I find that one of the values of the scikit, and in particular its
>> mailing list, is the empirical knowledge that comes from coding and trying
>> many methods. I am definitely excited about the random features methods,
>> as well as the Chi2 one from your colleagues.
> +1
>
Here is the paper describing the chi2 kernel:
http://sminchisescu.ins.uni-bonn.de/papers/lis_dagm10.pdf
It's pretty technical. There is Matlab code, though, which also implements
the RBF kernel:
http://sminchisescu.ins.uni-bonn.de/code/randfeat/
Before you try that, I suggest using mine, which I hope is more
readable.
I'll make a branch later on.
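To give a flavor in the meantime, here is a rough numpy sketch of the
skewed chi2 feature map as I read the paper; treat the details (sampling
constants, the skewedness parameter c) as my assumptions rather than the
final implementation. The trick is that the kernel is shift-invariant in
log(x + c) space, so the random projections are drawn from a hyperbolic
secant distribution.

    import numpy as np

    def skewed_chi2_features(X, n_components=500, c=1.0, random_state=0):
        """Random Fourier features for the skewed chi2 kernel.

        X must be nonnegative (e.g. histogram data).
        """
        rng = np.random.RandomState(random_state)
        n_features = X.shape[1]
        # Inverse-CDF sampling from the sech density, which is the
        # Fourier transform of the (log-space) kernel.
        uniform = rng.uniform(size=(n_features, n_components))
        W = (1.0 / np.pi) * np.log(np.tan(np.pi / 2.0 * uniform))
        b = rng.uniform(0, 2 * np.pi, size=n_components)
        projection = np.log(X + c) @ W + b
        return np.sqrt(2.0 / n_components) * np.cos(projection)

With that, the same linear SGD pipeline from above should work on
histogram-like data.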

Cheers,
Andy
