On 11/11/2011 04:45 PM, Andreas Müller wrote:
> On 11/11/2011 04:38 PM, Olivier Grisel wrote:
>> 2011/11/11 Gael Varoquaux <[email protected]>:
>>> On Fri, Nov 11, 2011 at 04:11:46PM +0100, Andreas Müller wrote:
>>>>> If you find that it does work/is useful on real problems, yes!
>>>> I just started working on it. At the moment I can get 3% error on
>>>> MNIST using sklearn's SGD.
>>> Does sound good.
>> Doesn't a grid-searched Gaussian SVM yield 2% or 1.5% test error on
>> MNIST? If the CPU efficiency is much better than the SVM's, then it's
>> still good to have, even if it's less accurate than a kernel SVM.
>>
> Yes, 1.2% if you're really picky, I think.
> I did some grid search on MNIST but couldn't get near that with the
> approximate kernel. I went up to 5000 random features. Maybe other
> parameters or more random features yield better results.
> I think speed is one of the main selling points, as well as handling
> datasets on which kernel methods are infeasible.
>
>>> I find that one of the values of the scikit, and in particular its
>>> mailing list, is the empirical knowledge that comes from coding and
>>> trying many methods. I am definitely excited about the random
>>> features method, as well as the chi2 one from your colleagues.
>> +1
>>
> Here is the paper describing the chi2 kernel:
> http://sminchisescu.ins.uni-bonn.de/papers/lis_dagm10.pdf
> It's pretty technical. There is Matlab code, though, that also
> implements the RBF kernel:
> http://sminchisescu.ins.uni-bonn.de/code/randfeat/
> Before you try that, I suggest you use mine, which I hope is more
> readable. I'll make a branch later on.
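For anyone who wants to try the approximate kernel in the meantime, the
basic recipe looks roughly like this (a sketch only: RBFSampler is the
transformer name from my pull request and may still change, the digits
dataset stands in for MNIST, and gamma / n_components are illustrative,
not grid-searched values):

# Random Fourier features (Rahimi & Recht) feeding a linear SGD
# classifier. Sketch: names and parameters as described above.
from sklearn.datasets import load_digits
from sklearn.kernel_approximation import RBFSampler
from sklearn.linear_model import SGDClassifier
from sklearn.pipeline import Pipeline

digits = load_digits()
X = digits.data / 16.0          # scale pixel values to [0, 1]
y = digits.target
X_train, X_test = X[:1000], X[1000:]
y_train, y_test = y[:1000], y[1000:]

clf = Pipeline([
    # map inputs to a randomized feature space approximating the RBF kernel
    ("features", RBFSampler(gamma=0.2, n_components=1000, random_state=0)),
    # then train a plain linear classifier on the mapped features
    ("sgd", SGDClassifier(random_state=0)),
])
clf.fit(X_train, y_train)
print("test accuracy:", clf.score(X_test, y_test))

The point of the pipeline is that training is linear in the number of
samples, so it keeps working on datasets where the exact kernel SVM
becomes infeasible.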
If you're still interested in MNIST results: using gamma=0.03 and C=1, I
get 0.9845 accuracy with SVC in 12 minutes (20GB kernel cache; I don't
know how much of it was actually used). With the same parameters on
LinearSVC and 5000 sampled features, I get 0.9783 in ~3 minutes. Going up
to 20000 sampled features gives 0.9822 in ~10 minutes, though that took
~20GB of RAM. I guess using SGD instead of LinearSVC could speed things
up further.

I also included a small example on the digits dataset in my pull request.

All in all, I think this is fun to play with and has potentially broad
applicability.

Cheers,
Andy
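P.S. In code, the comparison above looks roughly like the following
(a sketch: the fetch_openml loader and the standard 60000/10000 split are
my assumptions here, runtimes and memory use depend on the machine, and
gamma, C, and n_components match the numbers quoted above):

# Sketch of the comparison: exact-kernel SVC vs. LinearSVC on sampled
# features. Loader and split are assumptions, parameters as above.
from sklearn.datasets import fetch_openml
from sklearn.kernel_approximation import RBFSampler
from sklearn.pipeline import Pipeline
from sklearn.svm import SVC, LinearSVC

X, y = fetch_openml("mnist_784", version=1, return_X_y=True, as_frame=False)
X = X / 255.0                   # scale pixel values to [0, 1]
X_train, X_test = X[:60000], X[60000:]
y_train, y_test = y[:60000], y[60000:]

# Exact RBF kernel: slow but most accurate (cache_size is in MB).
svc = SVC(kernel="rbf", gamma=0.03, C=1, cache_size=20000)
svc.fit(X_train, y_train)
print("SVC:", svc.score(X_test, y_test))

# Approximate kernel: 5000 sampled features + a linear SVM.
approx = Pipeline([
    ("features", RBFSampler(gamma=0.03, n_components=5000, random_state=0)),
    ("linsvm", LinearSVC(C=1)),
])
approx.fit(X_train, y_train)
print("LinearSVC + sampled features:", approx.score(X_test, y_test))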
