Re: [Scikit-learn-general] Caching dataset

2011-11-09 Thread Mathieu Blondel
I completely agree that the current API goes against the philosophy of SVMs (sparse solutions). When kernel="precomputed", we should accept both n_test x n_train and n_test x n_sv arrays in predict. Can you file a bug report? In the meantime, you can use an n_test x n_train matrix but compute only the entries that involve the support vectors ...
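
A minimal sketch of that workaround, assuming a toy linear kernel (my_kernel, a placeholder for the real, expensive one): only the support-vector columns of the n_test x n_train test matrix are filled in, since the remaining columns multiply dual coefficients that are zero.

    import numpy as np
    from sklearn import svm

    def my_kernel(x, y):                     # stand-in for an expensive custom kernel
        return np.dot(x, y)

    X_train = np.random.rand(20, 5)
    y_train = np.array([0] * 10 + [1] * 10)
    K_train = np.array([[my_kernel(a, b) for b in X_train] for a in X_train])
    clf = svm.SVC(kernel="precomputed").fit(K_train, y_train)

    # n_test x n_train matrix, but only the support-vector columns are computed
    X_test = np.random.rand(3, 5)
    K_test = np.zeros((len(X_test), len(X_train)))
    for j in clf.support_:                   # indices of the support vectors
        K_test[:, j] = [my_kernel(x, X_train[j]) for x in X_test]
    print(clf.predict(K_test))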

Re: [Scikit-learn-general] Caching dataset

2011-11-09 Thread Matt Henderson
Hi, That's true, in a way I may as well precompute them all, because I never know which ones are going to come up. It would make sense in general, though, not to require such a big calculation. Cheers, Matt On 9 November 2011 11:07, Andreas Müller wrote: > Hi Matt. > I had a similar setup ...

Re: [Scikit-learn-general] Caching dataset

2011-11-09 Thread Andreas Müller
Hi Matt. I had a similar setup to yours once. As my kernel was very slow, it helped a lot, though I precomputed all kernel values. I'm pretty sure the underlying libsvm supports providing only the kernel values at the support vectors. I'm not sure why this is not supported by sklearn at the moment ...

Re: [Scikit-learn-general] Caching dataset

2011-11-09 Thread Gael Varoquaux
On Wed, Nov 09, 2011 at 11:25:17AM +0100, Andreas Müller wrote: > Here is an example: > https://gist.github.com/1351047 > It seems that at the moment, you have to use the whole training set to > generate the kernel at test time. > Not sure why, maybe for ease of use. > Can anyone comment on that?

Re: [Scikit-learn-general] Caching dataset

2011-11-09 Thread Matt Henderson
Hi Andy, Thanks for the example. I actually started experimenting with defining my own Python kernel function, which caches its results so that repeated calls with the same input are fast. (Useful since I am training multiple classifiers on the same data and comparing different ...
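
A rough sketch of the kind of caching kernel callable described here; the linear kernel, the cached_kernel name, and the byte-string cache key are placeholder choices, not what was actually used.

    import numpy as np
    from sklearn import svm

    _gram_cache = {}

    def cached_kernel(X, Y):
        # key on the raw bytes of both arrays; a real implementation might
        # prefer hashing or object identity
        key = (X.tobytes(), Y.tobytes())
        if key not in _gram_cache:
            _gram_cache[key] = np.dot(X, Y.T)    # the expensive kernel goes here
        return _gram_cache[key]

    X = np.random.rand(30, 4)
    y = np.array([0] * 15 + [1] * 15)
    for C in (0.1, 1.0, 10.0):                   # several classifiers, one Gram matrix
        clf = svm.SVC(kernel=cached_kernel, C=C).fit(X, y)
        print(C, clf.score(X, y))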

Re: [Scikit-learn-general] Caching dataset

2011-11-09 Thread Andreas Müller
Hi Matt. Did you figure it out yet? Here is an example: https://gist.github.com/1351047 It seems that at the moment, you have to use the whole training set to generate the kernel at test time. Not sure why, maybe for ease of use. Can anyone comment on that? Cheers, Andy
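
A stripped-down illustration of the behaviour discussed here (the gist above is the original example), with a plain linear kernel as a stand-in: predict with kernel="precomputed" expects kernel values against the whole training set, and an n_test x n_sv matrix is rejected.

    import numpy as np
    from sklearn import svm

    X_train = np.random.rand(20, 5)
    y_train = np.array([0] * 10 + [1] * 10)
    X_test = np.random.rand(3, 5)

    K_train = np.dot(X_train, X_train.T)             # n_train x n_train
    clf = svm.SVC(kernel="precomputed").fit(K_train, y_train)

    K_full = np.dot(X_test, X_train.T)               # n_test x n_train: accepted
    print(clf.predict(K_full))

    K_sv = np.dot(X_test, X_train[clf.support_].T)   # n_test x n_sv: not accepted
    try:
        clf.predict(K_sv)
    except ValueError as err:
        print("n_test x n_sv matrix rejected:", err)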

Re: [Scikit-learn-general] Caching dataset

2011-11-08 Thread Andreas Mueller
On 11/08/2011 08:35 PM, Andreas Mueller wrote: On 11/08/2011 08:18 PM, Matt Henderson wrote: So if I have a custom kernel function: CustomKernel(x, y) - I should do something like: M = CustomKernel(model.support_vectors_, X) # if X is a matrix where the rows are test vectors results = model.predict_proba(M) ...

Re: [Scikit-learn-general] Caching dataset

2011-11-08 Thread Andreas Mueller
On 11/08/2011 08:18 PM, Matt Henderson wrote: So if I have a custom kernel function: CustomKernel(x, y) - I should do something like: M = CustomKernel(model.support_vectors_, X) # if X is a matrix where the rows are test vectors results = model.predict_proba(M) Actually, I haven't done this ...

Re: [Scikit-learn-general] Caching dataset

2011-11-08 Thread Matt Henderson
So if I have a custom kernel function: CustomKernel(x, y) - I should do something like: M = CustomKernel(model.support_vectors_, X) # if X is a matrix where the rows are test vectors results = model.predict_proba(M)? Is there an example of this somewhere? Thanks for your help, Matt On 8 November 2011 ...
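
A hedged sketch of the callable-kernel route being asked about here (CustomKernel is just a linear placeholder): when the kernel is passed as a callable rather than precomputed, scikit-learn evaluates it internally against the training data stored in the estimator, so predict_proba takes the raw test vectors and no M has to be built by hand.

    import numpy as np
    from sklearn import svm

    def CustomKernel(X, Y):                  # placeholder for the real custom kernel
        return np.dot(X, Y.T)

    X_train = np.random.rand(20, 5)
    y_train = np.array([0] * 10 + [1] * 10)
    X_test = np.random.rand(3, 5)

    model = svm.SVC(kernel=CustomKernel, probability=True).fit(X_train, y_train)
    print(model.predict_proba(X_test))       # no hand-built kernel matrix needed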

Re: [Scikit-learn-general] Caching dataset

2011-11-08 Thread Andreas Mueller
Hi Matt. For testing, you have to specify the kernel values between the support vectors and the test vectors. Cheers, Andy On 11/08/2011 06:50 PM, Matt Henderson wrote: Hi, I would just like a quick piece of clarification. Once you have trained an SVM with a precomputed kernel, what is the right way to predict from it? ...

[Scikit-learn-general] Caching dataset

2011-11-08 Thread Matt Henderson
Hi, I would just like a quick piece of clarification. Once you have trained an SVM with a precomputed kernel, what is the right way to predict from it? Thanks, Matt