Re: [Scikit-learn-general] Intersection kernel over-fitting problem

2012-09-20 Thread abdalrahman eweiwi
Hi, Just noticed that I accidentally sent half of my email!! Anyway, here is my question again: A couple of weeks ago I tested the intersection kernel in my experiments, and for that I wrote the following simple procedures: import numpy as np import cv2 def intersection_dist(v1,v2): return

Re: [Scikit-learn-general] Intersection kernel over-fitting problem

2012-09-20 Thread Olivier Grisel
2012/9/20 abdalrahman eweiwi : > Hi, > > Just noticed that I accidentally sent half of my email!! Anyway, here is > my question again: > > A couple of weeks ago I tested the intersection kernel in my experiments, and for > that I wrote the following simple procedures: > > import numpy as np > import c

Re: [Scikit-learn-general] Intersection kernel over-fitting problem

2012-09-20 Thread abdalrahman eweiwi
On Thu, Sep 20, 2012 at 10:54 AM, Olivier Grisel wrote: > 2012/9/20 abdalrahman eweiwi : > > Hi, > > > > Just noticed that I accidentally sent half of my email!! Anyway, here > is > > my question again: > > > > A couple of weeks ago I tested the intersection kernel in my experiments, > and for > > th

Re: [Scikit-learn-general] Intersection kernel over-fitting problem

2012-09-20 Thread Olivier Grisel
You can also debug by printing the kernel value of pairs of samples that you know are qualitatively related and pairs that you know are qualitatively unrelated. If your kernel is positive (it should be), pairs of related samples should always have a higher value than pairs of unrelated samples.
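A minimal sketch of that sanity check, assuming the intersection kernel is the sum of element-wise minima and using three made-up histograms (the arrays `a`, `b`, `c` and the function name `intersection_kernel` are hypothetical, not taken from the original post):

```python
import numpy as np


def intersection_kernel(v1, v2):
    """Histogram intersection: sum of the element-wise minima."""
    return np.minimum(v1, v2).sum()


# Hypothetical toy histograms, invented purely for illustration.
a = np.array([0.5, 0.3, 0.2, 0.0])
b = np.array([0.4, 0.4, 0.2, 0.0])  # qualitatively similar to a
c = np.array([0.0, 0.0, 0.1, 0.9])  # qualitatively different from a

related = intersection_kernel(a, b)
unrelated = intersection_kernel(a, c)

# The related pair should score higher than the unrelated pair.
print("related pair:  ", related)
print("unrelated pair:", unrelated)
```

If the unrelated pair scores as high as the related one on real data, the kernel function or the feature extraction is the first place to look.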

Re: [Scikit-learn-general] Intersection kernel over-fitting problem

2012-09-20 Thread Andreas Mueller
return np.min(np.vstack((v[None,:],v2[None,:])),axis=0).sum() I'm not sure what the None does here, I would write np.minimum(v, v2).sum() which is the correct intersection kernel I think. What are the features you use? The features have to be positive for this to work. There should be
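As a rough illustration of the np.minimum formulation, the sketch below builds a full Gram matrix and rejects negative features. The helper name `intersection_gram` and the toy matrix `X` are assumptions made for this example; the thread itself only shows the per-pair expression.

```python
import numpy as np


def intersection_gram(X, Y):
    """Pairwise intersection kernel between the rows of X and the rows of Y."""
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    # The kernel is only meaningful for non-negative features (e.g. histograms).
    if (X < 0).any() or (Y < 0).any():
        raise ValueError("intersection kernel expects non-negative features")
    # Broadcast (n, 1, d) against (1, m, d), then sum the minima over features.
    return np.minimum(X[:, np.newaxis, :], Y[np.newaxis, :, :]).sum(axis=2)


# Hypothetical toy data: two samples with three non-negative features each.
X = np.array([[1.0, 2.0, 0.0],
              [0.0, 1.0, 3.0]])
K = intersection_gram(X, X)
print(K)
```

A matrix like `K` is the kind of input an SVM with a precomputed kernel expects. Note that the broadcast allocates an (n, m, d) intermediate array, so this form is only practical for small problems.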

Re: [Scikit-learn-general] Intersection kernel over-fitting problem

2012-09-20 Thread Gael Varoquaux
On Thu, Sep 20, 2012 at 12:58:56PM +0100, Andreas Mueller wrote: > return np.min(np.vstack((v[None,:],v2[None,:])),axis=0).sum() > I'm not sure what the None does here, I would write None creates a new axis. For readability, one should always write v[np.newaxis, :], which is equivalent.
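To make the point about None concrete, here is a small sketch with two made-up vectors (`v1` and `v2` are hypothetical values). It shows that indexing with np.newaxis adds a leading axis, and that the stacked version and the np.minimum version compute the same number:

```python
import numpy as np

# Hypothetical example vectors.
v1 = np.array([1.0, 4.0, 2.0])
v2 = np.array([3.0, 0.5, 2.0])

# np.newaxis is simply an alias for None.
same_object = np.newaxis is None

# Indexing with np.newaxis turns a (3,) vector into a (1, 3) row.
row = v1[np.newaxis, :]

# Original formulation: stack the two rows, take the column-wise min, sum.
stacked = np.vstack((v1[np.newaxis, :], v2[np.newaxis, :]))
via_stack = np.min(stacked, axis=0).sum()

# Shorter formulation: element-wise minimum, then sum.
via_minimum = np.minimum(v1, v2).sum()

print(row.shape, via_stack, via_minimum)
```

Both forms give the same value, so the np.minimum version is a readability improvement and not a behavioural change.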

Re: [Scikit-learn-general] Intersection kernel over-fitting problem

2012-09-20 Thread Andreas Mueller
On 09/20/2012 01:01 PM, Gael Varoquaux wrote: > On Thu, Sep 20, 2012 at 12:58:56PM +0100, Andreas Mueller wrote: >> return np.min(np.vstack((v[None,:],v2[None,:])),axis=0).sum() >> I'm not sure what the None does here, I would write > None creates a new axis. For readability, one should al

Re: [Scikit-learn-general] Intersection kernel over-fitting problem

2012-09-20 Thread Andreas Mueller
On 09/20/2012 01:01 PM, Gael Varoquaux wrote: > On Thu, Sep 20, 2012 at 12:58:56PM +0100, Andreas Mueller wrote: >> return np.min(np.vstack((v[None,:],v2[None,:])),axis=0).sum() >> I'm not sure what the None does here, I would write > None creates a new axis. For readability, one should al

Re: [Scikit-learn-general] Intersection kernel over-fitting problem

2012-09-20 Thread abdalrahman eweiwi
On Thu, Sep 20, 2012 at 2:08 PM, Andreas Mueller wrote: > On 09/20/2012 01:01 PM, Gael Varoquaux wrote: > > On Thu, Sep 20, 2012 at 12:58:56PM +0100, Andreas Mueller wrote: > >> return np.min(np.vstack((v[None,:],v2[None,:])),axis=0).sum() > >> I'm not sure what the None does here, I wou

Re: [Scikit-learn-general] Intersection kernel over-fitting problem

2012-09-20 Thread abdalrahman eweiwi
On Thu, Sep 20, 2012 at 1:58 PM, Andreas Mueller wrote: > > return np.min(np.vstack((v[None,:],v2[None,:])),axis=0).sum() > > I'm not sure what the None does here, I would write > np.minimum(v, v2).sum() > which is the correct intersection kernel I think. > I know, sometimes I make thi

Re: [Scikit-learn-general] Intersection kernel over-fitting problem

2012-09-20 Thread Olivier Grisel
2012/9/20 abdalrahman eweiwi : > > > On Thu, Sep 20, 2012 at 1:58 PM, Andreas Mueller > wrote: >> >> >> >> return np.min(np.vstack((v[None,:],v2[None,:])),axis=0).sum() >> >> I'm not sure what the None does here, I would write >> np.minimum(v, v2).sum() >> which is the correct intersection kernel

Re: [Scikit-learn-general] Intersection kernel over-fitting problem

2012-09-20 Thread Andreas Mueller
>>> There should be no other parameters to tune here. >>> >>> Btw, you can try kernel_approximation.AdditiveChi2Sampler (if your data is >>> positive). > Note that this is not meant to be used as an exact kernel in a kernel > SVM but as an explicit preprocessing transformer for a linear kernel. > T
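A hedged sketch of that usage pattern: AdditiveChi2Sampler maps non-negative features to an explicit approximate feature space, and a linear model is then trained on the mapped data. The random data, the labels and the choice of LinearSVC are assumptions made for illustration only; the thread does not say which linear model or data were used.

```python
import numpy as np
from sklearn.kernel_approximation import AdditiveChi2Sampler
from sklearn.svm import LinearSVC

# Hypothetical non-negative data standing in for histogram features.
rng = np.random.RandomState(0)
X = rng.rand(40, 5)
y = (X[:, 0] > X[:, 1]).astype(int)  # made-up labels for the example

# Explicit feature map approximating the additive chi-squared kernel.
# With sample_steps=2, each input feature becomes 3 output features.
sampler = AdditiveChi2Sampler(sample_steps=2)
X_mapped = sampler.fit_transform(X)

# A linear classifier on the mapped features plays the role of the kernel SVM.
clf = LinearSVC(random_state=0)
clf.fit(X_mapped, y)
predictions = clf.predict(X_mapped)

print(X.shape, "->", X_mapped.shape)
```

The transformer is a preprocessing step, so it would be applied to both training and test data before the linear model sees them.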

[Scikit-learn-general] Thanks!

2012-09-20 Thread Vivek Sharma
Hello, I was asking Olivier about CRF in sklearn and I ended up discussing some of my Kaggle experience. I am forwarding my email to this list on his suggestion. Thanks to the sklearn team (especially the text classification module authors) for helping me win a competition at Kaggle (it was a pri

Re: [Scikit-learn-general] Thanks!

2012-09-20 Thread Andreas Mueller
Hey Vivek. Thanks for sharing your success with sklearn. Out of curiosity: CRF learning for loopy graphs or non-loopy graphs or chains? On 09/20/2012 05:17 PM, Vivek Sharma wrote: > > > Features that I wish were available: > - openmp/multiprocessing mode for libsvm > (http://www.csie.ntu.edu.t

Re: [Scikit-learn-general] Thanks!

2012-09-20 Thread Vivek Sharma
Hi Andreas, Thanks for your comments. For CRF, linear chain... mainly for NLP tasks.. I will probably use the Stanford implementation, but I haven't tried it yet. I agree that sparse RF/GBM may not be all that useful, because if the problem is too sparse, it's probably better to use something els

Re: [Scikit-learn-general] Thanks!

2012-09-20 Thread Peter Prettenhofer
2012/9/20 Vivek Sharma : > Hi Andreas, > > Thanks for your comments. > > For CRF, linear chain... mainly for NLP tasks.. I will probably use the > Stanford implementation, but I haven't tried it yet. > > I agree that sparse RF/GBM may not be all that useful, because if the > problem is too sparse,

Re: [Scikit-learn-general] Thanks!

2012-09-20 Thread Vivek Sharma
Hah, I saw the same paper and when I saw it was an option in sklearn, I thought it must be a good idea. I couldn't make it work though - it worked under some conditions, but not in others. I don't remember the exact issue. On Thu, Sep 20, 2012 at 2:05 PM, Peter Prettenhofer < peter.prettenho...@gm

[Scikit-learn-general] Fwd: Build failed in Jenkins (bug in test_common.py?)

2012-09-20 Thread Lars Buitinck
Below are some excerpts from the "Build failed" message that I got after git rm'ing the sparse linear models code. The strange thing is that it seems to start rebuilding in the middle of the tests. The same thing happened when I tried nosetests sklearn/tests/ on my own box, which also produced a lo

Re: [Scikit-learn-general] Pycon 2013 tutorial

2012-09-20 Thread Fernando Perez
On Mon, Sep 17, 2012 at 9:05 AM, Jacob VanderPlas wrote: > Our plan is to propose two tutorials: I'll submit an intro to machine > learning with scikit-learn, based on the material I covered at Scipy > 2012 (http://astroML.github.com/sklearn_tutorial). Olivier will submit > a more advanced tutori

Re: [Scikit-learn-general] Pycon 2013 tutorial

2012-09-20 Thread Fernando Perez
ps - to match things, we'll mention in ours that it would be good to schedule it as early as possible, so that any other tutorials on sklearn, pandas, numpy, etc, can come later and attendees who take ours can benefit from the combination. The pycon guys actually care about that stuff, so I'm sure

Re: [Scikit-learn-general] Thanks!

2012-09-20 Thread Gael Varoquaux
Hi, On Thu, Sep 20, 2012 at 12:17:07PM -0400, Vivek Sharma wrote: > Thanks to the sklearn team (especially the text classification module > authors) for helping me win a competition at Kaggle (it was a private > competition, organized in April). Well, thanks for saying thanks! :) It is a natural