Re: [Scikit-learn-general] computing the sample weights

2012-09-12 Thread Christian Jauvin
Thanks, that's very helpful!

On 12 September 2012 11:47, Peter Prettenhofer wrote:
> 2012/9/12 Peter Prettenhofer:
>> [..]
>>
>> AFAIK Fabian has some scikit-learn code for that as well.
>
> here is the code https://gist.github.com/2071994
>
> --
> Peter Prettenhofer

Re: [Scikit-learn-general] computing the sample weights

2012-09-12 Thread Peter Prettenhofer
2012/9/12 Peter Prettenhofer:
> [..]
>
> AFAIK Fabian has some scikit-learn code for that as well.

Here is the code: https://gist.github.com/2071994

--
Peter Prettenhofer

Re: [Scikit-learn-general] computing the sample weights

2012-09-12 Thread Peter Prettenhofer
If the class imbalance is pretty severe and you have a binary classification problem, you might consider using a ranking SVM (see [1]). AFAIK Fabian has some scikit-learn code for that as well.

[1] http://www.eecs.tufts.edu/~dsculley/papers/adversarial-ads.pdf

2012/9/12 Christian Jauvin:
>> May

Re: [Scikit-learn-general] computing the sample weights

2012-09-12 Thread Olivier Grisel
2012/9/12 Christian Jauvin:
>> May I ask why you think you need this?
>
> It was my naive assumption of how to tackle class imbalance with an
> SGD classifier, but as Olivier already suggested, using class_weight
> makes more sense for this. Is there another mechanism or strategy that
> I should b

Re: [Scikit-learn-general] computing the sample weights

2012-09-12 Thread Christian Jauvin
> May I ask why you think you need this?

It was my naive assumption of how to tackle class imbalance with an SGD classifier, but as Olivier already suggested, using class_weight makes more sense for this. Is there another mechanism or strategy that you think I should be aware of?
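For readers following the class_weight suggestion, here is a minimal sketch of one common way to derive per-class weights that are inversely proportional to class frequency. The helper name `class_weight_dict` is hypothetical and is not taken from the thread. Passing the resulting dict as the `class_weight` option of a scikit-learn classifier is an assumption about how it would be used, and where that option is accepted may differ between scikit-learn versions.

```python
import numpy as np


def class_weight_dict(y):
    """Hypothetical helper: weight each class by n_samples / (n_classes * count)."""
    y = np.asarray(y).ravel()
    # Sorted unique labels and how often each one occurs.
    classes, counts = np.unique(y, return_counts=True)
    # Rare classes get large weights, frequent classes get small ones.
    weights = len(y) / (len(classes) * counts)
    # Convert NumPy scalars to plain Python values for use as dict keys.
    return {c.item(): float(w) for c, w in zip(classes, weights)}


# Nine negatives and one positive: the positive class is weighted 9x heavier.
y = np.array([0] * 9 + [1])
cw = class_weight_dict(y)
```

With this scheme every class contributes the same total weight to the loss, whatever its frequency.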

Re: [Scikit-learn-general] computing the sample weights

2012-09-12 Thread Lars Buitinck
2012/9/12 Christian Jauvin:
> As I only have an intuitive notion of how "sample_weight" (i.e. to be
> fed to certain types of classifier) should work, I'd like to know if
> this is a sound way of computing them:
>
> def get_sample_weight(y):
>     p = 1. / len(np.unique(y))
>     bc = np.bincount(

Re: [Scikit-learn-general] computing the sample weights

2012-09-12 Thread Olivier Grisel
2012/9/12 Christian Jauvin:
> Hi,
>
> As I only have an intuitive notion of how "sample_weight" (i.e. to be
> fed to certain types of classifier) should work, I'd like to know if
> this is a sound way of computing them:
>
> def get_sample_weight(y):
>     p = 1. / len(np.unique(y))
>     bc = np.b

[Scikit-learn-general] computing the sample weights

2012-09-12 Thread Christian Jauvin
Hi,

As I only have an intuitive notion of how "sample_weight" (i.e. to be fed to certain types of classifier) should work, I'd like to know if this is a sound way of computing them:

def get_sample_weight(y):
    p = 1. / len(np.unique(y))
    bc = np.bincount(y)
    w = np.repeat(p, len(y))
    f
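The function in the post is cut off in this archive view, so its remaining lines are unknown. As a rough illustration of the general idea only, the sketch below gives every class the same total weight (1 / n_classes) and splits it evenly among that class's samples. This is an assumed reading of the visible lines, not the poster's actual code, and the name `balanced_sample_weight` is hypothetical.

```python
import numpy as np


def balanced_sample_weight(y):
    """Hypothetical sketch: each class gets total weight 1 / n_classes."""
    y = np.asarray(y).ravel()
    # classes: sorted unique labels; inverse: position of each sample's class.
    classes, inverse = np.unique(y, return_inverse=True)
    # Number of samples per class, aligned with `classes`.
    counts = np.bincount(inverse)
    # Split each class's share (1 / n_classes) evenly among its samples.
    per_class = 1.0 / (len(classes) * counts)
    return per_class[inverse]


# Three samples of class 0 and one of class 1.
y = np.array([0, 0, 0, 1])
w = balanced_sample_weight(y)
```

Using `return_inverse` avoids a Python loop and also works for labels that are not small non-negative integers, which a direct `np.bincount(y)` would require.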