Thanks, that's very helpful!
On 12 September 2012 11:47, Peter Prettenhofer wrote:
> 2012/9/12 Peter Prettenhofer :
>> [..]
>>
>> AFAIK Fabian has some scikit-learn code for that as well.
>
> here is the code https://gist.github.com/2071994
>
>
> --
> Peter Prettenhofer
>
2012/9/12 Peter Prettenhofer :
> [..]
>
> AFAIK Fabian has some scikit-learn code for that as well.
here is the code https://gist.github.com/2071994
--
Peter Prettenhofer
If the class imbalance is pretty severe and you have a binary
classification problem you might consider using a ranking svm (see
[1]).
AFAIK Fabian has some scikit-learn code for that as well.
[1] http://www.eecs.tufts.edu/~dsculley/papers/adversarial-ads.pdf
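A ranking SVM with a linear model is often trained through a pairwise transform: every pair of samples with different labels becomes one difference vector, and an ordinary linear classifier is fit on those differences. The sketch below only illustrates that idea. It is not the code in the gist and not necessarily the method of [1]; the function name `pairwise_transform` and the toy data are made up for the example.

```python
import itertools

import numpy as np


def pairwise_transform(X, y):
    """Build difference vectors for a linear pairwise ranker (hypothetical helper).

    For every pair (i, j) with different labels, emit X[i] - X[j] with
    target +1 if y[i] > y[j], else -1. Every other pair is flipped so that
    both target signs appear even when one class is very rare.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    diffs, signs = [], []
    for i, j in itertools.combinations(range(len(y)), 2):
        if y[i] == y[j]:
            continue  # same-class pairs carry no ordering information
        d = X[i] - X[j]
        s = 1 if y[i] > y[j] else -1
        if len(signs) % 2 == 1:
            d, s = -d, -s  # flip every other pair to balance the targets
        diffs.append(d)
        signs.append(s)
    return np.array(diffs), np.array(signs)


# Toy data (assumed): three negatives, one positive.
X = np.array([[0.0, 1.0], [1.0, 1.0], [2.0, 0.0], [3.0, 2.0]])
y = np.array([0, 0, 0, 1])
X_pairs, y_pairs = pairwise_transform(X, y)
```

The transformed problem has one row per (positive, negative) pair, so its size grows with the product of the two class counts. With a severe imbalance that product stays manageable, which is part of why the approach is attractive in that setting.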
2012/9/12 Christian Jauvin :
>> May I ask why you think you need this?
>
> It was my naive assumption of how to tackle class imbalance with an
> SGD classifier, but as Olivier already suggested, using class_weight
> makes more sense for this. Is there another mechanism or strategy that
> I should b
> May I ask why you think you need this?
It was my naive assumption of how to tackle class imbalance with an
SGD classifier, but as Olivier already suggested, using class_weight
makes more sense for this. Is there another mechanism or strategy that
I should be aware of, do you think?
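For reference, here is a rough sketch of what inverse-frequency class weights look like. It uses the heuristic that current scikit-learn releases document for `class_weight="balanced"`, namely `n_samples / (n_classes * np.bincount(y))`. The helper name `balanced_class_weight` and the toy labels are assumptions made for the example.

```python
import numpy as np


def balanced_class_weight(y):
    """Per-class weights n_samples / (n_classes * count), keyed by label.

    Hypothetical helper: rarer classes get proportionally larger weights.
    """
    y = np.asarray(y)
    classes, counts = np.unique(y, return_counts=True)
    weights = len(y) / (len(classes) * counts)
    return dict(zip(classes.tolist(), weights.tolist()))


# Toy labels (assumed): 8 negatives, 2 positives.
y = np.array([0, 0, 0, 0, 0, 0, 0, 0, 1, 1])
cw = balanced_class_weight(y)
```

A dict like `cw` is the form that the `class_weight` parameter of `SGDClassifier` accepts, so it could be passed as `SGDClassifier(class_weight=cw)`.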
2012/9/12 Christian Jauvin :
> Hi,
>
> As I only have an intuitive notion of how "sample_weight" (i.e. to be
> fed to certain types of classifier) should work, I'd like to know if
> this is a sound way of computing them:
>
> def get_sample_weight(y):
>     p = 1. / len(np.unique(y))
>     bc = np.b
Hi,
As I only have an intuitive notion of how "sample_weight" (i.e. to be
fed to certain types of classifier) should work, I'd like to know if
this is a sound way of computing them:
def get_sample_weight(y):
    p = 1. / len(np.unique(y))
    bc = np.bincount(y)
    w = np.repeat(p, len(y))
f
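The function above is cut off in the archive, so its remaining lines are unknown. Judging only from the visible lines (an equal share `p` per class and the per-class counts `bc`), one plausible reading is that each class receives the same total weight, split evenly among its samples. The sketch below is a guess at that intent, not the author's code, and it assumes `y` holds non-negative integer labels, which `np.bincount` requires.

```python
import numpy as np


def get_sample_weight(y):
    """Give each class the same total weight, split evenly among its samples.

    A guess at the intent of the truncated function, not the original code.
    Assumes y holds non-negative integer labels (needed by np.bincount).
    """
    y = np.asarray(y)
    p = 1.0 / len(np.unique(y))  # total weight budget per class
    bc = np.bincount(y)          # number of samples per label value
    return p / bc[y]             # each sample's share of its class budget


# Toy labels (assumed): three samples of class 0, one of class 1.
y = np.array([0, 0, 0, 1])
w = get_sample_weight(y)
```

With this reading the weights sum to 1 overall and every class contributes the same total, so a single minority sample counts as much as the whole majority class put together. Whether that is too aggressive depends on the data.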