2012/9/12 Christian Jauvin <[email protected]>:
>> May I ask why you think you need this?
>
> It was my naive assumption of how to tackle class imbalance with an
> SGD classifier, but as Olivier already suggested, using class_weight
> makes more sense for this. Is there another mechanism or strategy that
> I should be aware of you think?

For SGD you can subsample the over-represented classes; this also gives
you a speed benefit, since you no longer train on all the data. It would
be great to have that option built into the SGD models. You can also
oversample the under-represented classes, though without the speed
benefit of undersampling.

-- 
Olivier
http://twitter.com/ogrisel - http://github.com/ogrisel

_______________________________________________
Scikit-learn-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
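To make the subsampling suggestion above concrete, here is a minimal sketch of random undersampling. It is not an existing scikit-learn option: the helper name `undersample_indices`, its `seed` argument, and the toy labels are all hypothetical, and it assumes every class should be cut down to the size of the rarest one.

```python
import random
from collections import defaultdict


def undersample_indices(labels, seed=0):
    """Return sorted indices keeping an equal number of samples per class.

    Hypothetical helper: every class is randomly cut down to the size
    of the rarest class in `labels`.
    """
    rng = random.Random(seed)

    # Group sample indices by their class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    # The rarest class decides how many samples each class keeps.
    n_keep = min(len(idxs) for idxs in by_class.values())

    kept = []
    for idxs in by_class.values():
        # Sample without replacement inside each class.
        kept.extend(rng.sample(idxs, n_keep))
    return sorted(kept)


# Toy labels (made up): 8 samples of class 0, 2 samples of class 1.
y = [0, 0, 0, 0, 1, 0, 0, 1, 0, 0]
keep = undersample_indices(y)
y_balanced = [y[i] for i in keep]
```

With NumPy arrays, the returned indices could select rows as `X[keep]` and `y[keep]` before fitting an SGD model, which is where the speed benefit would come from. Oversampling would instead draw from the rare classes with replacement, and the `class_weight` option mentioned earlier in the thread reweights classes without dropping or duplicating any samples.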
