I will first assume that RandomOverSampling refers to the imbalanced-learn API (a scikit-learn-contrib project).
The parameter you are looking for is the ratio parameter. By default, ratio='auto', which balances the classes as you described. The ratio can instead be given as a float, defined as the number of samples in the minority class over the number of samples in the majority class. Check there for more info:
http://contrib.scikit-learn.org/imbalanced-learn/generated/imblearn.over_sampling.RandomOverSampler.html#imblearn.over_sampling.RandomOverSampler

A short sketch of how this can be used is at the end of this message.

On 10 January 2017 at 18:36, Suranga Kasthurirathne <suranga...@gmail.com> wrote:
>
> Hi all,
>
> I apologize - I've been looking for this answer all over the internet, and
> it could be that I'm not googling the right terms.
>
> For managing unbalanced datasets, Weka has SMOTE, and scikit has
> RandomOverSampling.
>
> In Weka, we can ask it to boost by a given percentage (say 100%), so an
> undersampled class with 10 values ends up with 20 values (a 100% increase)
> after boosting.
>
> In scikit-learn, I can't seem to find a way to do this. The
> RandomOverSampler boosts arbitrarily and seems to try to balance the two
> classes, which may not be realistic in some cases.
>
> Can anyone point me to how I can manage the boosting percentage using scikit?
>
> --
> Best Regards,
> Suranga
>
> _______________________________________________
> scikit-learn mailing list
> scikit-learn@python.org
> https://mail.python.org/mailman/listinfo/scikit-learn
>

--
Guillaume Lemaitre
INRIA Saclay - Ile-de-France
Equipe PARIETAL
guillaume.lemaitre@inria.fr --- https://glemaitre.github.io/
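As promised above, here is a minimal sketch (not from the original exchange) of how a float ratio could be used to control the amount of oversampling; the toy data is made up for illustration, and note that in later imbalanced-learn releases `ratio` was renamed `sampling_strategy` and `fit_sample` became `fit_resample`:

    # Sketch only: oversample the minority class to half the size of the
    # majority class instead of fully balancing (default ratio='auto').
    import numpy as np
    from imblearn.over_sampling import RandomOverSampler

    # Toy imbalanced data: 100 majority samples (class 0), 10 minority (class 1).
    rng = np.random.RandomState(42)
    X = rng.randn(110, 2)
    y = np.array([0] * 100 + [1] * 10)

    # ratio=0.5 requests n_minority / n_majority = 0.5 after resampling,
    # i.e. the minority class is grown to 50 samples rather than 100.
    ros = RandomOverSampler(ratio=0.5, random_state=42)
    X_res, y_res = ros.fit_sample(X, y)

    print(np.bincount(y_res))  # expected: [100  50]

So in your example (10 minority samples and a 100% boost to 20), you would pass the float that corresponds to 20 divided by the size of your majority class.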