Hi Mamun,

Scikit-learn's RandomForestClassifier has an option to set `class_weight`
to "balanced". Have you tried that alone, without passing `sample_weight`
to `fit`?

See this documentation -
http://scikit-learn.org/stable/modules/generated/sklearn.ensemble.RandomForestClassifier.html#sklearn.ensemble.RandomForestClassifier
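
In case it helps, here is a minimal sketch of what I mean by using
`class_weight="balanced"` on its own. The data is synthetic (generated with
`make_classification` as a stand-in for your X_tr / y_tr), and the 70/30
class split and n_estimators=100 are arbitrary choices of mine, not taken
from your setup. In older scikit-learn releases `train_test_split` lives in
`sklearn.cross_validation` instead of `sklearn.model_selection`.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split

# Synthetic imbalanced two-class data (assumption: stands in for the real data).
X, y = make_classification(n_samples=2000, n_features=20,
                           weights=[0.7, 0.3], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

# Let the forest derive class weights from the class frequencies in y_tr.
clf = RandomForestClassifier(n_estimators=100, class_weight="balanced",
                             random_state=0)

# Note: no sample_weight is passed to fit here.
clf.fit(X_tr, y_tr)

# Rows are true labels, columns are predicted labels.
cm = confusion_matrix(y_te, clf.predict(X_te))
print(cm)
```

Comparing this confusion matrix against a run with `class_weight=None`
should show whether class weighting by itself changes anything on your data.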


Is there a chance that what you are trying to achieve with `class_weight` is
being undone by your `sample_weight`?
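
For reference, the RandomForestClassifier documentation says that the
`class_weight` values are multiplied with `sample_weight` when the latter is
passed to `fit`. A rough sketch of the resulting per-sample weight, using
`compute_sample_weight` from `sklearn.utils.class_weight` to expand the class
weights; the tiny label vector and the positive-class weight of 10 are
made-up values for illustration, and this mirrors the documented behaviour
rather than the forest's internal code path:

```python
import numpy as np
from sklearn.utils.class_weight import compute_sample_weight

# Made-up labels: three negatives, one positive.
y = np.array([0, 0, 0, 1])

# Class weights as in the question.
cl_weight = {0: 0.001, 1: 0.999}

# Expand the class weights to one weight per sample.
from_class_weight = compute_sample_weight(cl_weight, y)

# Hypothetical sample weights: 10 for positives, 1 for negatives.
sample_weight = np.where(y == 1, 10.0, 1.0)

# Documented behaviour: the two sets of weights are multiplied together.
effective = from_class_weight * sample_weight
print(effective)
```

Printing `effective` for your actual weights would show whether the two
settings reinforce each other or partly cancel.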


Thanks.
R

On Tue, Mar 15, 2016 at 12:44 PM, Mamun Rashid <mamunbabu2...@gmail.com>
wrote:

> Hi All,
> I asked this question a couple of weeks ago on the list. I have a
> two-class problem where my positive class (Class 1) and negative class
> (Class 0) are imbalanced. I also care much less about the negative class.
> So I specified both a class weight (to the random forest classifier) and a
> sample weight (to the fit function) to give more importance to my positive
> class.
>
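> import numpy as np
> from sklearn.ensemble import RandomForestClassifier
>
> # Per-class weights passed to the classifier
> cl_weight = {0: weight1, 1: weight2}
>
> clf = RandomForestClassifier(n_estimators=400, max_depth=None,
>                              min_samples_split=2, random_state=0,
>                              oob_score=True, class_weight=cl_weight,
>                              criterion="gini")
>
> # Per-sample weights passed to fit: `weight` for positives, 1 otherwise
> sample_weight = np.array([weight if m == 1 else 1
>                           for m in df_tr[label_column]])
>
> y_pred = clf.fit(X_tr, y_tr, sample_weight=sample_weight).predict(X_te)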
>
>
> Despite specifying dramatically different class weights, I do not observe
> much difference.
> Example: cl_weight = {0:0.001, 1:0.999} versus cl_weight = {0:0.50, 1:0.50}.
> Am I passing the class weight correctly?
>
>
> I am giving an example of two folds from these two runs :: Fold 1 and Fold 2.
>
> ## cl_weight = {0:0.001, 1:0.999}
> Fold_1 Confusion Matrix
>          0     1
>    0  1681    26
>    1   636   149
> Fold_5 Confusion Matrix
>          0     1
>    0  1670    15
>    1   734   160
>
> ## cl_weight = {0:0.50, 1:0.50}
> Fold_1 Confusion Matrix
>          0     1
>    0  1690    15
>    1   630   163
> Fold_5 Confusion Matrix
>          0     1
>    0  1676    14
>    1   709   170
>
>
> Thanks,
> Mamun
>
>
>
> ------------------------------------------------------------------------------
> Transform Data into Opportunity.
> Accelerate data analysis in your applications with
> Intel Data Analytics Acceleration Library.
> Click to learn more.
> http://pubads.g.doubleclick.net/gampad/clk?id=278785231&iu=/4140
> _______________________________________________
> Scikit-learn-general mailing list
> Scikit-learn-general@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
>
>