Hi Raghav,
Thanks for your reply. My class 1 is smaller than class 0, so even if the
`sample_weight` values introduce any bias, it should favour class 1.
Regardless of the relative sizes of class 1 and class 0, I want to give more
importance to class 1 when splitting at a tree node using Gini impurity. The
`class_weight` parameter should have taken care of that. However, I do not see
any effect.
I will try specifying `class_weight` without specifying `sample_weight` to see
if there is any change in the situation.
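For what it is worth, here is a minimal sketch of that experiment on synthetic
imbalanced data (the dataset, weights, and forest size below are placeholders,
not my real setup): fit the same forest with and without `class_weight`, pass
no `sample_weight` to `fit`, and compare the held-out confusion matrices.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split

# Synthetic two-class data with a minority positive class (class 1).
X, y = make_classification(n_samples=2000, weights=[0.9, 0.1], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0, stratify=y)

# Same forest twice: once unweighted, once with class_weight only.
for cw in (None, {0: 0.001, 1: 0.999}):
    clf = RandomForestClassifier(n_estimators=200, random_state=0,
                                 class_weight=cw)
    clf.fit(X_tr, y_tr)  # deliberately no sample_weight here
    cm = confusion_matrix(y_te, clf.predict(X_te))
    print(cw, cm.ravel())
```

If `class_weight` is being applied, the weighted run should trade false
negatives for false positives on class 1 relative to the unweighted run.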
Thanks,
Mamun
> Hi Mamun,
>
> Scikit-learn's RandomForestClassifier has an option to set `class_weight`
> to "balanced". Have you tried that alone without specifying
> `sample_weights`?
>
> See this documentation -
> http://scikit-learn.org/stable/modules/generated/sklearn.ensemble.RandomForestClassifier.html#sklearn.ensemble.RandomForestClassifier
>
>
>
> Is there a chance that what you try to achieve by `class_weights` is being
> undone by your `sample_weights`?
>
>
> Thanks.
> R
>
> On Tue, Mar 15, 2016 at 12:44 PM, Mamun Rashid <mamunbabu2...@gmail.com
> <mailto:mamunbabu2...@gmail.com>>
> wrote:
>
>> Hi All,
>> I asked this question a couple of weeks ago on the list. I have a two-class
>> problem where my positive class (class 1) and negative class (class 0) are
>> imbalanced. Secondly, I care much less about the negative class, so I
>> specified both a class weight (to the random forest classifier) and a
>> sample weight (to the fit function) to give more importance to my positive
>> class.
>>
>> cl_weight = {0:weight1, 1:weight2}
>>
>> clf = RandomForestClassifier(n_estimators=400, max_depth=None,
>>     min_samples_split=2, random_state=0, oob_score=True,
>>     class_weight=cl_weight, criterion="gini")
>>
>> sample_weight = np.array([weight if m == 1 else 1 for m in
>> df_tr[label_column]])
>>
>> y_pred = clf.fit(X_tr, y_tr, sample_weight=sample_weight).predict(X_te)
>>
>>
>> Despite specifying dramatically different class weights, I do not observe
>> much difference. For example: cl_weight = {0:0.001, 1:0.999} versus
>> cl_weight = {0:0.50, 1:0.50}. Am I passing the class weight correctly?
>>
>>
>> I am giving the confusion matrices of two folds (Fold 1 and Fold 5) from
>> these two runs:
>>
>> ## cl_weight = {0:0.001, 1:0.999}
>> Fold_1 Confusion Matrix:
>>         0     1
>>   0  1681    26
>>   1   636   149
>> Fold_5 Confusion Matrix:
>>         0     1
>>   0  1670    15
>>   1   734   160
>>
>> ## cl_weight = {0:0.50, 1:0.50}
>> Fold_1 Confusion Matrix:
>>         0     1
>>   0  1690    15
>>   1   630   163
>> Fold_5 Confusion Matrix:
>>         0     1
>>   0  1676    14
>>   1   709   170
>>
>>
>> Thanks,
>> Mamun
_______________________________________________
Scikit-learn-general mailing list
Scikit-learn-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general