The class_weight parameter doesn't behave the way you're expecting.

The value in class_weight is the weight applied to each sample in that
class. In your example, each class zero sample has weight 0.001 and each
class one sample has weight 0.999, so each class one sample carries 999
times the weight of a class zero sample.
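
As a rough illustration of that arithmetic, here is a plain-Python sketch
(not scikit-learn's internals). The counts come from the question quoted
below; the variable names are made up for this example.

```python
# Second run from the question: positives cut to 1,000, class_weight set.
counts = {0: 1_000_000, 1: 1_000}
class_weight = {0: 0.001, 1: 0.999}

# Each sample carries its class's weight, so a class's total weight is
# (number of samples) * (per-sample weight).
total_weight = {c: counts[c] * class_weight[c] for c in counts}

# How much heavier one positive sample is than one negative sample.
per_sample_ratio = class_weight[1] / class_weight[0]

# First run from the question: no class_weight, so every sample weighs 1.
unweighted_counts = {0: 1_000_000, 1: 10_000}
unweighted_ratio = unweighted_counts[0] / unweighted_counts[1]

print("per-sample weight ratio (class 1 / class 0):", per_sample_ratio)
print("total weight per class in the weighted run:", total_weight)
print("negative:positive ratio in the unweighted run:", unweighted_ratio)
```

Under these assumptions the weighted run gives the two classes roughly
equal total weight, while the unweighted run had a 100:1 imbalance, which
is likely why the two runs look so different.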

If you would like each class one sample to have ten times the weight of a
class zero sample, you would set `class_weight={0: 1, 1: 10}` or,
equivalently, `class_weight={0: 0.1, 1: 1}`.
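
A minimal sketch of how the two dictionaries compare, assuming a recent
scikit-learn where `sklearn.utils.class_weight.compute_sample_weight` is
available. The toy `X` and `y` are invented for illustration and are not
the data from the question.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.utils.class_weight import compute_sample_weight

# Toy data (hypothetical): three class zero samples, two class one samples.
y = np.array([0, 0, 0, 1, 1])
X = np.arange(len(y), dtype=float).reshape(-1, 1)

# Expand each class_weight dict into one weight per sample.
w_a = compute_sample_weight({0: 1, 1: 10}, y)
w_b = compute_sample_weight({0: 0.1, 1: 1}, y)

# Both dicts give a class one sample ten times the weight of a class zero
# sample; they differ only by a constant overall factor.
ratio_a = w_a[y == 1][0] / w_a[y == 0][0]
ratio_b = w_b[y == 1][0] / w_b[y == 0][0]
print(ratio_a, ratio_b)

# class_weight is passed to the estimator's constructor.
clf = RandomForestClassifier(
    n_estimators=10, class_weight={0: 1, 1: 10}, random_state=0
)
clf.fit(X, y)
```

Only the ratio between the class weights should matter here, which is why
the two settings are expected to behave the same with default parameters.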


On Sat, Jan 21, 2017 at 10:18 AM, Debabrata Ghosh <[email protected]>
wrote:

> Hi All,
>
> Greetings! I have a very basic question regarding the usage of the
> parameter class_weight in scikit-learn's Random Forest Classifier's fit
> method.
>
> I have a fairly unbalanced sample: my positive class to negative class
> ratio is 1:100. In other words, I have a million records corresponding to
> the negative class and 10,000 records corresponding to the positive class.
> I have successfully trained the random forest classifier model using the
> above record set.
>
> Further, for a different problem, I want to test the parameter
> class_weight. So I am setting class_weight to {0: 0.001, 1: 0.999}, and I
> have tried running my model on the same dataset as mentioned in the above
> paragraph but with the positive class records reduced to 1,000 [because
> now each positive class record is given approximately 10 times more weight
> than a negative class record]. However, the results are very different
> between the two runs (with and without class_weight), and I expected
> similar results.
>
> Would you please let me know where I am going wrong? I know it's
> something silly, but I just want to improve my understanding.
>
> Thanks!
>
> _______________________________________________
> scikit-learn mailing list
> [email protected]
> https://mail.python.org/mailman/listinfo/scikit-learn
>
>