Hi All,
I asked this question a couple of weeks ago on the list. I have a two-class 
problem where my positive class (Class 1) and negative class (Class 0) are 
imbalanced, and I care much less about the negative class. So I specified both 
a class weight (to a random forest classifier) and a sample weight to the fit 
function, to give more importance to my positive class. 

import numpy as np
from sklearn.ensemble import RandomForestClassifier

cl_weight = {0: weight1, 1: weight2}
clf = RandomForestClassifier(n_estimators=400, max_depth=None,
                             min_samples_split=2, random_state=0,
                             oob_score=True, class_weight=cl_weight,
                             criterion="gini")
# Extra weight on positive samples (label == 1), weight 1 otherwise.
sample_weight = np.array([weight if m == 1 else 1 for m in df_tr[label_column]])
y_pred = clf.fit(X_tr, y_tr, sample_weight=sample_weight).predict(X_te)

Despite specifying dramatically different class weights, I do not observe much 
difference. 
Example: cl_weight = {0: 0.001, 1: 0.999} vs. cl_weight = {0: 0.50, 1: 0.50}. 
Am I passing the class weight correctly? 
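In case it helps, here is a minimal self-contained sketch of what I am doing, on a made-up imbalanced toy dataset (the data, the sample-weight factor of 10, and the forest size are stand-ins, not my real setup):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Toy imbalanced two-class problem: roughly 10% positives.
X, y = make_classification(n_samples=2000, weights=[0.9, 0.1], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

def count_predicted_positives(cl_weight):
    clf = RandomForestClassifier(n_estimators=100, random_state=0,
                                 class_weight=cl_weight)
    # Extra per-sample weight on positives, on top of class_weight.
    sw = np.where(y_tr == 1, 10.0, 1.0)
    clf.fit(X_tr, y_tr, sample_weight=sw)
    return int((clf.predict(X_te) == 1).sum())

# Compare how many test points are called positive under the two weightings.
print(count_predicted_positives({0: 0.50, 1: 0.50}))
print(count_predicted_positives({0: 0.001, 1: 0.999}))
```

My understanding is that class_weight is folded into the per-sample weights, so the two weightings should multiply; I would expect the second call to predict noticeably more positives, but I see little change.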

I am giving an example of two folds from each of these two runs: Fold 1 and Fold 5. 

## cl_weight = {0:0.001, 1:0.999}

Fold_1 Confusion Matrix
        0     1
0    1681    26
1     636   149

Fold_5 Confusion Matrix
        0     1
0    1670    15
1     734   160

## cl_weight = {0:0.50, 1:0.50}

Fold_1 Confusion Matrix
        0     1
0    1690    15
1     630   163

Fold_5 Confusion Matrix
        0     1
0    1676    14
1     709   170


Thanks,
Mamun
_______________________________________________
Scikit-learn-general mailing list
Scikit-learn-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general