Hi.

Unfortunately, sklearn doesn't directly implement a cost matrix, but you can change the decision threshold applied to the model's predicted probabilities,
using something like y_pred = tree.predict_proba(X_test)[:, 1] > 0.6
A threshold below 0.5 trades precision for recall, so it yields fewer false negatives.
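
To make the thresholding idea concrete, here is a minimal sketch. The synthetic data (roughly 30% positives), the tree settings, and the 0.2 cut-off are assumptions chosen only for illustration; they are not values from the original question.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import recall_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data with about 30% positives (illustrative assumption).
X, y = make_classification(n_samples=2000, weights=[0.7, 0.3], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, random_state=0
)

# Limiting depth and leaf size keeps leaves impure, so predict_proba
# returns graded probabilities rather than only 0.0 or 1.0.
tree = DecisionTreeClassifier(max_depth=4, min_samples_leaf=20, random_state=0)
tree.fit(X_train, y_train)

# Probability of the positive class for each test sample.
proba = tree.predict_proba(X_test)[:, 1]

# Default-like cut-off versus a lower one that favours recall.
y_pred_default = (proba > 0.5).astype(int)
y_pred_low = (proba > 0.2).astype(int)

recall_default = recall_score(y_test, y_pred_default)
recall_low = recall_score(y_test, y_pred_low)
print(f"recall at 0.5: {recall_default:.3f}, recall at 0.2: {recall_low:.3f}")
```

Lowering the cut-off can only add positive predictions, so recall cannot go down; the price is usually more false positives.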

What trade-off of precision and recall do you want? Have you looked at the precision_recall_curve?
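
As a rough sketch of how precision_recall_curve can guide the choice, the snippet below picks the threshold with the best precision among those that reach a target recall. The synthetic data, the tree settings, and the 0.95 target (target_recall) are hypothetical values for illustration only.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.metrics import precision_recall_curve, recall_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data with about 30% positives (illustrative assumption).
X, y = make_classification(n_samples=2000, weights=[0.7, 0.3], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, random_state=0
)

tree = DecisionTreeClassifier(max_depth=4, min_samples_leaf=20, random_state=0)
tree.fit(X_train, y_train)
proba = tree.predict_proba(X_test)[:, 1]

# precision[i] and recall[i] describe predictions made with proba >= thresholds[i].
# Both arrays carry one extra final point (recall 0) that has no threshold.
precision, recall, thresholds = precision_recall_curve(y_test, proba)

target_recall = 0.95  # hypothetical requirement for a screening test
meets_target = recall[:-1] >= target_recall

# Among thresholds that reach the target recall, keep the most precise one.
best = np.argmax(np.where(meets_target, precision[:-1], -1.0))
threshold = thresholds[best]

y_pred = (proba >= threshold).astype(int)
achieved_recall = recall_score(y_test, y_pred)
print(f"threshold={threshold:.3f}, recall={achieved_recall:.3f}")
```

In practice the threshold would be chosen on a separate validation split rather than on the test set used to report the final numbers.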


Andy

On 03/15/2018 09:28 PM, Nadim Farhat wrote:
Dear All,

I have a *screening* lab test and I am trying to minimize the false negatives, i.e. maximize the recall (TP/(TP+FN)), so I want to increase the cost whenever an FN occurs during training. I understand that R has some kind of loss matrix that penalizes FNs during fitting. My positive class makes up 30% of the data. On the forums and StackOverflow, they suggest using class_weight='balanced' in the decision tree, which oversamples the class with the lowest frequency. However, I don't see how that helps in minimizing the FNs.

Any suggestions?


Bests

Nadim

--
Nadim Farhat


_______________________________________________
scikit-learn mailing list
scikit-learn@python.org
https://mail.python.org/mailman/listinfo/scikit-learn
