Hi.
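Unfortunately, sklearn does not implement a cost matrix directly, but
you can change the decision threshold applied to the predicted
probabilities, with something like
y_pred = tree.predict_proba(X_test)[:, 1] > 0.6
Note that a threshold above 0.5 trades recall for precision; to reduce
false negatives you would lower it instead.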
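To make the thresholding idea concrete, here is a minimal sketch. The synthetic data (about 30% positives), the max_depth setting, the helper name predict_with_threshold and the 0.2 cut-off are all assumptions for illustration, not recommendations.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.metrics import recall_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data with roughly 30% positives (assumption).
X, y = make_classification(n_samples=2000, weights=[0.7, 0.3], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, random_state=0
)

# Limiting the depth keeps leaf probabilities from collapsing to 0 or 1,
# so that moving the threshold actually changes the predictions.
tree = DecisionTreeClassifier(max_depth=4, random_state=0).fit(X_train, y_train)
proba = tree.predict_proba(X_test)[:, 1]


def predict_with_threshold(proba, threshold):
    """Label a sample positive when its predicted probability >= threshold."""
    return (proba >= threshold).astype(int)


# Lowering the threshold can only add predicted positives, so recall
# cannot decrease (precision usually does).
recall_default = recall_score(y_test, predict_with_threshold(proba, 0.5))
recall_low = recall_score(y_test, predict_with_threshold(proba, 0.2))
print(f"recall at 0.5: {recall_default:.3f}, recall at 0.2: {recall_low:.3f}")
```

A lower threshold flags more samples as positive, which is usually the direction you want for a screening test.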
What trade-off between precision and recall do you want? Have you
looked at precision_recall_curve?
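One way to use precision_recall_curve here is to pick the highest threshold that still reaches the recall you need. This is only a sketch: the synthetic data, the max_depth setting and the 0.95 target recall are assumptions, and the threshold should be chosen on held-out validation data rather than on the final test set.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.metrics import precision_recall_curve, recall_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data with roughly 30% positives (assumption).
X, y = make_classification(n_samples=2000, weights=[0.7, 0.3], random_state=0)
X_train, X_val, y_train, y_val = train_test_split(
    X, y, stratify=y, random_state=0
)

tree = DecisionTreeClassifier(max_depth=4, random_state=0).fit(X_train, y_train)
proba = tree.predict_proba(X_val)[:, 1]

# precision[i] and recall[i] describe the predictions proba >= thresholds[i];
# the final entries (precision 1, recall 0) have no matching threshold.
precision, recall, thresholds = precision_recall_curve(y_val, proba)

target_recall = 0.95  # hypothetical requirement for the screening test
candidates = np.where(recall[:-1] >= target_recall)[0]

# thresholds are sorted in increasing order, so the last candidate is the
# highest threshold (and typically the best precision) meeting the target.
best = candidates[-1]
threshold = thresholds[best]

y_pred = (proba >= threshold).astype(int)
achieved = recall_score(y_val, y_pred)
print(f"threshold={threshold:.3f}, recall={achieved:.3f}, "
      f"precision={precision[best]:.3f}")
```

The printed precision shows what the chosen recall costs in false positives.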
Andy
On 03/15/2018 09:28 PM, Nadim Farhat wrote:
Dear All,
I have a *screening* lab test and I am trying to minimize the number of
false negatives, i.e. maximize recall (TP/(TP+FN)). I therefore want to
increase the cost whenever an FN occurs during training. I understand
that R has some kind of loss matrix that penalizes FNs during fitting.
My positive class makes up 30% of the data.
On the forums and StackOverflow, people suggest using
class_weight='balanced' in the decision tree, which up-weights the
class with the lowest frequency (in effect similar to oversampling it).
However, I don't see how that helps in minimizing the FN.
Any suggestions?
Bests
Nadim
--
Nadim Farhat
_______________________________________________
scikit-learn mailing list
scikit-learn@python.org
https://mail.python.org/mailman/listinfo/scikit-learn