Here is a very small example using precision_recall_curve():

from sklearn.metrics import precision_recall_curve, precision_score, recall_score

y_true = [0, 1]
y_predict_proba = [0.25, 0.75]
precision, recall, thresholds = precision_recall_curve(y_true, y_predict_proba)
precision, recall

which results in:

(array([1., 1.]), array([1., 0.]))
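For completeness, the thresholds array can be printed as well (a minimal sketch assuming scikit-learn is installed; variable names mirror the snippet above):

```python
from sklearn.metrics import precision_recall_curve

y_true = [0, 1]
y_predict_proba = [0.25, 0.75]

# precision_recall_curve() returns one (precision, recall) pair per evaluated
# threshold, plus a final appended point. Printing thresholds shows how many
# thresholds were actually evaluated.
precision, recall, thresholds = precision_recall_curve(y_true, y_predict_proba)
print(precision)   # [1. 1.]
print(recall)      # [1. 0.]
print(thresholds)  # [0.75]
```

Note that thresholds has only one entry here, so only one of the (precision, recall) pairs corresponds to an actual threshold.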

Now let's calculate manually to see whether that's correct. Depending on the
threshold, there are three possible predicted class vectors: [0, 0], [0, 1],
and [1, 1]. We have to discard [0, 0] because its precision is undefined
(division by zero: there are no predicted positives). So, applying
precision_score() and recall_score() to the other two:

y_predict_class=[0,1]
precision_score(y_true, y_predict_class), recall_score(y_true, y_predict_class)

which gives:

(1.0, 1.0)

and

y_predict_class=[1,1]
precision_score(y_true, y_predict_class), recall_score(y_true, y_predict_class)

which gives

(0.5, 1.0)

This does not seem to match the output of precision_recall_curve(), which,
for example, never produced a precision value of 0.5.
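The two manual checks above can be condensed into a short loop over the candidate class vectors (a sketch using the same inputs as the snippets above):

```python
from sklearn.metrics import precision_score, recall_score

y_true = [0, 1]

# Evaluate both viable class vectors side by side.
for y_predict_class in ([0, 1], [1, 1]):
    p = precision_score(y_true, y_predict_class)
    r = recall_score(y_true, y_predict_class)
    print(y_predict_class, (p, r))
# [0, 1] (1.0, 1.0)
# [1, 1] (0.5, 1.0)
```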

Am I missing something?
_______________________________________________
scikit-learn mailing list
scikit-learn@python.org
https://mail.python.org/mailman/listinfo/scikit-learn