Re: [scikit-learn] Scikit Learn Random Classifier - TPR and FPR plotted on matplotlib

Jacob Schreiber Wed, 14 Dec 2016 10:51:03 -0800

To make a proper ROC curve you need to test all possible thresholds, not
just a subset of them. You can do this easily in sklearn.


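import matplotlib.pyplot as plt
from sklearn.metrics import roc_curve, roc_auc_score

... <load your data, fit your classifier> ...

# predict_proba returns one column per class; keep the positive-class column
y_score = clf.predict_proba(X)[:, 1]
# roc_curve picks the thresholds from the scores themselves
fpr, tpr, _ = roc_curve(y_true, y_score)
auc = roc_auc_score(y_true, y_score)
plt.plot(fpr, tpr, label='AUC = %0.3f' % auc)
plt.legend(loc='lower right')
plt.show()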
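
For anyone who wants to run this end to end, here is a minimal sketch of the
same idea. The synthetic data from make_classification, the train/test split,
the RandomForestClassifier settings, and the output file name are all my
assumptions for illustration; they are not the original poster's setup. It
assumes a binary problem where column 1 of predict_proba is the positive class.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: render to a file; drop this for an interactive window
import matplotlib.pyplot as plt
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score, roc_curve
from sklearn.model_selection import train_test_split

# Synthetic, imbalanced binary problem (a stand-in for real data).
X, y = make_classification(n_samples=2000, n_features=20,
                           weights=[0.9, 0.1], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, random_state=0)

clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_train, y_train)

# Score held-out data with the probability of the positive class (column 1).
y_score = clf.predict_proba(X_test)[:, 1]

# roc_curve chooses the thresholds itself, so the curve spans (0, 0) to (1, 1).
fpr, tpr, thresholds = roc_curve(y_test, y_score)
auc = roc_auc_score(y_test, y_score)

plt.plot([0, 1], [0, 1], "r--", label="chance")
plt.plot(fpr, tpr, "b-", label="AUC = %0.3f" % auc)
plt.xlabel("False Positive Rate")
plt.ylabel("True Positive Rate")
plt.legend(loc="lower right")
plt.savefig("roc_curve.png")
```

The thing to check on your own data is that fpr runs all the way from 0 to 1.
If it only covers a sliver near zero, the area under it will look tiny no
matter how good the model is.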

On Wed, Dec 14, 2016 at 8:52 AM, Stuart Reynolds <[email protected]>
wrote:

> You're looking at a tiny subset of the possible cutoff thresholds for this
> classifier.
> Lower thresholds will give a higher tpr at the expense of a higher fpr.
> Usually, AUC is computed as the integral of this graph over the whole
> range of FPRs (from zero to one).
>
> If you have your classifier output probabilities or activations, the
> maximum and minimum of these values will tell you what the largest and
> smallest thresholds should be. Scikit also has a function to directly
> receive the activations and true classes and compute the AUC and tpr/fpr
> curve.
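
To put numbers on the point above, here is a small pure-Python sketch. The
labels and scores are made up, and roc_points and trapezoid_area are
hypothetical helper names, not scikit-learn functions. It uses every distinct
score as a threshold, then compares the area under the full curve with the
area under only the highest few thresholds, which is roughly what happens when
just a narrow range of FPRs is plotted.

```python
def roc_points(y_true, y_score, thresholds):
    """Return (fpr, tpr) pairs, predicting positive when score >= threshold."""
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    points = []
    for t in thresholds:
        tp = sum(1 for y, s in zip(y_true, y_score) if s >= t and y == 1)
        fp = sum(1 for y, s in zip(y_true, y_score) if s >= t and y == 0)
        points.append((fp / n_neg, tp / n_pos))
    return points


def trapezoid_area(points):
    """Integrate tpr over fpr with the trapezoidal rule."""
    pts = sorted(points)
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(pts, pts[1:]))


# Made-up labels and classifier scores, for illustration only.
y_true = [0, 0, 0, 0, 0, 0, 1, 0, 1, 1]
y_score = [0.05, 0.10, 0.20, 0.30, 0.35, 0.40, 0.45, 0.60, 0.80, 0.90]

# Every distinct score, plus one value above the maximum so the curve
# starts at (0, 0). The smallest score brings it to (1, 1).
all_thresholds = [max(y_score) + 1.0] + sorted(set(y_score), reverse=True)

full_curve = roc_points(y_true, y_score, all_thresholds)
# Only the starting point and the three highest cutoffs: a slice near fpr = 0.
partial_curve = roc_points(y_true, y_score, all_thresholds[:4])

print("full AUC:    %.3f" % trapezoid_area(full_curve))
print("partial AUC: %.3f" % trapezoid_area(partial_curve))
```

With these made-up numbers the full curve integrates to about 0.952, while the
truncated one gives about 0.095, even though the classifier and its scores are
identical in both cases.
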
>
> On Wed, Dec 14, 2016 at 5:12 AM Dale T Smith <[email protected]>
> wrote:
>
>> I think you need to look at the examples.
>>
>> *Dale T. Smith*
>> *|* Macy's Systems and Technology
>> *|* IFS eCom CSE Data Science
>> 5985 State Bridge Road, Johns Creek, GA 30097 *|* [email protected]
>>
>> *From:* scikit-learn [mailto:scikit-learn-bounces+dale.t.smith=
>> [email protected]]
>> *On Behalf Of* Debabrata Ghosh
>> *Sent:* Wednesday, December 14, 2016 3:13 AM
>> *To:* Scikit-learn user and developer mailing list
>> *Subject:* [scikit-learn] Scikit Learn Random Classifier - TPR and FPR
>> plotted on matplotlib
>>
>> Hi All,
>>
>> I have run the scikit-learn Random Forest Classifier algorithm against a
>> dataset, and here are my TPR and FPR at various thresholds:
>>
>> [image: Inline image 1]
>>
>> I have plotted the above values in matplotlib and am getting a very low
>> AUC. Here is my matplotlib code. Could you please help me interpret the
>> graph? Is my model OK, or is there something wrong? I would appreciate a
>> quick response.
>>
>> import matplotlib.pyplot as plt
>> import numpy as np
>> from sklearn import metrics
>>
>> plt.title('Receiver Operating Characteristic')
>> plt.ylabel('True Positive Rate')
>> plt.xlabel('False Positive Rate')
>>
>> fpr = [0.0002337345394340, 0.0001924870472260, 0.0001626973851550,
>>        0.0000950977673794, 0.0000721826427097, 0.0000538505429739,
>>        0.0000389557119386, 0.0000263523933702, 0.0000137490748018]
>>
>> tpr = [0.19673638244100000000, 0.18984141576600000000,
>>        0.18122270742400000000, 0.17055510860800000000,
>>        0.16434892541100000000, 0.15789473684200000000,
>>        0.15134451850100000000, 0.14410480349300000000,
>>        0.13238336014700000000]
>>
>> roc_auc = metrics.auc(fpr, tpr)
>>
>> plt.plot([0, 1], [0, 1],'r--')
>> plt.plot(fpr, tpr, 'bo-', label = 'AUC = %0.9f' % roc_auc)
>> plt.legend(loc = 'lower right')
>>
>> plt.show()
>>
>> [image: Inline image 2]
>>
>> _______________________________________________
>> scikit-learn mailing list
>> [email protected]
>> https://mail.python.org/mailman/listinfo/scikit-learn
>>
