Dear ML experts,

I'm solving a multi-class classification problem. Following the great
advice of Peter Prettenhofer and Gilles Louppe (
http://www.slideshare.net/DataRobot/gradient-boosted-regression-trees-in-scikitlearn)
about the use of the GradientBoostingClassifier (after tuning the
hyperparameters with an extensive GridSearchCV), I reached an unexpectedly
high f1-score of 95.5%.
I then plotted the ROC curve and computed the ROC AUC following the example at
http://scikit-learn.org/stable/auto_examples/plot_roc.html where I
binarized the original multi-class y_test vector using label_binarize
as follows:

y_test = label_binarize(y_test, classes=[0, 1, 2, 3])
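
For readers unfamiliar with label_binarize, here is a minimal sketch of what
that call produces. The toy labels in y_toy are made up for illustration and
are not the data from this problem; the point is only that each class becomes
one 0/1 column, which is the form roc_curve is applied to per class.

```python
import numpy as np
from sklearn.preprocessing import label_binarize

# Hypothetical toy labels for a 4-class problem (not the real y_test)
y_toy = np.array([0, 2, 1, 3, 2])

# One row per sample, one 0/1 indicator column per class
y_bin = label_binarize(y_toy, classes=[0, 1, 2, 3])

print(y_bin.shape)
print(y_bin)
```

Each row contains exactly one 1, in the column of the sample's class, so
column i can be read as the binary problem "class i versus the rest".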

What surprised me is that the ROC AUC is so close (or even equal) to 1.00
that it looks suspicious (see attached figure).

My question is: Is it correct to proceed to the ROC/ROC_AUC plot as I did?

Thanks in advance ....

Cheers,
Eraldo

The full code I used is as follows:

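import matplotlib.pyplot as plt
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import auc, roc_curve
from sklearn.preprocessing import label_binarize

clf = GradientBoostingClassifier(n_estimators=400)
clf.fit(X_train, y_train)
clf.score(X_test, y_test)

# Per-class decision scores, one column per class
y_score = clf.decision_function(X_test)
# One-vs-rest indicator matrix, one column per class
y_test = label_binarize(y_test, classes=[0, 1, 2, 3])
n_classes = y_test.shape[1]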

# Compute ROC curve and ROC area for each class
fpr = dict()
tpr = dict()
roc_auc = dict()
for i in range(n_classes):
    fpr[i], tpr[i], _ = roc_curve(y_test[:, i], y_score[:, i])
    roc_auc[i] = auc(fpr[i], tpr[i])

# Compute micro-average ROC curve and ROC area
fpr["micro"], tpr["micro"], _ = roc_curve(y_test.ravel(), y_score.ravel())
roc_auc["micro"] = auc(fpr["micro"], tpr["micro"])

# Plot ROC curve
plt.figure()
plt.plot(fpr["micro"], tpr["micro"],
         label='micro-average ROC curve (area = {0:0.2f})'
               ''.format(roc_auc["micro"]))
for i in range(n_classes):
    plt.plot(fpr[i], tpr[i],
             label='ROC curve of class "{0}" (area = {1:0.2f})'
                   ''.format(i, roc_auc[i]))

plt.plot([0, 1], [0, 1], 'k--')
plt.xlim([0.0, 1.0])
plt.ylim([0.0, 1.05])
plt.xlabel('False Positive Rate')
plt.ylabel('True Positive Rate')
plt.title('Receiver operating characteristic (multi-class)')
plt.legend(loc="lower right");

plt.savefig('gbc_roc_curve.png')
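
One way to sanity-check the micro-average step above is to compare it with
roc_auc_score(..., average="micro") on the same binarized labels and scores.
The sketch below does this on synthetic data only: y_toy and scores are
made-up stand-ins (random noise plus a bump for the true class), not the
output of the fitted classifier, so the AUC value itself means nothing here.

```python
import numpy as np
from sklearn.metrics import auc, roc_auc_score, roc_curve
from sklearn.preprocessing import label_binarize

rng = np.random.RandomState(0)

# Hypothetical 4-class labels and scores, for illustration only
y_toy = rng.randint(0, 4, size=200)
y_bin = label_binarize(y_toy, classes=[0, 1, 2, 3])
scores = rng.normal(size=(200, 4)) + 1.5 * y_bin

# Micro-average as in the listing above: flatten labels and scores
fpr_micro, tpr_micro, _ = roc_curve(y_bin.ravel(), scores.ravel())
micro_manual = auc(fpr_micro, tpr_micro)

# Same quantity computed by scikit-learn on the indicator matrix
micro_builtin = roc_auc_score(y_bin, scores, average="micro")

print(micro_manual, micro_builtin)
```

If the two numbers agree on the real y_test and y_score as well, the
micro-average curve is at least being computed consistently; whether the
scores themselves are too good to be true is a separate question.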
_______________________________________________
Scikit-learn-general mailing list
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
