Hi all,
I tested the following code, and its output shows that predict_proba and
predict give very different results: even samples with a high probability
(0.7) of being label 1 are predicted as label 2. I am very surprised. Is this
problem specific to the algorithm SVC uses to generate probabilities? I
haven't tested other types of models. Would they have a similar problem?
import numpy as np
import matplotlib.pyplot as plt
from sklearn import svm, datasets
from sklearn.model_selection import train_test_split  # sklearn.cross_validation before 0.18
from sklearn.metrics import confusion_matrix
# import some data to play with
iris = datasets.load_iris()
X = iris.data
y = iris.target
# Split the data into a training set and a test set
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
# Run classifier, using a model that is too regularized (C too low) to see
# the impact on the results
classifier = svm.SVC(kernel='linear', C=0.01, probability=True)
classifier.fit(X_train, y_train)
y_pred = classifier.predict(X_test)
y_pred_p = classifier.predict_proba(X_test)
y_pred_p_l = np.argmax(y_pred_p, axis=1)
diff = np.argwhere(y_pred != y_pred_p_l).ravel()
y_pred[diff]
y_pred_p_l[diff]
y_pred_p[diff,]
Here is the output:
$ y_pred[diff]
: array([2, 2, 2, 2, 2, 2, 2, 2, 2])
$ y_pred_p_l[diff]
: array([1, 1, 1, 1, 1, 1, 1, 1, 1])
$ y_pred_p[diff,]
:
array([[ 0.01, 0.59, 0.4 ],
[ 0.01, 0.57, 0.41],
[ 0.02, 0.7 , 0.28],
[ 0.01, 0.67, 0.31],
[ 0.01, 0.72, 0.27],
[ 0.01, 0.61, 0.38],
[ 0.01, 0.59, 0.4 ],
[ 0.01, 0.56, 0.43],
[ 0.01, 0.56, 0.43]])
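
To check whether other model types show the same mismatch, a minimal sketch along these lines might help. It counts how often predict disagrees with the argmax of predict_proba for the SVC above and for one other model. The choice of LogisticRegression and the helper name count_disagreements are my own assumptions for illustration, not part of the code above, and the import uses sklearn.model_selection, where train_test_split lives in scikit-learn 0.18 and later.

```python
import numpy as np
from sklearn import datasets, svm
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split


def count_disagreements(model, X_train, y_train, X_test):
    """Hypothetical helper: fit the model, then count test samples where
    predict() differs from the argmax of predict_proba()."""
    model.fit(X_train, y_train)
    hard = model.predict(X_test)
    # classes_ maps the probability columns back to class labels
    soft = model.classes_[np.argmax(model.predict_proba(X_test), axis=1)]
    return int(np.sum(hard != soft))


iris = datasets.load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, random_state=0)

# Same SVC settings as in the post, plus one other model for comparison
# (LogisticRegression is an assumed choice, not from the original code).
models = {
    "SVC": svm.SVC(kernel="linear", C=0.01, probability=True),
    "LogisticRegression": LogisticRegression(max_iter=1000),
}

results = {
    name: count_disagreements(model, X_train, y_train, X_test)
    for name, model in models.items()
}
for name, n_diff in results.items():
    print(name, "disagreements:", n_diff)
```

The SVC count may vary from run to run and between versions, so I would not rely on a specific number there. For logistic regression I would expect zero disagreements, since its probabilities are a monotone transformation of the same scores that predict uses.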
_______________________________________________
Scikit-learn-general mailing list
Scikit-learn-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general