Hi Andy,
Yes, here is the full code. I have a training dataset (x_data)
and an independent test dataset (test_x_data).
Most importantly, I found a few such values in the iris data too.
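import numpy as np
import matplotlib.pyplot as plt
from sklearn import preprocessing
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import (accuracy_score, precision_score, recall_score,
                             fbeta_score, roc_auc_score, roc_curve,
                             precision_recall_curve, auc)

# Apply the same scaling to train and test data (center and scale)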
scaler = preprocessing.StandardScaler()
x_data = scaler.fit_transform(x_data)
test_x_data = scaler.transform(test_x_data)
np.random.seed(0)
indices = np.random.permutation(len(x_data))
X_train = x_data[indices]
y_train = y_data[indices]
np.random.seed(0)
indices = np.random.permutation(len(test_x_data))
X_test = test_x_data[indices]
y_test = test_y_data[indices]
#For Random Forest
clf = RandomForestClassifier(n_estimators=40)
scores = clf.fit(X_train, y_train).score(X_test, y_test)
y_pred = clf.predict(X_test)
print("y_pred", y_pred)
y_score = clf.fit(X_train, y_train).predict_proba(X_test)
y_score = np.around(y_score, decimals=2)
accurate = accuracy_score(y_test, y_pred)
prec = precision_score(y_test, y_pred, average='micro')
rec = recall_score(y_test, y_pred, average='micro')
fscore = fbeta_score(y_test, y_pred, average='micro', beta=0.5)
areaRoc = roc_auc_score(y_test, y_score[:,1])
# Generate the ROC curve; pos_label marks the positive class
fpr, tpr, thresholds = roc_curve(y_test, y_score[:, 1], pos_label=1)
precision, recall, threshold = precision_recall_curve(y_test,
                                                      y_score[:, 1])
random_mean_auc_10 = auc(fpr, tpr)
plt.plot([0, 1], [0, 1], '--', color=(0.6, 0.6, 0.6), label='Standard')
# Plot the ROC curve and show its area in the legend
plt.plot(fpr, tpr, '--', label='RF_ROC_all_data (area = %0.2f)' %
         random_mean_auc_10, lw=2, color=(0.45, 0.42, 0.18))
plt.xlim([-0.05, 1.05])
plt.ylim([-0.05, 1.05])
plt.xlabel('False Positive Rate')
plt.ylabel('True Positive Rate')
plt.title('Receiver operating characteristic example')
plt.legend(loc="lower right")
plt.savefig(out_folder + "/117_misclassified_RF_FANTOM_"
            "validation_featureSelected_top5_tssDist_H3K27me3_H3K36me3.png",
            transparent=True, bbox_inches='tight', pad_inches=0.2)
plt.show()
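
A note on the script above: clf.fit(X_train, y_train) is called twice, once before predict and once before predict_proba, and no random_state is set. If that is the whole script, y_pred and y_score may come from two differently randomized forests, which could explain the mismatch. The sketch below is only an illustration under that assumption: it uses iris as a stand-in dataset, and the split and random_state values are arbitrary choices, not part of the original code. It fits once and checks that predict agrees with the argmax of predict_proba.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Stand-in data; the original script uses its own x_data / test_x_data.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)

# Fit exactly once and reuse the same fitted forest for both calls.
clf = RandomForestClassifier(n_estimators=40, random_state=0)
clf.fit(X_train, y_train)

y_pred = clf.predict(X_test)
y_score = clf.predict_proba(X_test)  # shape (n_samples, n_classes)

# Map each row's highest-probability column back to its class label.
y_from_proba = clf.classes_[np.argmax(y_score, axis=1)]
n_mismatch = int(np.sum(y_pred != y_from_proba))
print("mismatches:", n_mismatch)
```

The columns of y_score follow the order of clf.classes_, so each row can be read as a per-class confidence for that sample. Rounding the probabilities (as np.around(y_score, decimals=2) does) can also make two classes look tied when they are not, so it may be worth comparing against the unrounded values.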
Please let me know if you need further information.
Thanks!
Shalu
On Wed, Feb 25, 2015 at 9:21 PM, Andy <t3k...@gmail.com> wrote:
> Hi Shalu.
> Can you share your code? The prediction is just the argmax of
> predict_proba, so I'd be very surprised if they are not consistent.
>
> Cheers,
> Andy
>
>
> On 02/25/2015 08:33 AM, shalu jhanwar wrote:
>
> Hi all,
>
> I'm facing the same problem with predict_proba for the random forest
> classifier. I want to get a confidence value for each class and each
> prediction. But as shown here, the probability values are not always
> consistent with the prediction, so I was looking for a decision_function
> method for random forest, but didn't find one.
>
> Can anyone suggest how I can get decision scores for a random
> forest?
>
> thanks!
> Shalu
>
> On Thu, Jun 26, 2014 at 10:46 AM, Lars Buitinck <larsm...@gmail.com>
> wrote:
>
>> 2014-06-26 9:15 GMT+02:00 Andy <t3k...@gmail.com>:
>> > Maybe the calibration is not used for prediction? That would be a bit
>> > odd, though...
>>
>> That's exactly what's going on. Prediction is consistent with
>> decision_function, but not predict_proba.
>>
>>
>> _______________________________________________
>> Scikit-learn-general mailing list
>> Scikit-learn-general@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
>>