Thank you! That helped me a lot!!!
On 5 August 2015 at 11:23, Artem <barmaley....@gmail.com> wrote:

>> for i in range(len(predicted)):
>>     auc.append(predicted[i][0])
>
> This is the source of the error. predict_proba returns a matrix (a NumPy
> array, to be precise) of shape (n_samples, n_classes). Obviously, in your
> case n_classes = 2.
>
> The cell at a given row and column is the probability that the sample
> corresponding to that row belongs to the class corresponding to that
> column. You are taking only the 0th column (which per se is not a problem,
> since rows always sum to 1), which means that your auc list contains
> probabilities of class 0: the higher the probability, the more likely the
> sample belongs to class 0.
>
> Now, the documentation
> <http://scikit-learn.org/stable/modules/generated/sklearn.metrics.roc_auc_score.html>
> says (emphasis mine):
>
>> y_score : array, shape = [n_samples] or [n_samples, n_classes]
>> Target scores, can either be probability estimates of the *positive*
>> class, confidence values, or binary decisions.
>
> Class 0 is not considered positive in any way.
>
> TL;DR:
> 1. Use column 1 of predict_proba, not column 0.
> 2. You can just do auc = predicted[:, 1] instead of that loop. Vectorized
>    operations are far more concise and fast.
>
> On Wed, Aug 5, 2015 at 11:54 AM, Herbert Schulz <hrbrt....@gmail.com>
> wrote:
>
>> Maybe I didn't explain it very well, sorry.
>>
>> I just have one column as a target. The last "post" I did was just a
>> conversion of all 0's to 1's and all 1's to 0's. But the auc and the
>> expected values come from the same converted data. So actually it should
>> be
>>
>> auc is [0.9777752710670069, 0.01890450385597026, 0.0059624156214325846,
>> 0.05391726570661811]
>> expected is [0.0, 1.0, 1.0, 1.0]
>>
>> i.e. the auc should have values like 0.97... for entries 2-4 and 0.01...
>> for the first entry.
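[Archive note: Artem's two TL;DR points can be sketched end to end as below. The estimator, dataset, and split are invented for illustration; only the `predicted[:, 1]` column selection and `roc_auc_score` call come from the thread, and the code uses the current scikit-learn API rather than the 2015 one.]

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn import metrics

# Hypothetical binary-classification data, just to have something to score.
X, y = make_classification(n_samples=200, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# predict_proba returns shape (n_samples, n_classes); column 1 holds the
# probability of the positive class, which is what roc_auc_score expects.
predicted = clf.predict_proba(X_test)
scores = predicted[:, 1]  # vectorized: replaces the per-row append loop

roc = metrics.roc_auc_score(y_test, scores)
print(roc)
```

Selecting column 1 by position assumes the positive label sorts last in `clf.classes_`, which holds for 0/1 targets; `clf.classes_` can be checked when the labels are anything else.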
>>
>> predicted = clf.predict_proba(X_test)
>> auc = []
>>
>> for i in range(len(predicted)):
>>     auc.append(predicted[i][0])
>>
>> print "auc is", auc
>> print "expected is", y_test
>> roc = metrics.roc_auc_score(y_test, auc)
>> print roc
>>
>> So there should be a failure in my data preprocessing, or?
>>
>> Or can I just flip the expected vector? I think that would be a good
>> idea if I'm using the normal data.
>>
>> Best
>>
>> On 4 August 2015 at 17:38, Andreas Mueller <t3k...@gmail.com> wrote:
>>
>>> You should select the other column from predict_proba for AUC.
>>>
>>> On 08/04/2015 10:54 AM, Herbert Schulz wrote:
>>>
>>> Thanks for the answer!
>>>
>>> Hmm, it's possible. Here is a little example:
>>>
>>> auc is [0.9777752710670069, 0.01890450385597026, 0.0059624156214325846,
>>> 0.05391726570661811]
>>> expected is [0.0, 1.0, 1.0, 1.0]
>>>
>>> But this is already with changed values; in the test set I turned every
>>> 0 into 1 and every 1 into 0.
>>>
>>> So that is the mistake? It seems I should "flip" the expected vector
>>> y_test?
>>>
>>> On 4 August 2015 at 16:36, Artem <barmaley....@gmail.com> wrote:
>>>
>>>> Hi Herbert,
>>>>
>>>> The worst value for AUC is 0.5, actually. Having values close to 0
>>>> means you could get a value close to 1 just by flipping your
>>>> predictions (predict class 1 when you think it's 0 and vice versa).
>>>> Are you sure you didn't confuse the classes somewhere along the line?
>>>> (You might have chosen the wrong column from predict_proba's result,
>>>> for example.)
>>>>
>>>> On Tue, Aug 4, 2015 at 4:51 PM, Herbert Schulz <hrbrt....@gmail.com>
>>>> wrote:
>>>>
>>>>> Hey,
>>>>>
>>>>> I'm computing the AUC for some data.
>>>>>
>>>>> The classification target is 1 or 0, and I have a lot of 0's (5600)
>>>>> and just 700 1's as a target.
>>>>>
>>>>> My AUC is about 0.097...
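[Archive note: on the "can I just flip the expected vector?" question — in the binary case, flipping y_test while keeping the class-0 column gives exactly the same AUC as keeping y_test and taking the class-1 column, since both swap which class counts as positive. A toy check with invented numbers, not the poster's data:]

```python
from sklearn.metrics import roc_auc_score

y_test = [0, 1, 1, 0, 1]                  # hypothetical labels
proba_class0 = [0.3, 0.2, 0.6, 0.7, 0.4]  # hypothetical predict_proba[:, 0]
proba_class1 = [1 - p for p in proba_class0]

# Flipping the labels while keeping the class-0 scores ...
auc_flipped = roc_auc_score([1 - y for y in y_test], proba_class0)
# ... gives the same AUC as the original labels with the class-1 scores.
auc_col1 = roc_auc_score(y_test, proba_class1)

print(auc_flipped, auc_col1)
```

So flipping y_test "works" numerically, but selecting column 1 is the clean fix: it keeps the labels meaning what they say.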
>>>>>
>>>>> where y_test is a vector containing 1's and 0's and auc contains the
>>>>> predict_proba values:
>>>>>
>>>>> roc = metrics.roc_auc_score(y_test, auc)
>>>>>
>>>>> Actually this value seems way too bad, because my balanced accuracy
>>>>> is about 0.77... so I thought I was maybe doing something wrong.
>>>>>
>>>>> Report:
>>>>>
>>>>>              precision    recall  f1-score   support
>>>>>
>>>>>         0.0       0.95      0.91      0.93       537
>>>>>         1.0       0.49      0.63      0.55        73
>>>>>
>>>>> avg / total       0.89      0.88      0.88       610
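[Archive note: the jump from an AUC of ~0.097 to ~0.9 in this thread follows from a symmetry of roc_auc_score: for two classes, the AUC computed from the class-0 column and the AUC computed from the class-1 column always sum to 1. A minimal check using the four probabilities quoted above:]

```python
from sklearn.metrics import roc_auc_score

# The small example quoted in the thread.
y_test = [0.0, 1.0, 1.0, 1.0]
proba_class0 = [0.9777752710670069, 0.01890450385597026,
                0.0059624156214325846, 0.05391726570661811]
proba_class1 = [1 - p for p in proba_class0]

auc_wrong = roc_auc_score(y_test, proba_class0)  # scores for the wrong class
auc_right = roc_auc_score(y_test, proba_class1)  # scores for the positive class

print(auc_wrong, auc_right)  # the two always sum to 1
```

On these four samples the separation is perfect, so the wrong column scores 0.0 and the right one 1.0; the poster's full test set gives the less extreme pair 0.097 / 0.903 for the same reason.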
------------------------------------------------------------------------------
_______________________________________________
Scikit-learn-general mailing list
Scikit-learn-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general