Thank you! That helped me a lot!!!

On 5 August 2015 at 11:23, Artem <barmaley....@gmail.com> wrote:

>>         for i in range(len(predicted)):
>>             auc.append(predicted[i][0])
>
>
> This is the source of the error. predict_proba returns a matrix (a numpy
> array, to be precise) of shape (n_samples, n_classes). Obviously, in your
> case n_classes = 2.
>
> A cell at a given row and column is the probability that the sample
> corresponding to that row belongs to the class corresponding to that
> column. You are considering only the 0th column (which per se is not a
> problem, since rows always sum to 1), which means that your auc list
> contains probabilities of class 0: the higher the probability, the more
> likely the sample belongs to class 0.
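>
> For instance, a tiny sketch of what that array looks like (the numbers
> here are made up, not from your data):
>
>     probas = clf.predict_proba(X_test)   # shape (n_samples, 2)
>     # one row might look like [0.98, 0.02]:
>     #   probas[i, 0] -> probability that sample i belongs to class 0
>     #   probas[i, 1] -> probability that sample i belongs to class 1
>     # each row sums to 1, and the columns follow the order of clf.classes_
>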
> Now, the documentation
> <http://scikit-learn.org/stable/modules/generated/sklearn.metrics.roc_auc_score.html>
> says (emphasis mine):
>
>> y_score : array, shape = [n_samples] or [n_samples, n_classes]
>> Target scores, can either be probability estimates of the *positive*
>> class, confidence values, or binary decisions.
>
>
> class 0 is not considered positive in any way.
>
> TL;DR
> 1. Use column 1 of predict_proba, not column 0.
> 2. You can just do auc = predicted[:, 1] instead of that loop; vectorized
> operations are much more concise and faster (see the sketch below).
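>
> A minimal sketch of the fix, assuming clf, X_test and y_test are defined
> exactly as in your snippet:
>
>     from sklearn import metrics
>
>     predicted = clf.predict_proba(X_test)
>     # column 1 holds the probability of the positive class (class 1)
>     scores = predicted[:, 1]
>     roc = metrics.roc_auc_score(y_test, scores)
>     print "roc auc:", roc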
>
> On Wed, Aug 5, 2015 at 11:54 AM, Herbert Schulz <hrbrt....@gmail.com>
> wrote:
>
>> Maybe I didn't explain it very well, sorry.
>>
>> I just have one column as the target. In my last "post" I only converted
>> all 0's to 1's and all 1's to 0's. But the auc and the expected values
>> come from the same converted data. So actually it should be
>>
>> auc is [0.9777752710670069, 0.01890450385597026, 0.0059624156214325846,
>> 0.05391726570661811]
>> expected is [0.0, 1.0, 1.0, 1.0]
>>
>> Here the auc should be something like 0.97... for values 2-4 and
>> something like 0.01... for the first value.
>>
>>
>>
>>         predicted = clf.predict_proba(X_test)
>>
>>         auc = []
>>         for i in range(len(predicted)):
>>             # collects column 0 of predict_proba for every sample
>>             auc.append(predicted[i][0])
>>
>>         print "auc is", auc
>>         print "expected is", y_test
>>
>>         roc = metrics.roc_auc_score(y_test, auc)
>>         print roc
>>
>> So is there a mistake somewhere in my data preprocessing?
>>
>> Or can I just flip the expected vector? I think that would be a good idea
>> if I'm using the original data.
>>
>> best
>>
>>
>>
>>
>>
>>
>> On 4 August 2015 at 17:38, Andreas Mueller <t3k...@gmail.com> wrote:
>>
>>> You should select the other column from predict_proba for auc.
>>>
>>>
>>>
>>> On 08/04/2015 10:54 AM, Herbert Schulz wrote:
>>>
>>> Thanks for the answer!
>>>
>>> Hmm, it's possible. I just made a little example:
>>>
>>> auc is [0.9777752710670069, 0.01890450385597026, 0.0059624156214325846,
>>> 0.05391726570661811]
>>> expected is [0.0, 1.0, 1.0, 1.0]
>>> but these are already the changed values; in the test set I set every
>>> value 0 -> 1 and every 1 to 0.
>>>
>>> So is that where the mistake is? It seems that I should "flip" the
>>> expected vector y_test?
>>>
>>> On 4 August 2015 at 16:36, Artem <barmaley....@gmail.com> wrote:
>>>
>>>> Hi Herbert
>>>>
>>>> The worst value for AUC is actually 0.5. Having values close to 0 means
>>>> that you can get a value just as close to 1 by simply flipping your
>>>> predictions (predict class 1 when you think it's 0 and vice versa). Are
>>>> you sure you didn't confuse the classes somewhere along the line? (You
>>>> might have chosen the wrong column from predict_proba's result, for
>>>> example.)
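>>>>
>>>> A quick sketch of that symmetry (the numbers below are made up, not
>>>> taken from your data):
>>>>
>>>>     from sklearn import metrics
>>>>
>>>>     y_true = [0, 1, 1, 1]
>>>>     scores = [0.98, 0.02, 0.01, 0.05]   # high score = "looks like class 0"
>>>>     flipped = [1 - s for s in scores]   # high score = "looks like class 1"
>>>>
>>>>     print metrics.roc_auc_score(y_true, scores)    # 0.0: ranking is perfectly inverted
>>>>     print metrics.roc_auc_score(y_true, flipped)   # 1.0: same information, right way round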
>>>>
>>>> On Tue, Aug 4, 2015 at 4:51 PM, Herbert Schulz <hrbrt....@gmail.com>
>>>> wrote:
>>>>
>>>>> Hey,
>>>>>
>>>>> I'm computing the AUC for some data...
>>>>>
>>>>>
>>>>> The classification target is 1 or 0, and I have a lot of 0's (5600)
>>>>> and just 700 1's as the target.
>>>>>
>>>>> My AUC is about 0.097...
>>>>>
>>>>> where y_test is a vector containing 1's and 0's and auc contains the
>>>>> predict_proba values:
>>>>>
>>>>>     roc = metrics.roc_auc_score(y_test, auc)
>>>>>
>>>>>
>>>>> Actually this value seems way too bad, because my balanced accuracy is
>>>>> about 0.77... I thought that maybe I'm doing something wrong.
>>>>>
>>>>>
>>>>> report:
>>>>>
>>>>>              precision    recall  f1-score   support
>>>>>
>>>>>         0.0       0.95      0.91      0.93       537
>>>>>         1.0       0.49      0.63      0.55        73
>>>>>
>>>>> avg / total       0.89      0.88      0.88       610
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>>
>>>
>>>
>>>
>>
>>
>>
>
>
>
------------------------------------------------------------------------------
_______________________________________________
Scikit-learn-general mailing list
Scikit-learn-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
