What is the outcome of this discussion for scikit-learn?
- Would someone be interested in improving the documentation
<http://scikit-learn.org/dev/modules/model_evaluation.html#classification-metrics>
to highlight the merits and problems of each metric?
- Are there metrics (e.g. balanced accuracy) that scikit-learn should
include, due to their utility and long-standing popularity in the
literature, but does not?
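
To make the quantities from the quoted thread below concrete, here is a
minimal sketch in plain Python. The function names (rates,
balanced_accuracy, g_mean, one_point_auc) and the toy labels are my own
illustrative choices, not scikit-learn API. It assumes binary labels and
that both classes occur in y_true.

```python
from math import sqrt


def rates(y_true, y_pred, positive=1):
    """Return (TP_rate, TN_rate) for binary labels.

    Assumes y_true contains at least one positive and one negative
    example; otherwise a division by zero occurs.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    return tp / (tp + fn), tn / (tn + fp)


def balanced_accuracy(y_true, y_pred, positive=1):
    """BA = (TP_rate + TN_rate) / 2."""
    tp_rate, tn_rate = rates(y_true, y_pred, positive)
    return (tp_rate + tn_rate) / 2


def g_mean(y_true, y_pred, positive=1):
    """G-Mean = sqrt(TP_rate * TN_rate)."""
    tp_rate, tn_rate = rates(y_true, y_pred, positive)
    return sqrt(tp_rate * tn_rate)


def one_point_auc(y_true, y_pred, positive=1):
    """Trapezoid through a single ROC point: (1 + TP_rate - FP_rate) / 2."""
    tp_rate, tn_rate = rates(y_true, y_pred, positive)
    fp_rate = 1 - tn_rate
    return (1 + tp_rate - fp_rate) / 2


# Made-up, imbalanced toy data: 2 positives, 8 negatives.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 0, 0, 1]

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print("accuracy          ", accuracy)
print("balanced accuracy ", balanced_accuracy(y_true, y_pred))
print("G-Mean            ", g_mean(y_true, y_pred))
print("one-point 'AUC'   ", one_point_auc(y_true, y_pred))
```

On this toy data the plain accuracy is 0.8 while the balanced accuracy is
0.6875, and the one-point trapezoid value coincides with the balanced
accuracy, which is the point made in the thread: it is the same metric,
not the AUC of a full ROC curve.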
On 24 July 2014 23:49, Hamed Zamani <hamedzam...@acm.org> wrote:
> Dear all,
>
> Thank you very much for all of your quick and informative answers. The
> papers you pointed me to have been really helpful.
>
> Cheers,
> Hamed
>
>
>
> On Thu, Jul 24, 2014 at 1:42 AM, Dayvid Victor <victor.d...@gmail.com>
> wrote:
>
>> Wow, I didn't know that. I've seen many publications use this
>> approximation and call it AUC (including the survey I sent), and I have
>> used it that way in publications myself. But it is always good to know
>> the correct terms.
>>
>> Thanks,
>>
>>
>> On Wed, Jul 23, 2014 at 8:32 PM, Mario Michael Krell
>> <kr...@uni-bremen.de> wrote:
>>
>>> Dayvid, as I said, this metric should be called "balanced accuracy" (BA)
>>> to avoid confusion with the real AUC from the ROC curve, as stated
>>> in the given reference. I also had my autocorrection on: 1 - FP_rate =
>>> TN_rate and BA = (TP_rate+TN_rate)/2. It is not "another" evaluation
>>> metric but the same one as your formula.
>>>
>>> Calling this AUC can create wrong expectations, as happened, for example,
>>> in the old WEKA implementation, which calculated the BA rather than the
>>> real AUC. Sometimes "simplified AUC" or "one point AUC" is used, too, but
>>> BA is shorter and avoids misunderstandings.
>>>
>>> The concept of balanced accuracy can also be generalized to an
>>> arbitrary number of classes. If the geometric mean is used instead of the
>>> arithmetic mean, the metric is called "G-Mean". It is a little less
>>> intuitive but is more sensitive to differences between the rates, and it
>>> can be generalized to more than two classes as well.
>>>
>>> G-Mean = sqrt(TP_rate x TN_rate)
>>>
>>> On 23.07.2014, at 19:46, Dayvid Victor <victor.d...@gmail.com> wrote:
>>>
>>> Mario, as I said, the correct formula would be:
>>>
>>> - AUC = (1 + TP_rate - FP_rate) / 2
>>>
>>>
>>> But you are also right, that is another evaluation metric stated in
>>> those references I sent!
>>>
>>>
>>>
>>>
>>> On Wed, Jul 23, 2014 at 2:06 PM, Mario Michael Krell <
>>> kr...@uni-bremen.de> wrote:
>>>
>>>> 1-FN_rate = TN_rate
>>>>
>>>> Consequently, (1 + TP_rate - FN_rate)/2 should be named "Balanced
>>>> Accuracy" to avoid misunderstandings. Nevertheless, it is a good choice.
>>>>
>>>>
>>>> On 23.07.2014, at 18:57, Dayvid Victor <victor.d...@gmail.com> wrote:
>>>>
>>>>
>>>> Or you might use the trapezoid approximation: auc = (1 + TP_rate -
>>>> FN_rate)/2
>>>>
>>>>
>>>>
>>>>
>>>> ------------------------------------------------------------------------------
>>>> Want fast and easy access to all the code in your enterprise? Index and
>>>> search up to 200,000 lines of code with a free copy of Black Duck
>>>> Code Sight - the same software that powers the world's largest code
>>>> search on Ohloh, the Black Duck Open Hub! Try it now.
>>>> http://p.sf.net/sfu/bds
>>>> _______________________________________________
>>>> Scikit-learn-general mailing list
>>>> Scikit-learn-general@lists.sourceforge.net
>>>> https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
>>>>
>>>>
>>>
>>>
>>> --
>>> *Dayvid Victor R. de Oliveira*
>>> PhD Candidate in Computer Science at Federal University of Pernambuco
>>> (UFPE)
>>> MSc in Computer Science at Federal University of Pernambuco (UFPE)
>>> BSc in Computer Engineering - Federal University of Pernambuco (UFPE)
>>>
>>
>>
>>
>>
>>
>
>