Dear Joel,
Sorry for the delay. I was on a trip and couldn't check my email.
To the best of my knowledge, and according to the kind responses in this
email thread, we cannot claim that a specific measure is better than the
others for imbalanced data. In other words, there are several evaluation
measures suitable for imbalanced data, and each of them has its own
advantages. Hence, choosing the best evaluation measure depends entirely
on the application you are working on.
That said, "Matthews Correlation Coefficient", "AUC of ROC", "F-measure",
"Balanced Accuracy", and more generally "Weighted Accuracy" have been used
frequently in the literature. Among these measures, only "balanced
accuracy" is not implemented in scikit-learn, and I think it is worthwhile
to add it to the library. I have implemented it before, and if you want I
can add it to the project or send it to you.
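
For concreteness, a minimal sketch of balanced accuracy and G-Mean, as
defined further down in this thread, could look like the following. It is
only an illustration in plain Python, not the implementation mentioned
above; the function names (class_rates, balanced_accuracy, g_mean) and the
toy labels are assumptions made up for this example.

```python
import math


def class_rates(y_true, y_pred, positive=1):
    """Return (TP_rate, TN_rate) for a binary problem."""
    tp = fn = tn = fp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            if pred == positive:
                tp += 1
            else:
                fn += 1
        else:
            if pred == positive:
                fp += 1
            else:
                tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must occur in y_true")
    return tp / (tp + fn), tn / (tn + fp)


def balanced_accuracy(y_true, y_pred, positive=1):
    """BA = (TP_rate + TN_rate) / 2, the arithmetic mean of both rates."""
    tp_rate, tn_rate = class_rates(y_true, y_pred, positive)
    return (tp_rate + tn_rate) / 2


def g_mean(y_true, y_pred, positive=1):
    """G-Mean = sqrt(TP_rate * TN_rate), the geometric mean of both rates."""
    tp_rate, tn_rate = class_rates(y_true, y_pred, positive)
    return math.sqrt(tp_rate * tn_rate)


# Hypothetical imbalanced toy data: 8 negatives, 2 positives.
y_true = [0] * 8 + [1] * 2
# 6 of 8 negatives and 1 of 2 positives are predicted correctly.
y_pred = [0] * 6 + [1] * 2 + [0, 1]

tp_rate, tn_rate = class_rates(y_true, y_pred)
fp_rate = 1 - tn_rate

print("balanced accuracy:", balanced_accuracy(y_true, y_pred))
print("one-point 'AUC'  :", (1 + tp_rate - fp_rate) / 2)  # same value as BA
print("G-Mean           :", g_mean(y_true, y_pred))
```

On this toy data plain accuracy is 7/10, while balanced accuracy weights
both classes equally regardless of their sizes, which is why it is
attractive for imbalanced problems.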
Kind Regards,
Hamed
On Fri, Jul 25, 2014 at 6:19 AM, Joel Nothman <joel.noth...@gmail.com>
wrote:
> What is the outcome of this discussion for scikit-learn?
>
> - Would someone be interested in improving the documentation
> <http://scikit-learn.org/dev/modules/model_evaluation.html#classification-metrics>
> to highlight the merits and problems with each metric?
> - Are there metrics (e.g. balanced accuracy) that scikit-learn should
> include, due to their utility and long-standing popularity in the
> literature, but does not?
>
>
>
> On 24 July 2014 23:49, Hamed Zamani <hamedzam...@acm.org> wrote:
>
>> Dear all,
>>
>> Thank you very much for all of your quick and informative answers. The
>> papers you pointed me to really helped me.
>>
>> Cheers,
>> Hamed
>>
>>
>>
>> On Thu, Jul 24, 2014 at 1:42 AM, Dayvid Victor <victor.d...@gmail.com>
>> wrote:
>>
>>> Wow, I didn't know that. I've seen so many publications using this
>>> approximation and calling it AUC (including that survey I sent), and
>>> I've used it that way in publications too. But it is always good to
>>> know the correct terms.
>>>
>>> Thanks,
>>>
>>>
>>> On Wed, Jul 23, 2014 at 8:32 PM, Mario Michael Krell <
>>> kr...@uni-bremen.de> wrote:
>>>
>>>> Dayvid, as I said, this metric should be called "balanced accuracy"
>>>> (BA) to avoid confusion with the real AUC of the ROC curve, as stated
>>>> in the given reference. My autocorrection also got in the way earlier:
>>>> it should read 1 - FP_rate = TN_rate, and BA = (TP_rate + TN_rate)/2.
>>>> It is not "another" metric but the same evaluation metric as your formula.
>>>>
>>>> Calling this AUC might create wrong expectations, as happened for
>>>> example in the old WEKA implementation, which calculated the BA rather
>>>> than the real AUC. Sometimes "simplified AUC" or "one point AUC" is
>>>> used, too, but BA is shorter and avoids misunderstandings.
>>>>
>>>> The concept of balanced accuracy can also be generalized to an
>>>> arbitrary number of classes. If the geometric mean is used instead of
>>>> the arithmetic mean, the metric is called "G-Mean". It is a little less
>>>> intuitive but more sensitive to differences between the rates, and it
>>>> can be generalized to more than two classes as well.
>>>>
>>>> G-Mean = sqrt(TP_rate x TN_rate)
>>>>
>>>> On 23.07.2014, at 19:46, Dayvid Victor <victor.d...@gmail.com> wrote:
>>>>
>>>> Mario, as I said, the correct formula would be:
>>>>
>>>> - AUC = (1 + TP_rate - FP_rate) / 2
>>>>
>>>>
>>>> But you are also right, that is another evaluation metric stated in
>>>> those references I sent!
>>>>
>>>>
>>>>
>>>>
>>>> On Wed, Jul 23, 2014 at 2:06 PM, Mario Michael Krell <
>>>> kr...@uni-bremen.de> wrote:
>>>>
>>>>> 1-FN_rate = TN_rate
>>>>>
>>>>> Consequently, (1 + TP_rate - FN_rate)/ 2 should be named "Balanced
>>>>> Accuracy" to avoid misunderstandings. Nevertheless, it is a good choice.
>>>>>
>>>>>
>>>>> On 23.07.2014, at 18:57, Dayvid Victor <victor.d...@gmail.com> wrote:
>>>>>
>>>>>
>>>>> Or you might use the trapezoid approximation: auc = (1 + TP_rate -
>>>>> FN_rate)/ 2
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> ------------------------------------------------------------------------------
>>>>> Want fast and easy access to all the code in your enterprise? Index and
>>>>> search up to 200,000 lines of code with a free copy of Black Duck
>>>>> Code Sight - the same software that powers the world's largest code
>>>>> search on Ohloh, the Black Duck Open Hub! Try it now.
>>>>> http://p.sf.net/sfu/bds
>>>>> _______________________________________________
>>>>> Scikit-learn-general mailing list
>>>>> Scikit-learn-general@lists.sourceforge.net
>>>>> https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
>>>>>
>>>>>
>>>>
>>>>
>>>> --
>>>> *Dayvid Victor R. de Oliveira*
>>>> PhD Candidate in Computer Science at Federal University of Pernambuco
>>>> (UFPE)
>>>> MSc in Computer Science at Federal University of Pernambuco (UFPE)
>>>> BSc in Computer Engineering - Federal University of Pernambuco (UFPE)
>>>>
>>>>
>>>>
>>>
>>>
>>> --
>>> *Dayvid Victor R. de Oliveira*
>>> PhD Candidate in Computer Science at Federal University of Pernambuco
>>> (UFPE)
>>> MSc in Computer Science at Federal University of Pernambuco (UFPE)
>>> BSc in Computer Engineering - Federal University of Pernambuco (UFPE)
>>>
>>>
>>>
>>>
>>
>>
>>
>>
>
>
>
>