I haven’t seen this in practice yet either. A colleague was looking for it in scikit-learn recently and asked me whether it is implemented. I couldn’t find anything in the docs and was just curious about your opinion. However, I just found this entry on Wikipedia:
https://en.wikipedia.org/wiki/Accuracy_and_precision

> Another useful performance measure is the balanced accuracy[10] which avoids
> inflated performance estimates on imbalanced datasets. It is defined as the
> arithmetic mean of sensitivity and specificity, or the average accuracy
> obtained on either class:

> Am I right in thinking that in the binary case, this is identical to
> accuracy?

I think it would only be equal to the “accuracy” if the class labels are uniformly distributed.

> I'm not sure what this metric is getting at.

I have to think about this more, but I think it may be useful for imbalanced datasets where you want to emphasize the minority class. E.g., let’s say we have a dataset of 120 samples and three class labels 1, 2, 3, distributed like this:

- 10 x class 1
- 50 x class 2
- 60 x class 3

Now, let’s assume we have a model that makes the following predictions:

- it gets 0 out of 10 from class 1 right
- 45 out of 50 from class 2
- 55 out of 60 from class 3

So, the accuracy would then be computed as

(0 + 45 + 55) / 120 = 0.833

But the “balanced accuracy” would be much lower, because the model did really badly on class 1, i.e.,

(0/10 + 45/50 + 55/60) / 3 = 0.61

(A quick numpy sketch reproducing these numbers is at the bottom of this message.)

Hm, if I see this correctly, this is similar in spirit to the F1 score. But whereas the F1 score is the harmonic mean of precision and the true positive rate (recall), the balanced accuracy is the arithmetic mean of the true positive rate and the true negative rate, i.e., the average per-class accuracy in the binary case.

> On Mar 8, 2016, at 6:40 PM, Joel Nothman <joel.noth...@gmail.com> wrote:
>
> I've not seen this metric used (references?). Am I right in thinking that in
> the binary case, this is identical to accuracy? If I predict all elements to
> be the majority class, then adding more minority classes into the problem
> increases my score. I'm not sure what this metric is getting at.
>
> On 8 March 2016 at 11:57, Sebastian Raschka <se.rasc...@gmail.com> wrote:
> Hi,
>
> I was just wondering why there’s no support for the average per-class
> accuracy in the scorer functions (if I am not overlooking something).
> E.g., we have 'f1_macro', 'f1_micro', 'f1_samples', 'f1_weighted' but I
> didn’t see an 'accuracy_macro', i.e.,
>
> (acc.class_1 + acc.class_2 + … + acc.class_n) / n
>
> Would you discourage its usage (in favor of other metrics in imbalanced class
> problems) or was it simply not implemented, yet?
>
> Best,
> Sebastian
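PS: For concreteness, here is a quick sketch that reproduces the numbers from the example above with plain numpy. The y_true/y_pred arrays are made up to match the 10/50/60 scenario; this is a hand-rolled computation, not an existing scikit-learn scorer:

import numpy as np

# Made-up labels matching the example: 10 samples of class 1,
# 50 of class 2, 60 of class 3.
y_true = np.array([1] * 10 + [2] * 50 + [3] * 60)

# Predictions matching the scenario: 0/10 correct for class 1,
# 45/50 for class 2, 55/60 for class 3 (which wrong label gets
# predicted doesn't matter for these two metrics).
y_pred = np.array([2] * 10 + [2] * 45 + [3] * 5 + [3] * 55 + [2] * 5)

# Standard accuracy: fraction of all 120 samples classified correctly.
acc = np.mean(y_true == y_pred)

# Average per-class accuracy ("balanced accuracy"): mean of the
# per-class recalls, so each class counts equally regardless of size.
classes = np.unique(y_true)
per_class = [np.mean(y_pred[y_true == c] == c) for c in classes]
balanced_acc = np.mean(per_class)

print(acc)           # 0.8333...
print(balanced_acc)  # 0.6055...

Unless I’m overlooking something, this average per-class accuracy is just macro-averaged recall, so sklearn.metrics.recall_score(y_true, y_pred, average='macro') should return the same 0.61.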