Github user xiejuncs commented on the pull request:

    https://github.com/apache/spark/pull/1155#issuecomment-46772612
  
    Nice work.
    
    I am reading the implementation of MulticlassMetrics. According to your 
code, for the micro average you compute recall and then set precision and the 
F1 measure equal to that recall. I am not sure this makes sense.
    
    According to this post: 
http://rushdishams.blogspot.com/2011/08/micro-and-macro-average-of-precision.html
    
    Assume we have just three classes. For each class we have three counts: 
true positives (tp), false positives (fp), and false negatives (fn). So we 
have tp1, fp1, and fn1 for class 1, and so on.
    
    Micro-averaged precision: (tp1 + tp2 + tp3) / (tp1 + tp2 + tp3 + fp1 + fp2 + fp3)
    Micro-averaged recall: (tp1 + tp2 + tp3) / (tp1 + tp2 + tp3 + fn1 + fn2 + fn3)
    Micro-averaged F1 measure: the harmonic mean of micro-averaged precision and recall.
    
    Based on the definitions above, recall and precision should not in general 
be the same. Is that correct?
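    
    To make the definitions concrete, here is a minimal sketch in plain Scala 
(not the MulticlassMetrics API); the object name and the per-class counts are 
made up purely for illustration:
    
    ```scala
    object MicroAverageSketch {
      def main(args: Array[String]): Unit = {
        // Hypothetical (tp, fp, fn) counts for classes 1, 2, 3.
        val counts = Seq((10.0, 5.0, 2.0), (8.0, 2.0, 6.0), (12.0, 1.0, 4.0))
    
        // Sum the counts over all classes before computing the ratios.
        val tp = counts.map(_._1).sum  // 30
        val fp = counts.map(_._2).sum  // 8
        val fn = counts.map(_._3).sum  // 12
    
        val microPrecision = tp / (tp + fp)  // 30 / 38 ~= 0.789
        val microRecall    = tp / (tp + fn)  // 30 / 42 ~= 0.714
    
        // Harmonic mean of micro precision and micro recall.
        val microF1 =
          2 * microPrecision * microRecall / (microPrecision + microRecall)
    
        println(f"micro precision = $microPrecision%.4f")
        println(f"micro recall    = $microRecall%.4f")
        println(f"micro f1        = $microF1%.4f")
      }
    }
    ```
    
    With these assumed counts the summed fp and fn differ, so the three 
micro-averaged values come out different, which is the case the formulas above 
describe.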

