Hello all,

Congratulations on the release of 0.20! My questions are about the updated classification_report:
http://scikit-learn.org/stable/modules/generated/sklearn.metrics.classification_report.html

Here is the simple example shown in the documentation:

>>> from sklearn.metrics import classification_report
>>> y_true = [0, 1, 2, 2, 2]
>>> y_pred = [0, 0, 2, 2, 1]
>>> target_names = ['class 0', 'class 1', 'class 2']
>>> print(classification_report(y_true, y_pred, target_names=target_names))
              precision    recall  f1-score   support

     class 0       0.50      1.00      0.67         1
     class 1       0.00      0.00      0.00         1
     class 2       1.00      0.67      0.80         3

   micro avg       0.60      0.60      0.60         5
   macro avg       0.50      0.56      0.49         5
weighted avg       0.70      0.60      0.61         5

I understand how macro average and weighted average are calculated. My questions are in regard to micro average:

1. From this and other examples, it appears to me that "micro average" is identical to classification accuracy. Is that correct?

2. Is there a reason that micro average is listed three times (under the precision, recall, and f1-score columns)? From my understanding, that 0.60 number is being calculated once but is being displayed three times. The display implies (at least in my mind) that 0.60 is being calculated from the three precision numbers, and separately calculated from the three recall numbers, and separately calculated from the three f1-score numbers, which seems misleading.

3. The documentation explains micro average as "averaging the total true positives, false negatives and false positives". If my understanding is correct that micro average is the same as accuracy, then why are true negatives any less relevant to the calculation? (Also, I don't mean to be picky, but "true positives" etc. are whole number counts rather than rates, and so it seems odd to say that you are arriving at a rate by averaging counts.)

These may be dumb questions arising from my ignorance... my apologies if so! As well, I don't mean for my questions to criticize the excellent work that has been done by all of the scikit-learn contributors - I deeply appreciate your work!
Rather, I'm planning to create a video series explaining some of the new features in 0.20, and I want to make sure that I'm accurately explaining these new features.

Thanks very much!

Kevin

--
Kevin Markham
Founder, Data School
https://www.dataschool.io
https://www.youtube.com/dataschool
https://www.patreon.com/dataschool
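
P.S. In case it helps frame question 1, here is a rough hand check of the micro-averaged numbers for the documentation example above. It is only a sketch: it does not call scikit-learn at all, the variable names (tp, fp, fn, micro_precision, and so on) are my own, and it assumes that "micro average" means pooling the true positive, false positive, and false negative counts over all classes before computing each score. I have only checked it on this one example.

```python
# Hand check of micro-averaged precision, recall, and F1 for the example
# from the classification_report documentation. Pure Python, no scikit-learn.
y_true = [0, 1, 2, 2, 2]
y_pred = [0, 0, 2, 2, 1]
labels = sorted(set(y_true) | set(y_pred))

# Pool the per-class counts over all classes (one-vs-rest for each label).
tp = fp = fn = 0
for label in labels:
    for t, p in zip(y_true, y_pred):
        if p == label and t == label:
            tp += 1  # predicted this class, and it was this class
        elif p == label and t != label:
            fp += 1  # predicted this class, but it was another class
        elif p != label and t == label:
            fn += 1  # it was this class, but another class was predicted

micro_precision = tp / (tp + fp)
micro_recall = tp / (tp + fn)
micro_f1 = (2 * micro_precision * micro_recall
            / (micro_precision + micro_recall))

# Plain classification accuracy, for comparison.
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

print(f"tp={tp} fp={fp} fn={fn}")
print(f"micro precision={micro_precision:.2f} "
      f"recall={micro_recall:.2f} f1={micro_f1:.2f}")
print(f"accuracy={accuracy:.2f}")
```

On this example, each of the two misclassified samples is counted once as a false positive (for the predicted class) and once as a false negative (for the true class), so the pooled counts come out as tp=3, fp=2, fn=2, and all three micro scores and the accuracy round to 0.60.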
_______________________________________________
scikit-learn mailing list
scikit-learn@python.org
https://mail.python.org/mailman/listinfo/scikit-learn