GitHub user sethah commented on the pull request:
https://github.com/apache/spark/pull/7655#issuecomment-125345480
Sean,
I added a bit of background to the guide on things like TP, FP, precision,
recall, ROC, etc. I tried to explain the base concepts for classification,
since the metrics for the different flavors of classification algorithm
basically just extend the basic ideas of precision, recall, etc. I also added
an explanation of each of the ranking metrics, since those are not as well
defined or as easy to find on the internet. Additionally, I added hyperlinks to
further reading on these topics on Wikipedia.
I left all the math definitions in; I wasn't sure whether you were suggesting
that we keep only some of them or just supplement them with explanations. I
didn't think it made much sense to define some of them but not all. Also, I
find it useful to see mathematical representations of what the algorithms take
as parameters. Let me know if you'd like to remove more of the math (the
notation can get a bit heavy) or if you think the text is too wordy.
Thanks!