Github user srowen commented on a diff in the pull request:
https://github.com/apache/spark/pull/7655#discussion_r35480987
--- Diff: docs/mllib-metrics.md ---
@@ -0,0 +1,1464 @@
+---
+layout: global
+title: Evaluation Metrics - MLlib
+displayTitle: <a href="mllib-guide.html">MLlib</a> - Evaluation Metrics
+---
+
+* Table of contents
+{:toc}
+
+
+## Algorithm Metrics
+
+Spark's MLlib comes with a number of machine learning algorithms that can be used to learn from and make predictions
+on data. When applying these algorithms, there is a need to evaluate their performance on certain criteria, depending
+on the application and its requirements. Spark's MLlib also provides a suite of metrics for the purpose of evaluating the
+performance of its algorithms.
+
+Specific machine learning algorithms fall under broader types of machine learning applications like classification,
+regression, clustering, etc. Each of these types has well-established metrics for performance evaluation, and those
+metrics that are currently available in Spark's MLlib are detailed in this section.
+
+## Binary Classification
+
+[Binary classifiers](https://en.wikipedia.org/wiki/Binary_classification) are used to separate the elements of a given
+dataset into one of two possible groups (e.g. fraud or not fraud); binary classification is a special case of multiclass classification.
+Most binary classification metrics can be generalized to multiclass classification metrics.
+
+<table class="table">
+ <thead>
+ <tr><th>Metric</th><th>Definition</th></tr>
+ </thead>
+ <tbody>
+ <tr>
+ <td>Precision (Positive Predictive Value)</td>
--- End diff --
Terms like TP aren't explained, and I don't think they will be obvious to a
reader who turns to this document for an explanation. The mathematical
definitions of things like ROC are OK, but they offer no intuition. I don't
think we need to reproduce a text on what they mean, but could we at least add
a hyperlink to Wikipedia?
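
For illustration only, here is a minimal sketch of the kind of intuition the doc could give for TP and related terms. It is plain Python, not MLlib code; the function names and the sample data are hypothetical. It counts true positives (TP), false positives (FP), true negatives (TN), and false negatives (FN), then derives precision and recall from them.

```python
def confusion_counts(labels, predictions, positive=1):
    """Count (TP, FP, TN, FN) for a binary problem.

    Hypothetical helper for illustration; not part of any Spark API.
    """
    tp = fp = tn = fn = 0
    for actual, predicted in zip(labels, predictions):
        if predicted == positive and actual == positive:
            tp += 1  # predicted positive, actually positive
        elif predicted == positive:
            fp += 1  # predicted positive, actually negative
        elif actual == positive:
            fn += 1  # predicted negative, actually positive
        else:
            tn += 1  # predicted negative, actually negative
    return tp, fp, tn, fn


def precision(tp, fp):
    """Fraction of positive predictions that are correct: TP / (TP + FP)."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(tp, fn):
    """Fraction of actual positives that are found: TP / (TP + FN)."""
    return tp / (tp + fn) if (tp + fn) else 0.0


# Made-up example data: 1 = fraud, 0 = not fraud.
labels = [1, 0, 1, 1, 0, 0, 1, 0]
predictions = [1, 0, 0, 1, 1, 0, 1, 0]

tp, fp, tn, fn = confusion_counts(labels, predictions)
print("TP, FP, TN, FN:", tp, fp, tn, fn)
print("precision:", precision(tp, fp))
print("recall:", recall(tp, fn))
```

Returning 0.0 when a denominator is zero is a choice made for this sketch; other conventions exist for that edge case.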