[GitHub] spark pull request: [SPARK-6129][MLLIB][DOCS] Added user guide for...

sethah Wed, 29 Jul 2015 17:53:17 -0700

Github user sethah commented on a diff in the pull request:

    https://github.com/apache/spark/pull/7655#discussion_r35829230
  
    --- Diff: docs/mllib-evaluation-metrics.md ---
    @@ -0,0 +1,1476 @@
    +---
    +layout: global
    +title: Evaluation Metrics - MLlib
    +displayTitle: <a href="mllib-guide.html">MLlib</a> - Evaluation Metrics
    +---
    +
    +* Table of contents
    +{:toc}
    +
    +
    +## Algorithm Metrics
    +
    +Spark's MLlib comes with a number of machine learning algorithms that can be used to learn from and make predictions
    +on data. When these algorithms are applied to build machine learning models, there is a need to evaluate the performance
    +of the model on some criteria, which depends on the application and its requirements. Spark's MLlib also provides a
    +suite of metrics for the purpose of evaluating the performance of machine learning models.
    +
    +Specific machine learning algorithms fall under broader types of machine learning applications like classification,
    +regression, clustering, etc. Each of these types have well established metrics for performance evaluation and those
    +metrics that are currently available in Spark's MLlib are detailed in this section.
    +
    +## Classification Model Evaluation
    +
    +While there are many different types of classification algorithms, the evaluation of classification models all share
    +similar principles. In a [supervised classification problem](https://en.wikipedia.org/wiki/Statistical_classification),
    +there exists a true output and a model-generated predicted output for each data point. For this reason, the results for
    +each data point can be assigned to one of four categories:
    +
    +* True Positive (TP) - class predicted by model and class in true output
    --- End diff --
    
    I switched the explanations to what you have above. I also added some explanations to the multiclass section to define what positive and negative labels mean. It may be worth having someone besides me take a look to make sure it's not confusing.
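
For readers of this thread, the four categories that the quoted diff starts to list can be illustrated with a small sketch. This is plain Python, not MLlib code: the function name `confusion_counts`, the `positive` parameter, and the sample pairs are hypothetical and chosen only for illustration, assuming a binary problem where each data point is a (prediction, label) pair.

```python
def confusion_counts(pairs, positive=1.0):
    """Count TP/FP/TN/FN over (prediction, label) pairs for a binary problem.

    `positive` is the label value treated as the positive class (an assumption
    made for this sketch; every other value is treated as negative).
    """
    counts = {"TP": 0, "FP": 0, "TN": 0, "FN": 0}
    for prediction, label in pairs:
        predicted_positive = prediction == positive
        actually_positive = label == positive
        if predicted_positive and actually_positive:
            counts["TP"] += 1   # predicted positive, truly positive
        elif predicted_positive:
            counts["FP"] += 1   # predicted positive, truly negative
        elif actually_positive:
            counts["FN"] += 1   # predicted negative, truly positive
        else:
            counts["TN"] += 1   # predicted negative, truly negative
    return counts


# Hypothetical sample data: (prediction, label)
sample = [(1.0, 1.0), (1.0, 0.0), (0.0, 1.0), (0.0, 0.0), (1.0, 1.0)]
counts = confusion_counts(sample)
precision = counts["TP"] / (counts["TP"] + counts["FP"])
recall = counts["TP"] / (counts["TP"] + counts["FN"])
```

In the multiclass case the same counting would presumably be repeated per class, treating one class as "positive" and all others as "negative", which is the definition the comment above says was added to the multiclass section.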


