GitHub user feynmanliang commented on a diff in the pull request:

    https://github.com/apache/spark/pull/8517#discussion_r38265268
  
    --- Diff: docs/ml-guide.md ---
    @@ -24,61 +24,74 @@ title: Spark ML Programming Guide
     The `spark.ml` package aims to provide a uniform set of high-level APIs built on top of
     [DataFrames](sql-programming-guide.html#dataframes) that help users create and tune practical
     machine learning pipelines.
    -See the [Algorithm Guides section](#algorithm-guides) below for guides on sub-packages of
    +See the [algorithm guides](#algorithm-guides) section below for guides on sub-packages of
     `spark.ml`, including feature transformers unique to the Pipelines API, ensembles, and more.
     
    -**Table of Contents**
    +**Table of contents**
     
     * This will become a table of contents (this text will be scraped).
     {:toc}
     
    -# Main Concepts
    +# Main concepts
     
    -Spark ML standardizes APIs for machine learning algorithms to make it easier to combine multiple algorithms into a single pipeline, or workflow.  This section covers the key concepts introduced by the Spark ML API.
    +Spark ML standardizes APIs for machine learning algorithms to make it easier to combine multiple
    +algorithms into a single pipeline, or workflow.
    +This section covers the key concepts introduced by the Spark ML API, where the pipeline concept is
    +mostly inspired by the [scikit-learn](http://scikit-learn.org/) project.
     
    -* **[ML Dataset](ml-guide.html#ml-dataset)**: Spark ML uses the [`DataFrame`](api/scala/index.html#org.apache.spark.sql.DataFrame) from Spark SQL as a dataset which can hold a variety of data types.
    -E.g., a dataset could have different columns storing text, feature vectors, true labels, and predictions.
    +* **[`DataFrame`](ml-guide.html#dataframe)**: Spark ML uses `DataFrame` from Spark SQL as an ML
    +  dataset, which can hold a variety of data types.
    +  E.g., a `DataFrame` could have different columns storing text, feature vectors, true labels, and predictions.
     
     * **[`Transformer`](ml-guide.html#transformers)**: A `Transformer` is an algorithm which can transform one `DataFrame` into another `DataFrame`.
    -E.g., an ML model is a `Transformer` which transforms an RDD with features into an RDD with predictions.
    +E.g., an ML model is a `Transformer` which transforms `DataFrame` with features into a `DataFrame` with predictions.
     
     * **[`Estimator`](ml-guide.html#estimators)**: An `Estimator` is an algorithm which can be fit on a `DataFrame` to produce a `Transformer`.
    -E.g., a learning algorithm is an `Estimator` which trains on a dataset and produces a model.
    +E.g., a learning algorithm is an `Estimator` which trains on a `DataFrame` and produces a model.
     
     * **[`Pipeline`](ml-guide.html#pipeline)**: A `Pipeline` chains multiple `Transformer`s and `Estimator`s together to specify an ML workflow.
     
    -* **[`Param`](ml-guide.html#parameters)**: All `Transformer`s and `Estimator`s now share a common API for specifying parameters.
    +* **[`Parameter`](ml-guide.html#parameters)**: All `Transformer`s and `Estimator`s now share a common API for specifying parameters.
     
    -## ML Dataset
    +## DataFrame
     
     Machine learning can be applied to a wide variety of data types, such as vectors, text, images, and structured data.
    -Spark ML adopts the [`DataFrame`](api/scala/index.html#org.apache.spark.sql.DataFrame) from Spark SQL in order to support a variety of data types under a unified Dataset concept.
    +Spark ML adopts the `DataFrame` from Spark SQL in order to support a variety of data types.
    --- End diff --
    
    nit: `spark.ml`

