[GitHub] spark pull request #20446: [SPARK-23254][ML] Add user guide entry for DataFr...

MLnick Thu, 01 Feb 2018 06:00:15 -0800

Github user MLnick commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20446#discussion_r165362148
  
    --- Diff: docs/ml-statistics.md ---
    @@ -89,4 +89,26 @@ Refer to the [`ChiSquareTest` Python docs](api/python/index.html#pyspark.ml.stat
     {% include_example python/ml/chi_square_test_example.py %}
     </div>
     
    +</div>
    +
    +## Summarizer
    +
    +We provide vector column summary statistics for `Dataframe` through `Summarizer`.
    +Available metrics contain the column-wise max, min, mean, variance, and number of nonzeros, as well as the total count.
    +
    +<div class="codetabs">
    +<div data-lang="scala" markdown="1">
    +[`Summarizer`](api/scala/index.html#org.apache.spark.ml.stat.Summarizer$)
    --- End diff --
    
    Perhaps "The following example demonstrates using `Summarizer`(...) to 
compute the mean and variance for the input dataframe, with and without a 
weight column"?
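
To make "with and without a weight column" concrete, here is a minimal plain-Python sketch of the two statistics named above. It does not use the Spark API. The function name `weighted_mean_and_variance` and the sample rows are hypothetical, and the variance denominator (sum of weights minus sum of squared weights over sum of weights) is an assumption about the estimator, not something stated in this thread.

```python
from math import fsum


def weighted_mean_and_variance(rows, weights=None):
    """Column-wise mean and variance for equal-length numeric rows.

    weights=None gives every row weight 1.0, i.e. the case without a
    weight column. Hypothetical helper for illustration only.
    """
    if weights is None:
        weights = [1.0] * len(rows)
    total = fsum(weights)
    if total <= 0:
        raise ValueError("total weight must be positive")
    # Assumed denominator: reduces to n - 1 when every weight is 1.
    denom = total - fsum(w * w for w in weights) / total
    means, variances = [], []
    for col in zip(*rows):  # iterate over columns, not rows
        mean = fsum(w * x for w, x in zip(weights, col)) / total
        sq_dev = fsum(w * (x - mean) ** 2 for w, x in zip(weights, col))
        means.append(mean)
        variances.append(sq_dev / denom if denom > 0 else 0.0)
    return means, variances


rows = [[1.0, 0.0], [2.0, 0.0], [3.0, 6.0]]
unweighted = weighted_mean_and_variance(rows)
weighted = weighted_mean_and_variance(rows, weights=[1.0, 1.0, 2.0])
print(unweighted)
print(weighted)
```

Doubling the weight of the last row pulls both column means toward it, which is the difference the suggested sentence asks the example to show.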


