Repository: spark-website
Updated Branches:
  refs/heads/asf-site 8d5d77c65 -> a82adf043

updated MLlib site for 2.1

Project: http://git-wip-us.apache.org/repos/asf/spark-website/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark-website/commit/057cad18
Tree: http://git-wip-us.apache.org/repos/asf/spark-website/tree/057cad18
Diff: http://git-wip-us.apache.org/repos/asf/spark-website/diff/057cad18

Branch: refs/heads/asf-site
Commit: 057cad18d2bfbc16b11837ca9614927444b7ac47
Parents: 8d5d77c
Author: Joseph K. Bradley <joseph.kurata.brad...@gmail.com>
Authored: Wed Dec 7 12:12:33 2016 -0800
Committer: Joseph K. Bradley <joseph.kurata.brad...@gmail.com>
Committed: Wed Dec 7 12:12:33 2016 -0800

----------------------------------------------------------------------
 mllib/index.md | 30 +++++++++++++++++++++---------
 1 file changed, 21 insertions(+), 9 deletions(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/spark-website/blob/057cad18/mllib/index.md
----------------------------------------------------------------------
diff --git a/mllib/index.md b/mllib/index.md
index 9c43750..bccd603 100644
--- a/mllib/index.md
+++ b/mllib/index.md
@@ -31,7 +31,7 @@ subproject: MLlib
       data = spark.read.format(<span class="string">"libsvm"</span>)\<br/>
       &nbsp;&nbsp;.load(<span class="string">"hdfs://..."</span>)<br/>
       <br/>
-      model = <span class="sparkop">KMeans</span>(data, k=10)
+      model = <span class="sparkop">KMeans</span>(k=10).fit(data)
     </div>
     <div class="caption">Calling MLlib in Python</div>
   </div>
@@ -81,25 +81,37 @@ subproject: MLlib
   <div class="col-md-4 col-padded">
     <h3>Algorithms</h3>
     <p>
-      MLlib contains many algorithms and utilities, including:
+      MLlib contains many algorithms and utilities.
+    </p>
+    <p>
+      ML algorithms include:
     </p>
     <ul class="list-narrow">
       <li>Classification: logistic regression, naive Bayes,...</li>
-      <li>Regression: generalized linear regression, isotonic regression,...</li>
+      <li>Regression: generalized linear regression, survival regression,...</li>
       <li>Decision trees, random forests, and gradient-boosted trees</li>
       <li>Recommendation: alternating least squares (ALS)</li>
       <li>Clustering: K-means, Gaussian mixtures (GMMs),...</li>
       <li>Topic modeling: latent Dirichlet allocation (LDA)</li>
+      <li>Frequent itemsets, association rules, and sequential pattern mining</li>
+    </ul>
+    <p>
+      ML workflow utilities include:
+    </p>
+    <ul class="list-narrow">
       <li>Feature transformations: standardization, normalization, hashing,...</li>
-      <li>Model evaluation and hyper-parameter tuning</li>
       <li>ML Pipeline construction</li>
+      <li>Model evaluation and hyper-parameter tuning</li>
       <li>ML persistence: saving and loading models and Pipelines</li>
-      <li>Survival analysis: accelerated failure time model</li>
-      <li>Frequent itemset and sequential pattern mining: FP-growth, association rules, PrefixSpan</li>
-      <li>Distributed linear algebra: singular value decomposition (SVD), principal component analysis (PCA),...</li>
+    </ul>
+    <p>
+      Other utilities include:
+    </p>
+    <ul class="list-narrow">
+      <li>Distributed linear algebra: SVD, PCA,...</li>
       <li>Statistics: summary statistics, hypothesis testing,...</li>
     </ul>
-    <p>Refer to the <a href="{{site.baseurl}}/docs/latest/mllib-guide.html">MLlib guide</a> for usage examples.</p>
+    <p>Refer to the <a href="{{site.baseurl}}/docs/latest/ml-guide.html">MLlib guide</a> for usage examples.</p>
   </div>
 
   <div class="col-md-4 col-padded">
@@ -126,7 +138,7 @@ subproject: MLlib
     </p>
     <ul class="list-narrow">
       <li><a href="{{site.baseurl}}/downloads.html">Download Spark</a>. MLlib is included as a module.</li>
-      <li>Read the <a href="{{site.baseurl}}/docs/latest/mllib-guide.html">MLlib guide</a>, which includes
+      <li>Read the <a href="{{site.baseurl}}/docs/latest/ml-guide.html">MLlib guide</a>, which includes
         various usage examples.</li>
       <li>Learn how to <a href="{{site.baseurl}}/docs/latest/#launching-on-a-cluster">deploy</a> Spark on a cluster
         if you'd like to run in distributed mode. You can also run locally on a multicore machine