spark git commit: [SPARK-18793][SPARK-18794][R] add spark.randomForest/spark.gbt to vignettes

2016-12-13 Thread meng
Repository: spark
Updated Branches:
  refs/heads/branch-2.1 25b97589e -> 5693ac8e5


[SPARK-18793][SPARK-18794][R] add spark.randomForest/spark.gbt to vignettes

## What changes were proposed in this pull request?

Mention `spark.randomForest` and `spark.gbt` in vignettes. Keep the content 
minimal since users can type `?spark.randomForest` to see the full doc.

cc: jkbradley

Author: Xiangrui Meng 

Closes #16264 from mengxr/SPARK-18793.

(cherry picked from commit 594b14f1ebd0b3db9f630e504be92228f11b4d9f)
Signed-off-by: Xiangrui Meng 


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/5693ac8e
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/5693ac8e
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/5693ac8e

Branch: refs/heads/branch-2.1
Commit: 5693ac8e5bd5df8aca1b0d6df0be072a45abcfbd
Parents: 25b9758
Author: Xiangrui Meng 
Authored: Tue Dec 13 16:59:09 2016 -0800
Committer: Xiangrui Meng 
Committed: Tue Dec 13 16:59:15 2016 -0800

--
 R/pkg/vignettes/sparkr-vignettes.Rmd | 32 +++
 1 file changed, 32 insertions(+)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/5693ac8e/R/pkg/vignettes/sparkr-vignettes.Rmd
--
diff --git a/R/pkg/vignettes/sparkr-vignettes.Rmd 
b/R/pkg/vignettes/sparkr-vignettes.Rmd
index 625b759..334daa5 100644
--- a/R/pkg/vignettes/sparkr-vignettes.Rmd
+++ b/R/pkg/vignettes/sparkr-vignettes.Rmd
@@ -449,6 +449,10 @@ SparkR supports the following machine learning models and 
algorithms.
 
 * Generalized Linear Model (GLM)
 
+* Random Forest
+
+* Gradient-Boosted Trees (GBT)
+
 * Naive Bayes Model
 
 * $k$-means Clustering
@@ -526,6 +530,34 @@ gaussianFitted <- predict(gaussianGLM, carsDF)
 head(select(gaussianFitted, "model", "prediction", "mpg", "wt", "hp"))
 ```
 
+ Random Forest
+
+`spark.randomForest` fits a [random 
forest](https://en.wikipedia.org/wiki/Random_forest) classification or 
regression model on a `SparkDataFrame`.
+Users can call `summary` to get a summary of the fitted model, `predict` to 
make predictions, and `write.ml`/`read.ml` to save/load fitted models.
+
+In the following example, we use the `longley` dataset to train a random 
forest and make predictions:
+
+```{r, warning=FALSE}
+df <- createDataFrame(longley)
+rfModel <- spark.randomForest(df, Employed ~ ., type = "regression", maxDepth 
= 2, numTrees = 2)
+summary(rfModel)
+predictions <- predict(rfModel, df)
+```
+
+ Gradient-Boosted Trees
+
+`spark.gbt` fits a [gradient-boosted 
tree](https://en.wikipedia.org/wiki/Gradient_boosting) classification or 
regression model on a `SparkDataFrame`.
+Users can call `summary` to get a summary of the fitted model, `predict` to 
make predictions, and `write.ml`/`read.ml` to save/load fitted models.
+
+Similar to the random forest example above, we use the `longley` dataset to 
train a gradient-boosted tree and make predictions:
+
+```{r, warning=FALSE}
+df <- createDataFrame(longley)
+gbtModel <- spark.gbt(df, Employed ~ ., type = "regression", maxDepth = 2, 
maxIter = 2)
+summary(gbtModel)
+predictions <- predict(gbtModel, df)
+```
+
  Naive Bayes Model
 
 Naive Bayes model assumes independence among the features. `spark.naiveBayes` 
fits a [Bernoulli naive Bayes 
model](https://en.wikipedia.org/wiki/Naive_Bayes_classifier#Bernoulli_naive_Bayes)
 against a SparkDataFrame. The data should be all categorical. These models are 
often used for document classification.


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



spark git commit: [SPARK-18793][SPARK-18794][R] add spark.randomForest/spark.gbt to vignettes

2016-12-13 Thread meng
Repository: spark
Updated Branches:
  refs/heads/master c68fb426d -> 594b14f1e


[SPARK-18793][SPARK-18794][R] add spark.randomForest/spark.gbt to vignettes

## What changes were proposed in this pull request?

Mention `spark.randomForest` and `spark.gbt` in vignettes. Keep the content 
minimal since users can type `?spark.randomForest` to see the full doc.

cc: jkbradley

Author: Xiangrui Meng 

Closes #16264 from mengxr/SPARK-18793.


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/594b14f1
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/594b14f1
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/594b14f1

Branch: refs/heads/master
Commit: 594b14f1ebd0b3db9f630e504be92228f11b4d9f
Parents: c68fb42
Author: Xiangrui Meng 
Authored: Tue Dec 13 16:59:09 2016 -0800
Committer: Xiangrui Meng 
Committed: Tue Dec 13 16:59:09 2016 -0800

--
 R/pkg/vignettes/sparkr-vignettes.Rmd | 32 +++
 1 file changed, 32 insertions(+)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/594b14f1/R/pkg/vignettes/sparkr-vignettes.Rmd
--
diff --git a/R/pkg/vignettes/sparkr-vignettes.Rmd 
b/R/pkg/vignettes/sparkr-vignettes.Rmd
index 625b759..334daa5 100644
--- a/R/pkg/vignettes/sparkr-vignettes.Rmd
+++ b/R/pkg/vignettes/sparkr-vignettes.Rmd
@@ -449,6 +449,10 @@ SparkR supports the following machine learning models and 
algorithms.
 
 * Generalized Linear Model (GLM)
 
+* Random Forest
+
+* Gradient-Boosted Trees (GBT)
+
 * Naive Bayes Model
 
 * $k$-means Clustering
@@ -526,6 +530,34 @@ gaussianFitted <- predict(gaussianGLM, carsDF)
 head(select(gaussianFitted, "model", "prediction", "mpg", "wt", "hp"))
 ```
 
+ Random Forest
+
+`spark.randomForest` fits a [random 
forest](https://en.wikipedia.org/wiki/Random_forest) classification or 
regression model on a `SparkDataFrame`.
+Users can call `summary` to get a summary of the fitted model, `predict` to 
make predictions, and `write.ml`/`read.ml` to save/load fitted models.
+
+In the following example, we use the `longley` dataset to train a random 
forest and make predictions:
+
+```{r, warning=FALSE}
+df <- createDataFrame(longley)
+rfModel <- spark.randomForest(df, Employed ~ ., type = "regression", maxDepth 
= 2, numTrees = 2)
+summary(rfModel)
+predictions <- predict(rfModel, df)
+```
+
+ Gradient-Boosted Trees
+
+`spark.gbt` fits a [gradient-boosted 
tree](https://en.wikipedia.org/wiki/Gradient_boosting) classification or 
regression model on a `SparkDataFrame`.
+Users can call `summary` to get a summary of the fitted model, `predict` to 
make predictions, and `write.ml`/`read.ml` to save/load fitted models.
+
+Similar to the random forest example above, we use the `longley` dataset to 
train a gradient-boosted tree and make predictions:
+
+```{r, warning=FALSE}
+df <- createDataFrame(longley)
+gbtModel <- spark.gbt(df, Employed ~ ., type = "regression", maxDepth = 2, 
maxIter = 2)
+summary(gbtModel)
+predictions <- predict(gbtModel, df)
+```
+
  Naive Bayes Model
 
 Naive Bayes model assumes independence among the features. `spark.naiveBayes` 
fits a [Bernoulli naive Bayes 
model](https://en.wikipedia.org/wiki/Naive_Bayes_classifier#Bernoulli_naive_Bayes)
 against a SparkDataFrame. The data should be all categorical. These models are 
often used for document classification.


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org