Github user felixcheung commented on a diff in the pull request:
https://github.com/apache/spark/pull/16301#discussion_r92855395
--- Diff: R/pkg/vignettes/sparkr-vignettes.Rmd ---
@@ -496,9 +508,114 @@ count(carsDF_test)
head(carsDF_test)
```
-
### Models and Algorithms
+#### Logistic Regression Model
+
+[Logistic regression](https://en.wikipedia.org/wiki/Logistic_regression) is a widely-used model when the response is categorical. It can be seen as a special case of the [Generalized Linear Predictive Model](https://en.wikipedia.org/wiki/Generalized_linear_model).
+We provide `spark.logit` on top of `spark.glm` to support logistic regression with advanced hyper-parameters.
+It supports both binary and multiclass classification with elastic-net regularization and feature standardization, similar to `glmnet`.
+
+We use a simple example to demonstrate `spark.logit` usage. In general, there are three steps of using `spark.logit`:
+1). Create a dataframe from a proper data source; 2). Fit a logistic regression model using `spark.logit` with a proper parameter setting;
+and 3). Obtain the coefficient matrix of the fitted model using `summary` and use the model for prediction with `predict`.
+
+Binomial logistic regression
+```{r, warning=FALSE}
+df <- createDataFrame(iris)
+# Create a DataFrame containing two classes
+training <- df[df$Species %in% c("versicolor", "virginica"), ]
+model <- spark.logit(training, Species ~ ., regParam = 0.00042)
+summary(model)
+```
+
+Predict values on training data
+```{r}
+fitted <- predict(model, training)
+```
+
+Multinomial logistic regression against three classes
+```{r, warning=FALSE}
+df <- createDataFrame(iris)
+# Note in this case, Spark infers it is multinomial logistic regression, so family = "multinomial" is optional.
+model <- spark.logit(df, Species ~ ., regParam = 0.056)
+summary(model)
+```
+
+#### Multilayer Perceptron
--- End diff --
I think `MLP` is more commonly used in R.
Re: the other question, I think we should stick with the R convention
and not add "model" to every name.