[GitHub] [spark] zero323 commented on a change in pull request #27593: [SPARK-30818][SPARKR][ML] Add SparkR LinearRegression wrapper

GitBox Tue, 07 Apr 2020 07:53:25 -0700

zero323 commented on a change in pull request #27593: [SPARK-30818][SPARKR][ML] Add SparkR LinearRegression wrapper
URL: https://github.com/apache/spark/pull/27593#discussion_r404872880


 ##########
 File path: R/pkg/R/mllib_regression.R
 ##########
 @@ -540,3 +546,149 @@ setMethod("write.ml", signature(object = "AFTSurvivalRegressionModel", path = "c
           function(object, path, overwrite = FALSE) {
             write_internal(object, path, overwrite)
           })
+
+#' Linear Regression Model
+#'
+#' \code{spark.lm} fits a linear regression model against a SparkDataFrame.
+#' Users can call \code{summary} to print a summary of the fitted model,
+#' \code{predict} to make predictions on new data,
+#' and \code{write.ml}/\code{read.ml} to save/load fitted models.
+#'
+#' @param data a \code{SparkDataFrame} of observations and labels for model fitting.
+#' @param formula a symbolic description of the model to be fitted. Currently only a few formula
+#'                operators are supported, including '~', '.', ':', '+', and '-'.
+#' @param maxIter maximum iteration number.
+#' @param regParam the regularization parameter.
+#' @param elasticNetParam the ElasticNet mixing parameter, in range [0, 1].
+#'        For alpha = 0, the penalty is an L2 penalty. For alpha = 1, it is an L1 penalty.
+#' @param tol convergence tolerance of iterations.
+#' @param standardization whether to standardize the training features before fitting the model.
+#' @param weightCol weight column name.
+#' @param aggregationDepth suggested depth for treeAggregate (>= 2).
+#' @param loss the loss function to be optimized. Supported options: "squaredError" and "huber".
+#' @param epsilon the shape parameter to control the amount of robustness.
+#' @param solver The solver algorithm for optimization.
+#'        Supported options: "l-bfgs", "normal" and "auto".
+#' @param stringIndexerOrderType how to order categories of a string feature column. This is used to
+#'                               decide the base level of a string feature as the last category
+#'                               after ordering is dropped when encoding strings. Supported options
+#'                               are "frequencyDesc", "frequencyAsc", "alphabetDesc", and
+#'                               "alphabetAsc". The default value is "frequencyDesc". When the
+#'                               ordering is set to "alphabetDesc", this drops the same category
+#'                               as R when encoding strings.
+#' @param ... additional arguments passed to the method.
+#' @return \code{spark.lm} returns a fitted Linear Regression Model.
+#'
 
 Review comment:
   In general, blank lines are acceptable and even recommended after a long
   block (a special case is the header of the comment, where it is meaningful).
   But I am fine with removing it.


