Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/702#discussion_r12460483
--- Diff: docs/mllib-optimization.md ---
@@ -163,3 +177,100 @@ each iteration, to compute the gradient direction.
Available algorithms for gradient descent:
* [GradientDescent.runMiniBatchSGD](api/mllib/index.html#org.apache.spark.mllib.optimization.GradientDescent)
+
+### Limited-memory BFGS
+L-BFGS is currently only a low-level optimization primitive in `MLlib`. To
+use L-BFGS in ML algorithms such as linear regression and logistic
+regression, you have to pass the gradient of the objective function and the
+updater into the optimizer yourself, instead of using training APIs like
+[LogisticRegression.LogisticRegressionWithSGD](api/mllib/index.html#org.apache.spark.mllib.classification.LogisticRegression).
+See the example below. This will be addressed in the next release.
+
+L1 regularization via
+[Updater.L1Updater](api/mllib/index.html#org.apache.spark.mllib.optimization.Updater)
+will not work, since the soft-thresholding logic in L1Updater is designed
+for gradient descent.
+
+The L-BFGS method
+[LBFGS.runLBFGS](api/scala/index.html#org.apache.spark.mllib.optimization.LBFGS)
+has the following parameters:
+
+* `gradient` is a class that computes the gradient of the objective
+function being optimized, i.e., with respect to a single training example,
+at the current parameter value. MLlib includes gradient classes for common
+loss functions, e.g., hinge, logistic, least-squares. The gradient class
+takes as input a training example, its label, and the current parameter
+value.
+* `updater` is a class originally designed for gradient descent, which
+computes the actual gradient descent step. However, for L-BFGS we can still
+obtain the gradient and loss of the regularization part of the objective
+function from it by ignoring the logic that applies only to gradient
+descent, such as the adaptive step size. We will later refactor this into a
+regularizer, replacing the updater, to separate the regularization logic
+from the step update.
+* `numCorrections` is the number of corrections used in the L-BFGS update.
+10 is recommended.
+* `maxNumIterations` is the maximal number of iterations that L-BFGS can
+run.
+* `regParam` is the regularization parameter when using regularization.
+* `return` A tuple containing two elements. The first element is a column
+matrix
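The parameters above can be wired together as in the following sketch of a call to `LBFGS.runLBFGS` for L2-regularized logistic regression. This is a minimal illustration, not a definitive API reference: the exact argument order, the `convergenceTol` parameter, the sample data path, and the existence of a `SparkContext` named `sc` are all assumptions here.

```scala
import org.apache.spark.mllib.linalg.Vectors
import org.apache.spark.mllib.optimization.{LBFGS, LogisticGradient, SquaredL2Updater}
import org.apache.spark.mllib.util.MLUtils

// Assumes `sc` is an existing SparkContext and the file is in LIBSVM format.
val data = MLUtils.loadLibSVMFile(sc, "data/sample_libsvm_data.txt")
val numFeatures = data.take(1)(0).features.size

// runLBFGS expects (label, features) pairs.
val training = data.map(p => (p.label, p.features)).cache()

// Hypothetical call; `convergenceTol` is assumed here, not listed above.
val (weights, lossHistory) = LBFGS.runLBFGS(
  training,
  new LogisticGradient(),   // gradient of the logistic loss
  new SquaredL2Updater(),   // L2 regularization; L1Updater will not work
  10,                       // numCorrections
  1e-4,                     // convergenceTol (assumed parameter)
  20,                       // maxNumIterations
  0.1,                      // regParam
  Vectors.dense(new Array[Double](numFeatures)))  // initial weights = 0
```

Note that the updater is passed only so that L-BFGS can evaluate the regularization gradient and loss; its step-size logic is ignored, as described above.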
--- End diff --
Move `return` out of the list. It may look like a parameter if we put them
together.