Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/702#discussion_r12460502
--- Diff: docs/mllib-optimization.md ---
@@ -163,3 +177,100 @@ each iteration, to compute the gradient direction.
Available algorithms for gradient descent:
* [GradientDescent.runMiniBatchSGD](api/mllib/index.html#org.apache.spark.mllib.optimization.GradientDescent)
+
+### Limited-memory BFGS
+L-BFGS is currently only a low-level optimization primitive in `MLlib`. If you
+want to use L-BFGS in various ML algorithms such as linear regression and
+logistic regression, you have to pass the gradient of the objective function
+and the updater into the optimizer yourself, instead of using training APIs
+like
+[LogisticRegression.LogisticRegressionWithSGD](api/mllib/index.html#org.apache.spark.mllib.classification.LogisticRegression).
+See the example below. This will be addressed in the next release.
+
+L1 regularization using
+[Updater.L1Updater](api/mllib/index.html#org.apache.spark.mllib.optimization.Updater)
+will not work, since the soft-thresholding logic in L1Updater is designed for
+gradient descent.
+
+The L-BFGS method
+[LBFGS.runLBFGS](api/scala/index.html#org.apache.spark.mllib.optimization.LBFGS)
+has the following parameters:
+
+* `gradient` is a class that computes the gradient of the objective function
+being optimized, i.e., with respect to a single training example, at the
+current parameter value. MLlib includes gradient classes for common loss
+functions, e.g., hinge, logistic, least-squares. The gradient class takes as
+input a training example, its label, and the current parameter value.
+* `updater` is a class originally designed for gradient descent which computes
+the actual gradient descent step. However, for L-BFGS we can obtain the
+gradient and loss of the regularization part of the objective function by
+ignoring the logic that only applies to gradient descent, such as the adaptive
+step size. We will later refactor this into a regularizer that replaces the
+updater, to separate the regularization logic from the step update.
+* `numCorrections` is the number of corrections used in the L-BFGS update.
+10 is recommended.
+* `maxNumIterations` is the maximum number of iterations that L-BFGS can be
+run for.
+* `regParam` is the regularization parameter when using regularization.
+* `return` is a tuple containing two elements. The first element is a column
+matrix containing weights for every feature, and the second element is an
+array containing the loss computed for every iteration.
+
+Here is an example of training binary logistic regression with L2
+regularization using the L-BFGS optimizer.
+{% highlight scala %}
+import org.apache.spark.SparkContext
+import org.apache.spark.mllib.evaluation.BinaryClassificationMetrics
+import org.apache.spark.mllib.linalg.Vectors
+import org.apache.spark.mllib.util.MLUtils
+import org.apache.spark.mllib.classification.LogisticRegressionModel
+import breeze.linalg.{DenseVector => BDV}
+
+val data = MLUtils.loadLibSVMFile(sc, "mllib/data/sample_libsvm_data.txt")
+val numFeatures = data.take(1)(0).features.size
+
+// Split data into training (60%) and test (40%).
+val splits = data.randomSplit(Array(0.6, 0.4), seed = 11L)
+
+// Prepend 1 to every training example as the intercept term.
+val training = splits(0).map(x =>
+ (x.label, Vectors.fromBreeze(
+ BDV.vertcat(BDV.ones[Double](1), x.features.toBreeze.toDenseVector)))
--- End diff --
I added `MLUtils.appendBias` recently. Could you switch to it?
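
A minimal sketch of what that change could look like, assuming `MLUtils.appendBias` simply appends the bias term 1.0 to a feature vector (note that it puts the intercept at the end of the vector rather than the front, so the weight indexing shifts accordingly):

```scala
import org.apache.spark.mllib.util.MLUtils

// Reuse `splits` from the example above; append the intercept term to every
// training example instead of building the augmented vector by hand with Breeze.
val training = splits(0).map(x => (x.label, MLUtils.appendBias(x.features)))
```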
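
For context, since the diff above is cut off before the optimizer is invoked, here is a rough sketch of how `LBFGS.runLBFGS` might be called with the parameters described in this hunk. The concrete values, the `convergenceTol` argument, and the `initialWeightsWithIntercept` name are illustrative assumptions, not part of this diff:

```scala
import org.apache.spark.mllib.linalg.Vectors
import org.apache.spark.mllib.optimization.{LBFGS, LogisticGradient, SquaredL2Updater}

// Illustrative settings; tune these for a real job.
val numCorrections = 10
val convergenceTol = 1e-4
val maxNumIterations = 20
val regParam = 0.1
// Zero initial weights, with one extra slot for the intercept term.
val initialWeightsWithIntercept = Vectors.dense(new Array[Double](numFeatures + 1))

val (weightsWithIntercept, loss) = LBFGS.runLBFGS(
  training,
  new LogisticGradient(),
  new SquaredL2Updater(),
  numCorrections,
  convergenceTol,
  maxNumIterations,
  regParam,
  initialWeightsWithIntercept)
```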