[GitHub] spark pull request: [SPARK-1157][MLlib] L-BFGS Optimizer based on ...

dbtsai Mon, 14 Apr 2014 17:48:08 -0700

GitHub user dbtsai reopened a pull request:

    https://github.com/apache/spark/pull/353


    [SPARK-1157][MLlib] L-BFGS Optimizer based on Breeze's implementation.

    This PR uses Breeze's L-BFGS implementation. The Breeze dependency was
    already introduced by Xiangrui's sparse input format work in SPARK-1212.
    Nice work, @mengxr!
    
    When used with a regularized updater, we need to compute regVal and
    regGradient (the gradient of the regularization term in the cost
    function). With the current updater design, both values can be computed
    as follows.
    
    Let's review how the updater computes newWeights from its input
    parameters:
    
        w' = w - thisIterStepSize * (gradient + regGradient(w))
    
    Note that regGradient is a function of w. If we set gradient = 0 and
    thisIterStepSize = 1, then
    
        regGradient(w) = w - w'
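    
    As a worked instance of the substitution above, assume a squared-L2
    penalty R(w) = (lambda / 2) * ||w||^2 with lambda = regParam. This choice
    of regularizer is an assumption for illustration; the description itself
    does not fix one. With gradient = 0 and thisIterStepSize = 1:

```latex
\begin{aligned}
w' &= w - 1 \cdot \bigl(0 + \lambda w\bigr) = (1 - \lambda)\, w, \\
w - w' &= \lambda w = \nabla R(w) = \mathrm{regGradient}(w).
\end{aligned}
```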
    
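    As a result, regVal can be computed by
    
        // Zero gradient with stepSize = 0 leaves the weights unchanged,
        // so the returned regVal (._2) is evaluated at `weights`.
        val regVal = updater.compute(
          weights,
          new DoubleMatrix(initialWeights.length, 1), 0, 1, regParam)._2
    
    and regGradient can be obtained by
    
        // Zero gradient with stepSize = 1 and iter = 1 returns w' (._1),
        // so w - w' is regGradient(w).
        val regGradient = weights.sub(
          updater.compute(
            weights,
            new DoubleMatrix(initialWeights.length, 1), 1, 1, regParam)._1)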
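    
    The zero-gradient trick can be tried in isolation with the minimal
    sketch below. It is not the code in this PR: SimpleUpdater, L2Updater and
    RegTrick are hypothetical names, plain arrays stand in for DoubleMatrix,
    and the L2 update rule and the stepSize / sqrt(iter) schedule are
    assumptions chosen to match the formula above.

```scala
// Hypothetical stand-in for an MLlib-style updater; not the real API.
trait SimpleUpdater {
  // Returns (newWeights, regVal), with regVal evaluated at newWeights.
  def compute(
      weightsOld: Array[Double],
      gradient: Array[Double],
      stepSize: Double,
      iter: Int,
      regParam: Double): (Array[Double], Double)
}

// Assumed L2 rule: w' = w - thisIterStepSize * (gradient + regParam * w).
object L2Updater extends SimpleUpdater {
  def compute(
      weightsOld: Array[Double],
      gradient: Array[Double],
      stepSize: Double,
      iter: Int,
      regParam: Double): (Array[Double], Double) = {
    val thisIterStepSize = stepSize / math.sqrt(iter)
    val newWeights = weightsOld.zip(gradient).map { case (w, g) =>
      w - thisIterStepSize * (g + regParam * w)
    }
    val regVal = 0.5 * regParam * newWeights.map(x => x * x).sum
    (newWeights, regVal)
  }
}

object RegTrick {
  // Recovers (regVal, regGradient) at `weights` using only compute().
  def regValAndGradient(
      updater: SimpleUpdater,
      weights: Array[Double],
      regParam: Double): (Double, Array[Double]) = {
    val zeroGradient = Array.fill(weights.length)(0.0)
    // stepSize = 0: weights are unchanged, so regVal is taken at `weights`.
    val regVal = updater.compute(weights, zeroGradient, 0.0, 1, regParam)._2
    // stepSize = 1, iter = 1: thisIterStepSize = 1, so w - w' = regGradient(w).
    val stepped = updater.compute(weights, zeroGradient, 1.0, 1, regParam)._1
    val regGradient = weights.zip(stepped).map { case (w, wNew) => w - wNew }
    (regVal, regGradient)
  }
}
```

    Calling RegTrick.regValAndGradient(L2Updater, Array(1.0, -2.0), 0.5)
    yields a regVal of 1.25 and a regGradient of (0.5, -1.0), that is,
    regParam * w, as expected for this assumed L2 rule.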
    
    The PR includes tests that compare the results against SGD, with and
    without regularization.
    
    We compared L-BFGS with SGD and often saw L-BFGS take 10x fewer steps,
    while the cost per step is the same (just computing the gradient).
    
    The following paper by Prof. Ng at Stanford compares different
    optimizers, including L-BFGS and SGD. They are used in the context of
    deep learning, but the paper is still worth reading as a reference.
    http://cs.stanford.edu/~jngiam/papers/LeNgiamCoatesLahiriProchnowNg2011.pdf

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/dbtsai/spark dbtsai-LBFGS

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/353.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #353
    
----
commit 984b18e21396eae84656e15da3539ff3b5f3bf4a
Author: DB Tsai <[email protected]>
Date:   2014-04-05T00:06:50Z

    L-BFGS Optimizer based on Breeze's implementation. Also fixed indentation 
issue in GradientDescent optimizer.

----


