GitHub user dongwang218 opened a pull request:

    https://github.com/apache/spark/pull/643

    [MLLIB] SPARK-1682: Add gradient descent w/o sampling and RDA L1 updater

    The GradientDescent optimizer samples the data before each gradient step. 
When the input data has already been shuffled, it is possible to scan the data 
sequentially and take a gradient step for each data instance instead. This is 
potentially more efficient.
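
To illustrate the idea of per-instance updates without sampling, here is a minimal plain-Python sketch. It is not the Scala code in this pull request and does not use Spark; the names `sgd_pass` and `train`, the logistic loss, and the fixed step size are assumptions made for the example. It presumes the data was shuffled beforehand.

```python
import math


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def sgd_pass(weights, data, step_size):
    """One sequential scan: take a gradient step after every instance.

    `data` is an iterable of (label, features) pairs with label in {0, 1}.
    No sampling is done; the caller is assumed to have shuffled the data.
    """
    w = list(weights)
    for label, features in data:
        margin = sum(wi * xi for wi, xi in zip(w, features))
        error = sigmoid(margin) - label  # derivative of log loss w.r.t. margin
        for i, xi in enumerate(features):
            w[i] -= step_size * error * xi
    return w


def train(data, num_features, step_size=0.5, num_passes=50):
    # Hypothetical driver: repeat the sequential scan a fixed number of times.
    w = [0.0] * num_features
    for _ in range(num_passes):
        w = sgd_pass(w, data, step_size)
    return w
```

Each instance is visited exactly once per pass, so the cost of a pass is a single scan of the data rather than a sampling step followed by a gradient aggregation.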
    
    Add an enhanced RDA L1 updater, which can produce even sparser solutions 
than L1, with comparable quality. Reference: 
    Lin Xiao, "Dual Averaging Methods for Regularized Stochastic Learning and 
Online Optimization", Journal of Machine Learning Research 11 (2010) 2543-2596.
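
For readers unfamiliar with RDA, the sketch below shows the closed-form l1-RDA weight update as I understand it from the referenced paper: keep a running average of the gradients, zero out every coordinate whose average gradient is within the threshold, and shrink the rest. It is plain Python, not the updater added in this pull request; the function name `rda_l1_step` and the parameter names `lam`, `gamma`, and `rho` are assumptions, and the enhanced threshold `lam + gamma * rho / sqrt(t)` should be checked against the paper before relying on it.

```python
import math


def rda_l1_step(avg_grad, grad, t, lam, gamma, rho=0.0):
    """Return (new_avg_grad, new_weights) after the t-th gradient (t >= 1).

    lam   : L1 regularization strength
    gamma : scaling of the sqrt(t) proximal term
    rho   : extra sparsity term of the "enhanced" variant (0 gives basic l1-RDA)
    """
    # Running average of all gradients seen so far.
    new_avg = [((t - 1) * a + g) / t for a, g in zip(avg_grad, grad)]
    threshold = lam + gamma * rho / math.sqrt(t)
    scale = math.sqrt(t) / gamma
    weights = []
    for a in new_avg:
        if abs(a) <= threshold:
            # Average gradient too small: the coordinate is set exactly to zero.
            weights.append(0.0)
        else:
            # Soft-threshold the average gradient, then scale.
            weights.append(-scale * (a - math.copysign(threshold, a)))
    return new_avg, weights
```

Because the threshold is applied to the averaged gradient rather than to a single noisy gradient, coordinates are truncated to exactly zero more readily, which is where the additional sparsity comes from.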
    
    Small fix: add options to the BinaryClassification example to read and 
write a model file.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/dongwang218/spark lr_svmlight

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/643.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #643
    
----
commit 50cdd69e7f8ebfa047a3b76efcc3ffb5e82b4cf7
Author: Dong Wang <[email protected]>
Date:   2014-05-01T00:36:28Z

    enable LogisticRegressionWithSGD to support svmlight data and gradient 
descent w/o sampling

commit 3131478826e1b943b2fd8fb02839d7b8df9b5377
Author: Dong Wang <[email protected]>
Date:   2014-05-01T18:54:23Z

    small fix for scalastyle

commit 5e6f5c43327aeea1978bde10f8621e156a9680f9
Author: Dong Wang <[email protected]>
Date:   2014-05-01T20:38:26Z

    Merge remote-tracking branch 'upstream/master' into lr_svmlight
    
    Conflicts:
        mllib/src/main/scala/org/apache/spark/mllib/classification/LogisticRegression.scala

commit 96926db5488288bc6713d7be267e9adbe811b2f2
Author: Dong Wang <[email protected]>
Date:   2014-05-03T00:51:47Z

    add enhanced l1-RDA

commit 76c4d600b35becf124f02e3f0ed3ff9d9ae67a18
Author: Dong Wang <[email protected]>
Date:   2014-05-05T03:29:38Z

    add more options to BinaryClassification example

commit 87f96a269a5bebcaf45339007e9da57be51fa418
Author: Dong Wang <[email protected]>
Date:   2014-05-05T03:49:10Z

    Merge remote-tracking branch 'upstream/master' into lr_svmlight

commit 391d4bce492ef908fd6a21467e895368a85c2f10
Author: Dong Wang <[email protected]>
Date:   2014-05-05T04:03:34Z

    small fix: break long line

----


