GitHub user dongwang218 opened a pull request:
https://github.com/apache/spark/pull/643
[MLLIB] SPARK-1682: Add gradient descent w/o sampling and RDA L1 updater
The GradientDescent optimizer samples the data before each gradient step. When
the input data has already been shuffled, it is possible to scan the data
sequentially and take a gradient step for each data instance instead. This
could potentially be more efficient.
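To illustrate the idea, here is a minimal single-machine sketch in plain Python, not the Scala code in this pull request. It assumes logistic loss, labels in {0, 1}, a dense feature list per instance, and a step size that decays as 1/sqrt(t); the names `sequential_sgd`, `step_size` and `num_passes` are hypothetical and chosen only for this example.

```python
import math


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def sequential_sgd(data, num_features, step_size=0.5, num_passes=20):
    """Scan pre-shuffled data in order, one gradient step per instance.

    data: list of (label, features) pairs, label in {0, 1}.
    No sampling is done here; the caller is assumed to have shuffled.
    """
    w = [0.0] * num_features
    t = 0
    for _ in range(num_passes):
        for label, x in data:
            t += 1
            margin = sum(wi * xi for wi, xi in zip(w, x))
            err = sigmoid(margin) - label      # gradient of log loss w.r.t. margin
            lr = step_size / math.sqrt(t)      # assumed decay schedule
            for j, xj in enumerate(x):
                w[j] -= lr * err * xj
    return w


def predict(w, x):
    # Classify as 1 when the linear margin is positive.
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) > 0 else 0
```

Every instance contributes exactly one update per pass, so the cost of a pass is one linear scan with no sampling step. The result depends on the order of the data, which is why the description requires that the input be shuffled beforehand.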
Add an enhanced RDA L1 updater, which can produce even sparser solutions than
L1 with comparable quality. Reference:
Lin Xiao, "Dual Averaging Methods for Regularized Stochastic Learning and
Online Optimization", Journal of Machine Learning Research 11 (2010) 2543-2596.
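For readers unfamiliar with the method, the following is a minimal sketch of one enhanced l1-RDA step in plain Python, written from the general form of the update in the reference above rather than from the code in this pull request. It assumes the soft-threshold form with a running average of gradients, a threshold of `lam + gamma * rho / sqrt(t)`, and a scale of `sqrt(t) / gamma`; the function name and the parameter names `lam`, `gamma` and `rho` are illustrative only.

```python
import math


def rda_l1_step(avg_grad, grad, t, lam, gamma, rho=0.0):
    """One enhanced l1-RDA update at step t (1-based).

    avg_grad: running average of the gradients seen before step t.
    grad:     gradient observed at step t.
    Returns (new_weights, new_avg_grad).
    """
    # Update the running average of all gradients seen so far.
    new_avg = [((t - 1) * g_bar + g) / t for g_bar, g in zip(avg_grad, grad)]

    # rho > 0 raises the threshold in early steps (the "enhanced" variant);
    # rho = 0 gives the basic l1-RDA threshold lam.
    threshold = lam + gamma * rho / math.sqrt(t)
    scale = math.sqrt(t) / gamma

    weights = []
    for g_bar in new_avg:
        if abs(g_bar) <= threshold:
            # Coordinates with a small average gradient are set exactly to zero.
            weights.append(0.0)
        else:
            shrunk = g_bar - threshold * math.copysign(1.0, g_bar)
            weights.append(-scale * shrunk)
    return weights, new_avg
```

Because the threshold is applied to the averaged gradient rather than to a single noisy gradient, a coordinate is truncated to exactly zero whenever its average stays small, which is the source of the sparser solutions mentioned above.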
Small fix: add options to the BinaryClassification example to read and write
a model file.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/dongwang218/spark lr_svmlight
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/643.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #643
----
commit 50cdd69e7f8ebfa047a3b76efcc3ffb5e82b4cf7
Author: Dong Wang <[email protected]>
Date: 2014-05-01T00:36:28Z
enable LogisticRegressionWithSGD to support svmlight data and gradient
descent w/o sampling
commit 3131478826e1b943b2fd8fb02839d7b8df9b5377
Author: Dong Wang <[email protected]>
Date: 2014-05-01T18:54:23Z
small fix for scalastyle
commit 5e6f5c43327aeea1978bde10f8621e156a9680f9
Author: Dong Wang <[email protected]>
Date: 2014-05-01T20:38:26Z
Merge remote-tracking branch 'upstream/master' into lr_svmlight
Conflicts:
mllib/src/main/scala/org/apache/spark/mllib/classification/LogisticRegression.scala
commit 96926db5488288bc6713d7be267e9adbe811b2f2
Author: Dong Wang <[email protected]>
Date: 2014-05-03T00:51:47Z
add enhanced l1-RDA
commit 76c4d600b35becf124f02e3f0ed3ff9d9ae67a18
Author: Dong Wang <[email protected]>
Date: 2014-05-05T03:29:38Z
add more options to BinaryClassification example
commit 87f96a269a5bebcaf45339007e9da57be51fa418
Author: Dong Wang <[email protected]>
Date: 2014-05-05T03:49:10Z
Merge remote-tracking branch 'upstream/master' into lr_svmlight
commit 391d4bce492ef908fd6a21467e895368a85c2f10
Author: Dong Wang <[email protected]>
Date: 2014-05-05T04:03:34Z
small fix: break long line
----