GitHub user holdenk opened a pull request:
https://github.com/apache/spark/pull/6771
[SPARK-7780][MLLIB] Intercept in LogisticRegressionWithLBFGS should not be
regularized (no round trip through DataFrames)
This PR is similar to https://github.com/apache/spark/pull/6386 but avoids
the round trip through dataframes. On the other hand it might be a little less
clean.
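For readers unfamiliar with the issue, here is a rough sketch of what "the intercept should not be regularized" means. It is illustrative only and is not code from this PR or from Spark. The function names (l2_penalty, regularized_log_loss) and the data layout are assumptions made for the example. The point is that the L2 penalty is computed over the feature weights alone, so the value of the intercept never contributes to it.

```python
import math


def l2_penalty(weights, reg_param):
    """L2 penalty over the feature weights only (hypothetical helper).

    The intercept is deliberately not an argument, so it cannot be penalized.
    """
    return 0.5 * reg_param * sum(w * w for w in weights)


def regularized_log_loss(weights, intercept, data, reg_param):
    """Mean binary logistic loss plus an L2 penalty that skips the intercept.

    `data` is a list of (features, label) pairs with label in {0, 1}.
    """
    total = 0.0
    for features, label in data:
        margin = sum(w * x for w, x in zip(weights, features)) + intercept
        # Loss is log(1 + exp(-margin)) for label 1 and log(1 + exp(margin)) for label 0.
        signed = margin if label == 1 else -margin
        total += math.log1p(math.exp(-signed))
    return total / len(data) + l2_penalty(weights, reg_param)
```

With this formulation, changing reg_param while the feature weights are all zero leaves the objective unchanged no matter how large the intercept is, which is the behavior the issue title asks for.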
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/holdenk/spark SPARK-7780-intercept-in-logisticregressionwithLBFGS-should-not-be-regularized-no-rt
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/6771.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #6771
----
commit a529c013fa722748cbd1d3878e4ea3bed5b15181
Author: Holden Karau <[email protected]>
Date: 2015-05-22T20:54:59Z
document plans
commit f9e26350d15d7d36b75ece4f4718797dbe2a0830
Author: Holden Karau <[email protected]>
Date: 2015-05-22T22:53:29Z
Some progress.
commit 7ebbd566e20923efc32dee1cfcf12ea315259e30
Author: Holden Karau <[email protected]>
Date: 2015-05-22T23:16:18Z
Keep track of the number of requested classes so that if it's more than 2 we
use the legacy implementation. Also allow pass-through of initialWeights
commit ef2a9b0f5b6cb2e971c2e5371f3394b4dec64574
Author: Holden Karau <[email protected]>
Date: 2015-05-22T23:48:06Z
Expose a train on instances method within Spark, use numOfLinearPredictors
instead of keeping track of class variable, pass through persistence information
commit 407491e38b1a5834d26a137ab20829a3d96f5142
Author: Holden Karau <[email protected]>
Date: 2015-05-24T01:14:04Z
tests are fun
commit e02bf3a9688d1efa2f3da60b3d9f27911b04955d
Author: Holden Karau <[email protected]>
Date: 2015-05-24T07:42:13Z
Start updating the tests to run with different updaters.
commit 8517539d0e8829833968dcb7e47ad8ba20849cb1
Author: Holden Karau <[email protected]>
Date: 2015-05-24T08:00:36Z
get the tests compiling
commit a619d42b821575afd8efa90f2a38edf9690eb0df
Author: Holden Karau <[email protected]>
Date: 2015-05-24T08:04:53Z
style fixed
commit 4febcc32f524edadeb68dc674e2681a087ffaa38
Author: Holden Karau <[email protected]>
Date: 2015-05-24T08:13:23Z
make the test method private
commit e8e03a13ba04c6b3100e290a5c435959c2f01912
Author: Holden Karau <[email protected]>
Date: 2015-05-24T20:16:13Z
CR feedback, pass RDD of Labeled points to ml implementation. Also from
tests require that feature scaling is turned on to use ml implementation.
commit 38a024bd9a36e83ef8005a5f2af8a4dd44f6760e
Author: Holden Karau <[email protected]>
Date: 2015-05-25T07:24:21Z
Convert it to a df and use set for the initial params
commit 478b8c5d5ff20478dc4ba913b0c77172e0abdfff
Author: Holden Karau <[email protected]>
Date: 2015-05-25T20:06:57Z
Handle non-dense weights
commit 08589f58b81bc1e6099b425f86226053c5b6a360
Author: Holden Karau <[email protected]>
Date: 2015-05-26T03:39:54Z
CR feedback: make the setInitialWeights function private, don't mess with
the weights when they are user supplied, validate that the user supplied
weights are reasonable.
commit f40c401496ae1e6cc7b39db820fea194d42c25c5
Author: Holden Karau <[email protected]>
Date: 2015-05-26T04:19:46Z
style fix up
commit f35a16aa8110a33c32959db674908d145be6e97f
Author: Holden Karau <[email protected]>
Date: 2015-06-02T23:29:11Z
Copy the number of iterations, convergence tolerance, and whether we are fitting
an intercept from mllib to ml when training the LBFGS model using ml code
commit 4d431a358074f5245abcbc95af3e2bdf75b4f21d
Author: Holden Karau <[email protected]>
Date: 2015-06-03T00:39:48Z
scala style check issue
commit 7e4192849efc6d282633159a15c7dd41376aa1a3
Author: Holden Karau <[email protected]>
Date: 2015-06-03T07:30:48Z
Only the weights if we need to.
commit ed351ffdf862994389b41284f95aa148c6550f41
Author: Holden Karau <[email protected]>
Date: 2015-06-03T19:39:56Z
Use appendBias for adding intercept to initial weights, fix
generateInitialWeights
commit 3ac02d72cab72b35b7cc76c50d7088d4b98bfd9d
Author: Holden Karau <[email protected]>
Date: 2015-06-08T20:20:19Z
Merge in master
commit 1793ff99e8a750cde609be4d1290770825a1219b
Author: Holden Karau <[email protected]>
Date: 2015-06-11T23:54:52Z
Try and avoid doing the round trip through dataframes
----