GitHub user sethah opened a pull request:
https://github.com/apache/spark/pull/18305
[SPARK-20988][ML] Logistic regression uses aggregator hierarchy
## What changes were proposed in this pull request?
This change pulls the `LogisticAggregator` class out of
`LogisticRegression.scala` and makes it extend `DifferentiableLossAggregator`. It
also changes logistic regression to use the generic `RDDLossFunction` instead
of maintaining its own loss function implementation.
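For readers unfamiliar with the aggregator pattern, here is a rough, self-contained sketch of the idea. It is not the code in this patch: `LossAggregator`, `BinaryLogisticAggregator`, and `Instance` are hypothetical, simplified stand-ins that use plain arrays and no Spark dependencies. It also computes the log loss naively, without any numerical-stability safeguards.

```scala
object AggregatorSketch {
  // Hypothetical data point: label in {0, 1}, instance weight, dense features.
  final case class Instance(label: Double, weight: Double, features: Array[Double])

  // Generic aggregator: accumulates a weighted loss and gradient, and can be
  // merged with another aggregator of the same concrete type.
  trait LossAggregator[Datum, Agg <: LossAggregator[Datum, Agg]] { self: Agg =>
    def dim: Int
    var weightSum: Double = 0.0
    var lossSum: Double = 0.0
    lazy val gradientSum: Array[Double] = Array.ofDim[Double](dim)

    // Concrete losses only need to say how one data point is accumulated.
    def add(datum: Datum): Agg

    def merge(other: Agg): Agg = {
      weightSum += other.weightSum
      lossSum += other.lossSum
      var i = 0
      while (i < dim) { gradientSum(i) += other.gradientSum(i); i += 1 }
      this
    }

    def loss: Double = lossSum / weightSum
    def gradient: Array[Double] = gradientSum.map(_ / weightSum)
  }

  // Binary logistic loss without intercept, for illustration only.
  final class BinaryLogisticAggregator(coefficients: Array[Double])
      extends LossAggregator[Instance, BinaryLogisticAggregator] {
    val dim: Int = coefficients.length

    def add(datum: Instance): BinaryLogisticAggregator = {
      var dot = 0.0
      var i = 0
      while (i < dim) { dot += coefficients(i) * datum.features(i); i += 1 }
      val margin = -dot
      // Gradient multiplier: weight * (sigmoid(w . x) - label).
      val multiplier = datum.weight * (1.0 / (1.0 + math.exp(margin)) - datum.label)
      i = 0
      while (i < dim) { gradientSum(i) += multiplier * datum.features(i); i += 1 }
      // Log loss: log(1 + e^margin) for label 1, minus margin for label 0.
      val logLoss = math.log1p(math.exp(margin))
      lossSum += datum.weight * (if (datum.label > 0) logLoss else logLoss - margin)
      weightSum += datum.weight
      this
    }
  }

  val data: Seq[Instance] = Seq(
    Instance(1.0, 1.0, Array(1.0, 2.0)),
    Instance(0.0, 1.0, Array(-1.0, 0.5)),
    Instance(1.0, 2.0, Array(0.5, -1.0)))
  val coef: Array[Double] = Array(0.0, 0.0)

  // Aggregate two "partitions" separately, then merge the partial results.
  val (left, right) = data.splitAt(1)
  val aggLeft = left.foldLeft(new BinaryLogisticAggregator(coef))((agg, x) => agg.add(x))
  val aggRight = right.foldLeft(new BinaryLogisticAggregator(coef))((agg, x) => agg.add(x))
  val merged: BinaryLogisticAggregator = aggLeft.merge(aggRight)

  def main(args: Array[String]): Unit =
    println(s"loss=${merged.loss} gradient=${merged.gradient.mkString(",")}")
}
```

The fold-then-merge at the end stands in for a distributed aggregation over partitions. A generic loss function only needs `add`, `merge`, `loss`, and `gradient`, which is why one implementation can serve several algorithms.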
Other minor changes:
* `L2Regularization` accepts an `Option[Int => Double]` for the feature standard
deviations
* `L2Regularization` uses the `Vector` type instead of `Array`
* Adds some tests for `LeastSquaresAggregator`
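To illustrate the optional standard deviation argument, here is a minimal sketch of an L2 penalty that takes an `Option[Int => Double]`. It assumes the standard deviations are used to rescale each coefficient before penalizing it and that features with zero standard deviation are skipped. `L2Sketch.l2` and its parameter names are hypothetical, and the sketch uses `Array[Double]` rather than Spark's `Vector` so that it runs without Spark on the classpath.

```scala
object L2Sketch {
  // Returns (penalty, gradient) for 0.5 * regParam * sum_j (w_j / std_j)^2,
  // or 0.5 * regParam * sum_j w_j^2 when no standard deviations are supplied.
  def l2(
      coefficients: Array[Double],
      regParam: Double,
      shouldApply: Int => Boolean,
      featuresStd: Option[Int => Double]): (Double, Array[Double]) = {
    var sum = 0.0
    val gradient = new Array[Double](coefficients.length)
    coefficients.indices.filter(shouldApply).foreach { j =>
      val coef = coefficients(j)
      featuresStd match {
        case Some(getStd) =>
          val std = getStd(j)
          // Assumption: constant features (std == 0) are left unpenalized.
          if (std != 0.0) {
            val scaled = coef / (std * std)
            sum += coef * scaled
            gradient(j) = regParam * scaled
          }
        case None =>
          sum += coef * coef
          gradient(j) = regParam * coef
      }
    }
    (0.5 * regParam * sum, gradient)
  }

  def main(args: Array[String]): Unit = {
    val w = Array(1.0, 2.0, 3.0)
    val std = Array(2.0, 0.0, 1.0)
    // Index 2 plays the role of an intercept and is excluded from the penalty.
    val (plain, _) = l2(w, 0.1, _ < 2, None)
    val (scaled, _) = l2(w, 0.1, _ < 2, Some(j => std(j)))
    println(s"plain=$plain scaled=$scaled")
  }
}
```

Passing a function rather than an array means the caller does not need to materialize the standard deviations in any particular layout, and passing `None` makes the unscaled case explicit.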
## How was this patch tested?
Unit test suites are added.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/sethah/spark SPARK-20988
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/18305.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #18305
----
commit 64c1d8ba26a4cf852efb09485526364323b26eb3
Author: sethah <[email protected]>
Date: 2017-06-05T20:40:41Z
tests pass
commit a5b18c22a68418c29a13aab7a1fd00eec5176658
Author: sethah <[email protected]>
Date: 2017-06-14T05:57:19Z
passing tests, added tests to leastsquares agg
commit 6edd12893a255c1f44416a16e42c6ef79edc8f36
Author: sethah <[email protected]>
Date: 2017-06-14T18:26:49Z
style checker
----