GitHub user sethah opened a pull request:
https://github.com/apache/spark/pull/15488
[SPARK-17941][ML][TEST] Logistic regression tests should use sample weights.
## What changes were proposed in this pull request?
Sample-weight testing for logistic regression is not robust. The logistic
regression suite already has many test cases that compare results against R's
glmnet. Since both libraries support sample weights, those tests should use
weighted data to increase coverage of sample weighting. This patch adds very
little code while making the testing more complete.
Also fixed some errors in the R code that is referenced in the test
suite. Changed `standardization=T` to `standardize=T`, since the former is
not a valid argument name.
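As a rough illustration of the property that weighted tests exercise, the sketch below checks that a row with integer weight k contributes the same as k unweighted copies of that row. It is not the Spark implementation or the test code from this patch: the function names, the toy data, and the choice to normalize the loss by total weight are all assumptions made for the example.

```python
import math


def weighted_log_loss(rows, coef, intercept):
    """Weighted mean negative log-likelihood for binary logistic regression.

    rows is a list of (label, weight, features) tuples with label in {0, 1}.
    Hypothetical helper for illustration only.
    """
    total_weight = sum(weight for _, weight, _ in rows)
    loss = 0.0
    for label, weight, features in rows:
        margin = intercept + sum(c * x for c, x in zip(coef, features))
        # Flip the sign for label 0 so both classes use log(1 + exp(-signed)).
        signed = margin if label == 1 else -margin
        loss += weight * math.log1p(math.exp(-signed))
    return loss / total_weight


def expand_by_weight(rows):
    """Replace each row of integer weight k with k copies of weight 1."""
    return [(label, 1.0, features)
            for label, weight, features in rows
            for _ in range(int(weight))]


# Toy data (made up): (label, weight, features)
weighted = [(1, 3.0, [0.5, -1.2]), (0, 1.0, [1.5, 0.3]), (1, 2.0, [-0.7, 0.8])]
duplicated = expand_by_weight(weighted)

# Arbitrary coefficients; the equality should hold for any choice.
coef, intercept = [0.4, -0.25], 0.1

loss_weighted = weighted_log_loss(weighted, coef, intercept)
loss_duplicated = weighted_log_loss(duplicated, coef, intercept)
```

Because the two losses agree for any coefficients, a model fit on the weighted data should match one fit on the duplicated data, which is the kind of behavior a comparison against another weighted implementation helps confirm.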
## How was this patch tested?
Existing unit tests are modified. No non-test code is touched.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/sethah/spark logreg_weight_tests
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/15488.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #15488
----
commit 2b7d99741663728d5289ec830bddb8f1a94c9a8a
Author: sethah <[email protected]>
Date: 2016-10-13T22:51:57Z
binary is updated
commit e523f414396ce0d773f4281ee391cc0d8da82593
Author: sethah <[email protected]>
Date: 2016-10-13T23:32:15Z
all tests updated and passing
commit e28ad4391d64e1901434418b37c0dc837287cdfa
Author: sethah <[email protected]>
Date: 2016-10-14T15:21:10Z
strong l1 tests
commit b4a158fa5fcc47057537ee67e9cbfb5be3a819e3
Author: sethah <[email protected]>
Date: 2016-10-14T15:48:36Z
comment formatting
commit 54bb1b5cfd3d0b9914f7667e7b6daca020277158
Author: sethah <[email protected]>
Date: 2016-10-14T15:57:46Z
style error
----