GitHub user thunterdb opened a pull request:
https://github.com/apache/spark/pull/15002
[ML][SPARK-17439] Fixing compression issues with approximate quantiles and
adding more tests
## What changes were proposed in this pull request?
This PR builds on #14976 and fixes a correctness bug that would cause the
wrong quantile to be returned for small target errors.
## How was this patch tested?
This PR adds 8 unit tests that were failing without the fix.
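As a rough illustration of what "the wrong quantile for small target errors" means, the sketch below checks whether a returned value's exact rank falls inside the error band around the target rank. It is plain Python, not the Scala test code from this PR. The function name `rank_within_error` is hypothetical, and the bound used, floor((p - err) * N) <= rank(x) <= ceil((p + err) * N), is assumed from the guarantee documented for `DataFrame.approxQuantile`.

```python
import bisect
import math


def rank_within_error(sorted_data, value, probability, relative_error):
    """Check a candidate quantile against the assumed rank guarantee.

    Hypothetical helper: returns True when `value` occurs in `sorted_data`
    and some 1-based rank it occupies lies in
    [floor((p - err) * N), ceil((p + err) * N)].
    """
    n = len(sorted_data)
    # 1-based ranks occupied by `value` (a range when there are ties).
    lowest_rank = bisect.bisect_left(sorted_data, value) + 1
    highest_rank = bisect.bisect_right(sorted_data, value)
    if highest_rank < lowest_rank:
        return False  # value is not an element of the data
    lower_bound = math.floor((probability - relative_error) * n)
    upper_bound = math.ceil((probability + relative_error) * n)
    # The occupied rank range must overlap the allowed band.
    return lowest_rank <= upper_bound and highest_rank >= lower_bound


data = list(range(1, 1001))  # already sorted, ranks equal values
print(rank_within_error(data, 500, 0.5, 0.001))  # median candidate, tight error
print(rank_within_error(data, 600, 0.5, 0.001))  # far from the target rank
```

With a small `relative_error` the allowed band is only a few ranks wide, which is the regime where the bug described above would surface.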
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/thunterdb/spark ml-1783
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/15002.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #15002
----
commit 75cb0887deb9e5d27b7d6e5fa1129df4a953c641
Author: Sean Owen <[email protected]>
Date: 2016-09-06T13:53:49Z
Actually call compress() in QuantileSummaries, and avoid expensive
ArrayBuffer.prepend
commit 86afd440f04984f6413da70b5a53322e0167d22c
Author: Timothy Hunter <[email protected]>
Date: 2016-09-07T21:04:59Z
work
commit 406943c67f445e7eb672b88475c9dc219a564d85
Author: Timothy Hunter <[email protected]>
Date: 2016-09-07T21:10:25Z
merging with @srowen's PR
----