GitHub user imatiach-msft opened a pull request:
https://github.com/apache/spark/pull/16557
[SPARK-18693][ML][MLLIB][WIP] ML Evaluators should use weight column
## What changes were proposed in this pull request?
The evaluators BinaryClassificationEvaluator, RegressionEvaluator, and
MulticlassClassificationEvaluator, along with the corresponding metrics classes
BinaryClassificationMetrics, RegressionMetrics, and MulticlassMetrics, should
take sample weights into account.
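To illustrate what "using sample weights" means for a classification metric, here is a minimal sketch in plain Python. It is not the code in this pull request and does not use Spark; the function name `weighted_accuracy` and the example data are hypothetical. Each row contributes its weight, rather than a count of 1, to both the numerator and the denominator.

```python
def weighted_accuracy(labels, predictions, weights):
    """Share of the total weight carried by correctly predicted rows.

    Hypothetical helper for illustration only; not a Spark API.
    """
    if not (len(labels) == len(predictions) == len(weights)):
        raise ValueError("labels, predictions and weights must have equal length")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("total weight must be positive")
    # Sum the weights of the rows whose prediction matches the label.
    correct_weight = sum(
        w for y, p, w in zip(labels, predictions, weights) if y == p
    )
    return correct_weight / total_weight


labels = [0.0, 1.0, 1.0, 2.0]
predictions = [0.0, 1.0, 0.0, 2.0]

# With unit weights the result reduces to ordinary accuracy.
unweighted = weighted_accuracy(labels, predictions, [1.0, 1.0, 1.0, 1.0])

# Up-weighting the misclassified third row lowers the score.
weighted = weighted_accuracy(labels, predictions, [1.0, 1.0, 4.0, 2.0])
```

With all weights equal to 1 the function matches the unweighted metric, which is the behavior one would expect when no weight column is supplied.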
The updates to the regression metrics are based on the earlier work for
https://issues.apache.org/jira/browse/SPARK-11520
("RegressionMetrics should support instance weights"),
revised in response to comments. That earlier pull request was closed
without its changes ever being checked in.
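As a rough sketch of instance-weighted regression metrics, the plain-Python example below computes weighted MSE, RMSE, and MAE. It assumes the common definition in which each squared or absolute error is multiplied by its weight and the sum is divided by the total weight; it is not taken from this patch or from SPARK-11520, and the name `weighted_regression_metrics` is hypothetical.

```python
import math


def weighted_regression_metrics(labels, predictions, weights):
    """Weighted MSE, RMSE and MAE; hypothetical helper, not a Spark API."""
    if not (len(labels) == len(predictions) == len(weights)):
        raise ValueError("labels, predictions and weights must have equal length")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("total weight must be positive")
    rows = list(zip(labels, predictions, weights))
    # Each error term is scaled by the weight of its row.
    squared_error = sum(w * (y - p) ** 2 for y, p, w in rows)
    absolute_error = sum(w * abs(y - p) for y, p, w in rows)
    mse = squared_error / total_weight
    return {
        "mse": mse,
        "rmse": math.sqrt(mse),
        "mae": absolute_error / total_weight,
    }


# The third row is off by 2 and carries twice the weight of the others.
metrics = weighted_regression_metrics(
    labels=[1.0, 2.0, 3.0],
    predictions=[1.0, 2.0, 5.0],
    weights=[1.0, 1.0, 2.0],
)
```

Setting every weight to 1 recovers the usual unweighted metrics, so existing callers without a weight column would see no change under this definition.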
## How was this patch tested?
This is still a work in progress; I will be adding more tests soon. I took
the regression tests from
https://github.com/apache/spark/pull/9907
which was closed as a stale PR, and updated them with some changes.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/imatiach-msft/spark ilmat/evaluate-with-weights
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/16557.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #16557
----
commit 2b624c00e8d57fd32c08e7b5af52606dd5d8d6b5
Author: Ilya Matiach
Date: 2016-12-30T15:39:36Z
[SPARK-18693][ML] Evaluators should use weight column
----