GitHub user jkbradley opened a pull request:
https://github.com/apache/spark/pull/8115
[SPARK-9847] [ML] Modified copyValues to distinguish between default,
explicit param values
From JIRA: Currently, Params.copyValues copies default parameter values to
the paramMap of the target instance, rather than the defaultParamMap. It should
copy to the defaultParamMap because explicitly setting a parameter can change
the semantics.
This issue arose in SPARK-9789, where the two params "threshold" and "thresholds"
for LogisticRegression can hold mutually exclusive values. If thresholds is
set, then fit() also copies the default value of threshold into the paramMap
as if it had been set explicitly, which easily leaves the two params with
inconsistent settings.
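
To illustrate the difference, here is a minimal sketch in Python. It is a toy
model only, not Spark's Scala implementation: ToyParams, copy_values_old and
copy_values_fixed are hypothetical names, and the values 0.5 and [0.3, 0.7]
are made up for the example. It assumes only what is described above: a holder
with an explicit paramMap and a separate defaultParamMap.

```python
class ToyParams:
    """Toy stand-in for a Params holder (hypothetical, not Spark's class)."""

    def __init__(self, defaults=None):
        # Defaults declared by the class itself.
        self.default_param_map = dict(defaults or {})
        # Values the user set explicitly.
        self.param_map = {}

    def set(self, name, value):
        self.param_map[name] = value
        return self

    def is_set(self, name):
        # True only for explicitly set params, never for defaults.
        return name in self.param_map

    def get_or_default(self, name):
        if name in self.param_map:
            return self.param_map[name]
        return self.default_param_map[name]


def copy_values_old(src, dst):
    """Behavior described in the JIRA: defaults land in dst's explicit map."""
    merged = {**src.default_param_map, **src.param_map}
    for name, value in merged.items():
        dst.set(name, value)
    return dst


def copy_values_fixed(src, dst):
    """Intended behavior: defaults and explicit values stay in separate maps."""
    dst.default_param_map.update(src.default_param_map)
    dst.param_map.update(src.param_map)
    return dst


# The user sets only "thresholds"; "threshold" keeps its default.
estimator = ToyParams(defaults={"threshold": 0.5})
estimator.set("thresholds", [0.3, 0.7])

old_model = copy_values_old(estimator, ToyParams())
new_model = copy_values_fixed(estimator, ToyParams())

print(old_model.is_set("threshold"))          # True: default now looks explicit
print(new_model.is_set("threshold"))          # False: still only a default
print(new_model.get_or_default("threshold"))  # 0.5
```

In the old variant the target reports both params as explicitly set, which is
the inconsistent state described above. In the fixed variant only thresholds
is explicit, while threshold remains available as a default.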
CC: @mengxr
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/jkbradley/spark copyvalues-fix
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/8115.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #8115
----
commit bece2a30d1fa5511707196898876f40381f2fdb1
Author: Joseph K. Bradley <[email protected]>
Date: 2015-08-11T22:33:59Z
modified copyValues to distinguish between default, explicit param values
commit 1f1ef6c825bc7228a3050a1ea9322c9b4a554ef9
Author: Joseph K. Bradley <[email protected]>
Date: 2015-08-11T22:42:40Z
added unit test for copyValues
----