GitHub user BryanCutler opened a pull request:
https://github.com/apache/spark/pull/11906
[SPARK-14087][PySpark][ML] [WIP] PySpark JavaModel Param ownership error
When a PySpark model is created by fitting data, its UID is set to the
parent estimator's UID. Before this assignment happens,
Params._copy_params() copies any params defined on the model's class to the
model object and gives them the UID the model was constructed with, which
differs from the estimator's. As a result, PySpark thinks the params are not
owned by the model, which can lead to a ValueError raised from
Params._shouldOwn.
*** Still thinking of a possible test for this
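The ordering problem described above can be reproduced without Spark. The following is only a minimal sketch, not the PySpark implementation: SketchParam and SketchParams are hypothetical stand-ins for pyspark.ml.param.Param and Params, and the uid constructor argument is an assumption about what "set an initial UID" might look like.

```python
import uuid


class SketchParam:
    """Hypothetical stand-in for a Param: a name plus the UID of its owner."""

    def __init__(self, parent_uid, name):
        self.parent = parent_uid
        self.name = name


class SketchParams:
    """Hypothetical stand-in for Params; not the real PySpark class."""

    # Class-level param, copied onto each instance by _copy_params().
    maxIter = SketchParam("undefined", "maxIter")

    def __init__(self, uid=None):
        # Assumption: a random UID is generated unless one is given up front.
        self.uid = uid or "%s_%s" % (type(self).__name__, uuid.uuid4().hex[:12])
        self._copy_params()

    def _copy_params(self):
        # Copy every class-level param to this object, owned by the current uid.
        cls = type(self)
        for name in dir(cls):
            attr = getattr(cls, name)
            if isinstance(attr, SketchParam):
                setattr(self, name, SketchParam(self.uid, attr.name))

    def _shouldOwn(self, param):
        # Ownership check: the param's parent must match this object's uid.
        if param.parent != self.uid:
            raise ValueError(
                "Param %s does not belong to %s." % (param.name, self.uid))


def model_with_late_uid(estimator_uid):
    """Params are copied first, then the uid is overwritten (the bug)."""
    model = SketchParams()
    model.uid = estimator_uid
    return model


def model_with_initial_uid(estimator_uid):
    """The uid is known before the params are copied (the idea of the fix)."""
    return SketchParams(uid=estimator_uid)
```

In this sketch, calling _shouldOwn(model.maxIter) on the result of model_with_late_uid raises ValueError, because the copied param still carries the UID generated in the constructor. The same call on the result of model_with_initial_uid passes, since the params are copied after the estimator's UID is already in place.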
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/BryanCutler/spark pyspark-init-uid-SPARK-14087
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/11906.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #11906
----
commit 276dc77c7c35769c2b65fabc3fb83142d2765fd4
Author: Bryan Cutler <[email protected]>
Date: 2016-03-23T00:16:19Z
[SPARK-14087] Allow JavaModel to set an initial UID for when it has a
parent estimator
commit 518f2e54bb18455f5f63255bceb7bce607504732
Author: Bryan Cutler <[email protected]>
Date: 2016-03-23T00:20:21Z
fix lint error
----