GitHub user BryanCutler opened a pull request:

    https://github.com/apache/spark/pull/11906

    [SPARK-14087][PySpark][ML] [WIP] PySpark JavaModel Param ownership error

    When a PySpark model is created by fitting an estimator, its UID is set
    to the parent estimator's UID. Before that assignment happens, the params
    defined on the model class are copied to the model instance in
    Params._copy_params() and assigned a parent UID that differs from the one
    the model ends up with. PySpark then concludes that the params are not
    owned by the model, which can lead to a ValueError raised from
    Params._shouldOwn.
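
    The ordering problem described above can be sketched without Spark. The
    classes below are simplified stand-ins written for illustration, not the
    PySpark source: `Param`, `Params`, `BuggyModel`, `FixedModel`, and
    `_resetUid` are assumptions that only mimic the copy-then-reassign
    sequence, and `FixedModel` shows one hypothetical way to pass in the
    parent's UID before the params are copied.

```python
import uuid


class Param:
    """Simplified stand-in for a param: a name plus the UID of its owner."""

    def __init__(self, parent_uid, name):
        self.parent = parent_uid
        self.name = name


class Params:
    """Simplified stand-in for a Params holder (not the real PySpark class)."""

    # Class-level param with a placeholder parent.
    maxIter = Param("undefined", "maxIter")

    def __init__(self):
        self.uid = type(self).__name__ + "_" + uuid.uuid4().hex[:12]
        self._copy_params()

    def _copy_params(self):
        # Copy each class-level Param onto the instance, owned by the current uid.
        cls = type(self)
        for name in dir(cls):
            attr = getattr(cls, name)
            if isinstance(attr, Param):
                setattr(self, name, Param(self.uid, attr.name))

    def _shouldOwn(self, param):
        # Raise if the param's recorded parent is not this instance.
        if param.parent != self.uid:
            raise ValueError(
                "Param %s does not belong to %s." % (param.name, self.uid))


class BuggyModel(Params):
    """Params are copied in __init__; the uid is overwritten afterwards."""

    def _resetUid(self, new_uid):
        self.uid = new_uid  # the copied params still carry the old uid
        return self


class FixedModel(Params):
    """Hypothetical fix: take the parent's uid before the params are copied."""

    def __init__(self, uid):
        self.uid = uid
        self._copy_params()


estimator_uid = "LogisticRegression_abc123"  # made-up estimator UID

buggy = BuggyModel()._resetUid(estimator_uid)
try:
    buggy._shouldOwn(buggy.maxIter)
    buggy_error = None
except ValueError as exc:
    buggy_error = str(exc)

fixed = FixedModel(estimator_uid)
fixed._shouldOwn(fixed.maxIter)  # passes: param parent matches the model uid
```

    In this sketch the ownership check fails for `BuggyModel` because the
    param's parent was fixed at copy time, while `FixedModel` passes because
    the UID is already final when the params are copied.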
    
    *** Still thinking of a possible test for this

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/BryanCutler/spark pyspark-init-uid-SPARK-14087

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/11906.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #11906
    
----
commit 276dc77c7c35769c2b65fabc3fb83142d2765fd4
Author: Bryan Cutler <[email protected]>
Date:   2016-03-23T00:16:19Z

    [SPARK-14087] Allow JavaModel to set an initial UID for when it has a
    parent estimator

commit 518f2e54bb18455f5f63255bceb7bce607504732
Author: Bryan Cutler <[email protected]>
Date:   2016-03-23T00:20:21Z

    fix lint error

----


