GitHub user mengxr opened a pull request:

    https://github.com/apache/spark/pull/6019

    [WIP][SPARK-7407][MLLIB] use uid + name to identify parameters

    A param instance is strongly attached to a parent in the current 
implementation. So if we make a copy of an estimator or a transformer in 
pipelines and other meta-algorithms, copying the params to the copied 
instances is error-prone. In this PR, a param is identified by its parent's UID 
and the param name, so it is only loosely attached to its parent and all of its 
derivatives. The UID is preserved during copying or fitting. All components now 
have a default constructor and a constructor that takes a UID as input. I kept 
the constructors for Param in this PR to reduce the size of the diff and turned 
`parent` into a mutable field. @jkbradley
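
    To illustrate the idea, here is a minimal, self-contained sketch in plain 
Python. It is not the code in this PR and not Spark's API: `Param`, `Component`, 
`param`, `set`, `get` and `copy` are hypothetical names chosen for the example, 
and the UID format is an assumption. It only shows why identifying a param by 
(parent UID, name) lets a copy that keeps the UID reuse the same params.

```python
import uuid
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Param:
    """Hypothetical param: identity is (parent_uid, name), not an object reference."""
    parent_uid: str
    name: str
    doc: str = field(default="", compare=False)  # documentation is not part of identity


class Component:
    """Hypothetical stand-in for an estimator or transformer."""

    def __init__(self, uid=None):
        # Default construction generates a fresh UID; passing a UID reuses it.
        # The UID format here is an assumption made for this sketch.
        self.uid = uid if uid is not None else f"{type(self).__name__}_{uuid.uuid4().hex[:12]}"
        self._values = {}

    def param(self, name, doc=""):
        return Param(self.uid, name, doc)

    def set(self, param, value):
        # A param belongs to any component that shares its parent's UID.
        if param.parent_uid != self.uid:
            raise ValueError(f"{param.name} does not belong to {self.uid}")
        self._values[param] = value
        return self

    def get(self, param):
        return self._values[param]

    def copy(self):
        # The copy keeps the UID, so params created from the original stay valid.
        other = type(self)(self.uid)
        other._values = dict(self._values)
        return other


original = Component()
max_iter = original.param("maxIter")
original.set(max_iter, 10)

clone = original.copy()
clone.set(max_iter, 20)  # accepted: the clone shares the original's UID
```

    In this sketch the clone accepts `max_iter` without any re-binding of the 
param, and setting a value on the clone leaves the original's value untouched 
because each instance keeps its own value map.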
    
    This PR still needs some clean-ups, and there are several spark.ml PRs 
pending. I'll try to get them merged first and then update this PR.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/mengxr/spark SPARK-7407

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/6019.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #6019
    
----
commit eaeed35ba0ee22132afd88e51d8808bb8defb122
Author: Xiangrui Meng <[email protected]>
Date:   2015-05-07T04:08:23Z

    update Identifiable

commit 8726d39d3a6ff3a4285b80661b2cf4c48b830508
Author: Xiangrui Meng <[email protected]>
Date:   2015-05-07T04:39:40Z

    use parent uid in Param

commit 108937eb5501801387137b15ec8d7003d4d717b5
Author: Xiangrui Meng <[email protected]>
Date:   2015-05-07T20:16:01Z

    pass compile

commit fbc39f04dd44897e320cc283b0a0cfa9376f2494
Author: Xiangrui Meng <[email protected]>
Date:   2015-05-07T22:34:02Z

    pass test:compile

commit e1160cfceb249db8071181620871a25f7a910a91
Author: Xiangrui Meng <[email protected]>
Date:   2015-05-07T22:45:11Z

    fix tests

commit 818e1db0375f3230eec2fcc231cede7f2bb8f13d
Author: Xiangrui Meng <[email protected]>
Date:   2015-05-08T21:08:11Z

    merge master

commit c255f17ee3e9e26f751973b6113dc91cfc94defd
Author: Xiangrui Meng <[email protected]>
Date:   2015-05-08T21:34:37Z

    fix tests in ParamsSuite

commit fdbc415bb9e2306df37c215d587ac57f8418b791
Author: Xiangrui Meng <[email protected]>
Date:   2015-05-08T21:59:56Z

    all tests passed

commit a4794dd842f82b001daef73dd82016766b6215b9
Author: Xiangrui Meng <[email protected]>
Date:   2015-05-08T22:13:03Z

    change Param to use  to reduce the size of diff

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [email protected] or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
