GitHub user yinxusen opened a pull request:

    https://github.com/apache/spark/pull/13794

    [SPARK-15574][ML][PySpark] Python transformer wrapper and Pipeline

    ## What changes were proposed in this pull request?
    
    1. Add a PythonTransformerWrapper in Scala for PySpark transformers implemented in pure Python.
    
    2. Change the pure Python Pipeline implementation into a Java-object-based one.
    
    3. Implement save/load in Pipeline for pure Python transformers.
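
For illustration only, here is a minimal sketch of what saving and loading a pure Python transformer can involve: writing the stage's class name and parameters to a JSON string, then rebuilding the stage from it. It does not use PySpark and is not the code in this pull request. `UpperCaser`, `STAGE_REGISTRY`, `register`, `dump_stage`, and `load_stage` are hypothetical names, and a list of dicts stands in for a DataFrame.

```python
import json

# Hypothetical registry mapping class names to stage classes, so that a
# saved stage can be rebuilt from its name. Not part of PySpark.
STAGE_REGISTRY = {}


def register(cls):
    """Class decorator that records a stage class under its own name."""
    STAGE_REGISTRY[cls.__name__] = cls
    return cls


@register
class UpperCaser:
    """Hypothetical pure Python stage: upper-cases one field of each row."""

    def __init__(self, input_col="text", output_col="upper"):
        self.input_col = input_col
        self.output_col = output_col

    def transform(self, rows):
        # `rows` is a list of dicts standing in for a DataFrame.
        return [dict(r, **{self.output_col: r[self.input_col].upper()}) for r in rows]

    def params(self):
        return {"input_col": self.input_col, "output_col": self.output_col}


def dump_stage(stage):
    """Serialize a stage to a JSON string: its class name plus its params."""
    return json.dumps({"class": type(stage).__name__, "params": stage.params()})


def load_stage(payload):
    """Rebuild a stage from the JSON string produced by dump_stage."""
    meta = json.loads(payload)
    return STAGE_REGISTRY[meta["class"]](**meta["params"])
```

A round trip would look like `load_stage(dump_stage(UpperCaser(output_col="shout")))`, which yields a new `UpperCaser` with the same parameters. The registry is a simplification chosen for this sketch; a real implementation would also need a way to locate the Python class on load.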
    
    ## How was this patch tested?
    
    Tested with Python unit tests and doctests.
    


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/yinxusen/spark SPARK-15574

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/13794.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #13794
    
----
commit 5eb0245ce579c09144d636b93cbb4b54dbe2560f
Author: yinxusen <[email protected]>
Date:   2016-06-07T22:27:13Z

    add fake transformer with pure Python

commit 06a654535b0003fa999d2fa8b7476b4e727e37bd
Author: yinxusen <[email protected]>
Date:   2016-06-08T02:49:07Z

    add an action in pure python transformer

commit 70c09453dcbc29a16635cb00b82c2d0871b5d748
Author: yinxusen <[email protected]>
Date:   2016-06-09T23:04:36Z

    Merge branch 'master' into SPARK-15574

commit 73cbcc74593d006fd80b15f90ebd053902b3a0a4
Author: yinxusen <[email protected]>
Date:   2016-06-10T22:00:16Z

    Merge branch 'master' into SPARK-15574

commit 4418ae6a49d391f4d26615b7ce36b5055d0d3c56
Author: yinxusen <[email protected]>
Date:   2016-06-10T22:10:10Z

    add transformer wrapper

commit 72787fe6787b08baa000b61c412de972bbff0a4c
Author: yinxusen <[email protected]>
Date:   2016-06-13T21:42:00Z

    update python transformer

commit f5094adc46077c1ec9aa81583a7157a94dd2d1c5
Author: yinxusen <[email protected]>
Date:   2016-06-13T22:23:10Z

    add python transformer wrapper

commit 9e2632e548e59fa3c6fc642d5fa3dfb7033df209
Author: yinxusen <[email protected]>
Date:   2016-06-13T22:41:16Z

    add active spark context

commit 8cff5bcffa342f51364b7d80fece132d863c18df
Author: yinxusen <[email protected]>
Date:   2016-06-13T23:12:13Z

    split uid

commit 6bde27fdd829a35a607c322dc967cc4cf5a52bb5
Author: yinxusen <[email protected]>
Date:   2016-06-13T23:13:03Z

    fix more

commit 5fdeac88d35254be1bec405ead9005a0e55d7222
Author: yinxusen <[email protected]>
Date:   2016-06-13T23:55:16Z

    start callback server

commit 32cf6e4601e37db3bf96b386ab9496922ff2f679
Author: yinxusen <[email protected]>
Date:   2016-06-13T23:58:28Z

    fix order

commit 2079a031a5b28ebc914a8be75728d3d07ad67e18
Author: yinxusen <[email protected]>
Date:   2016-06-14T00:04:14Z

    fix order

commit e27da4737535fa04361ab5b123f17b93ca403822
Author: yinxusen <[email protected]>
Date:   2016-06-14T23:32:07Z

    Merge branch 'master' into SPARK-15574

commit d499d0e0d34464afc638ab5be1d8ebefed4e8177
Author: yinxusen <[email protected]>
Date:   2016-06-15T00:03:13Z

    add pipeline as prototype

commit 371485a9e9f4210845b0dc18e9d412c396f1a0fa
Author: yinxusen <[email protected]>
Date:   2016-06-15T00:35:37Z

    add support for pure python transformer

commit 0e026e53b7abaf7c392808e3de963430b9af0864
Author: yinxusen <[email protected]>
Date:   2016-06-15T02:03:05Z

    fix error of classmethod

commit 3655ab38a3d2bdc29087b546060a4a8d667d08a1
Author: yinxusen <[email protected]>
Date:   2016-06-15T02:18:00Z

    add pure_pipeline as test

commit 7c7f6843ae9dbb90645e487ae83898226d63f8ea
Author: yinxusen <[email protected]>
Date:   2016-06-15T02:36:16Z

    add debug info

commit 96f736113bf1ba43c3edd6ca6e14eb107716a29f
Author: yinxusen <[email protected]>
Date:   2016-06-15T21:02:59Z

    add transformSchema

commit 954bcb8f7bd4ddfc9cf9f48e05a2bb632e92594a
Author: yinxusen <[email protected]>
Date:   2016-06-15T21:12:49Z

    add docstring

commit c41d4dcdbc4c88b8a4eff4f499fa03e338366667
Author: yinxusen <[email protected]>
Date:   2016-06-15T21:38:15Z

    convert json to string

commit c289be4324f2039875f30c737370092ea8aac306
Author: yinxusen <[email protected]>
Date:   2016-06-15T22:31:33Z

    change API

commit 72facb2390fef8219ad712caaff6ef44b8bc5529
Author: yinxusen <[email protected]>
Date:   2016-06-15T22:53:47Z

    add another pure python transformer

commit 57a9bd1e16c5dcfc07f8c4889e8519b9a251f7e9
Author: yinxusen <[email protected]>
Date:   2016-06-15T22:59:40Z

    add to pipeline

commit 353650365ec5062e98bcc766262d712228757608
Author: yinxusen <[email protected]>
Date:   2016-06-15T23:10:56Z

    fix fromJava

commit fe42e4f731c84e4e3ecceebc8fa040b54134b108
Author: yinxusen <[email protected]>
Date:   2016-06-15T23:48:14Z

    fix bugs

commit 9f779005aa9bfda16d1ec9e24c0a07c5c006893f
Author: yinxusen <[email protected]>
Date:   2016-06-16T00:10:30Z

    add getTransformer

commit 0047cf24c7d80c2815f3089229a38dc0bce39b58
Author: yinxusen <[email protected]>
Date:   2016-06-16T00:46:18Z

    add method into java wrapper

commit 752570d76fc34efa20a125536dd3329f0759a7d0
Author: yinxusen <[email protected]>
Date:   2016-06-16T22:03:51Z

    add ser/de for transformer

----

