GitHub user yinxusen opened a pull request:
https://github.com/apache/spark/pull/13794
[SPARK-15574][ML][PySpark] Python transformer wrapper and Pipeline
## What changes were proposed in this pull request?
1. Add a PythonTransformerWrapper in Scala that wraps transformers
implemented in pure Python in PySpark.
2. Change the pure-Python Pipeline implementation into a Java-object-based one.
3. Implement save/load in Pipeline for pure-Python transformers.
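
To illustrate the save/load idea in item 3, here is a minimal standard-library sketch, not the code in this patch. It assumes each pure-Python stage can serialize its uid and params to a JSON string and be rebuilt from it. The names UpperCaseTransformer, SimplePipeline, and STAGE_REGISTRY are hypothetical, rows are plain dicts standing in for a DataFrame, and the Scala wrapper and Py4J callback side of the change is not modeled.

```python
import json
import os
import tempfile
import uuid


class UpperCaseTransformer:
    """Hypothetical pure-Python stage; rows are plain dicts, not a DataFrame."""

    def __init__(self, input_col, output_col, uid=None):
        # A uid identifies the stage and must survive a save/load round trip.
        self.uid = uid or "UpperCaseTransformer_" + uuid.uuid4().hex[:12]
        self.input_col = input_col
        self.output_col = output_col

    def transform(self, rows):
        # Build new rows with an extra upper-cased column; inputs are not mutated.
        return [dict(r, **{self.output_col: r[self.input_col].upper()}) for r in rows]

    def to_json(self):
        # Serialize everything needed to rebuild the stage into one string.
        return json.dumps({
            "class": type(self).__name__,
            "uid": self.uid,
            "params": {"input_col": self.input_col, "output_col": self.output_col},
        })

    @classmethod
    def from_json(cls, payload):
        data = json.loads(payload)
        return cls(uid=data["uid"], **data["params"])


# Hypothetical registry mapping stored class names back to Python classes.
STAGE_REGISTRY = {"UpperCaseTransformer": UpperCaseTransformer}


class SimplePipeline:
    """Hypothetical pipeline that applies its stages in order."""

    def __init__(self, stages):
        self.stages = list(stages)

    def transform(self, rows):
        for stage in self.stages:
            rows = stage.transform(rows)
        return rows

    def save(self, path):
        # Store one JSON string per stage, in pipeline order.
        with open(path, "w") as f:
            json.dump([s.to_json() for s in self.stages], f)

    @classmethod
    def load(cls, path):
        with open(path) as f:
            payloads = json.load(f)
        # Look up each stage class by its stored name, then rebuild the stage.
        return cls([STAGE_REGISTRY[json.loads(p)["class"]].from_json(p) for p in payloads])


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "pipeline.json")
    SimplePipeline([UpperCaseTransformer("text", "upper")]).save(path)
    print(SimplePipeline.load(path).transform([{"text": "spark"}]))
```

The registry lookup is one possible way to map a stored class name back to Python code; the patch itself may resolve classes differently.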
## How was this patch tested?
Tested with Python unit tests and doctests.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/yinxusen/spark SPARK-15574
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/13794.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #13794
----
commit 5eb0245ce579c09144d636b93cbb4b54dbe2560f
Author: yinxusen <[email protected]>
Date: 2016-06-07T22:27:13Z
add fake transformer with pure Python
commit 06a654535b0003fa999d2fa8b7476b4e727e37bd
Author: yinxusen <[email protected]>
Date: 2016-06-08T02:49:07Z
add an action in pure python transformer
commit 70c09453dcbc29a16635cb00b82c2d0871b5d748
Author: yinxusen <[email protected]>
Date: 2016-06-09T23:04:36Z
Merge branch 'master' into SPARK-15574
commit 73cbcc74593d006fd80b15f90ebd053902b3a0a4
Author: yinxusen <[email protected]>
Date: 2016-06-10T22:00:16Z
Merge branch 'master' into SPARK-15574
commit 4418ae6a49d391f4d26615b7ce36b5055d0d3c56
Author: yinxusen <[email protected]>
Date: 2016-06-10T22:10:10Z
add transformer wrapper
commit 72787fe6787b08baa000b61c412de972bbff0a4c
Author: yinxusen <[email protected]>
Date: 2016-06-13T21:42:00Z
update python transformer
commit f5094adc46077c1ec9aa81583a7157a94dd2d1c5
Author: yinxusen <[email protected]>
Date: 2016-06-13T22:23:10Z
add python transformer wrapper
commit 9e2632e548e59fa3c6fc642d5fa3dfb7033df209
Author: yinxusen <[email protected]>
Date: 2016-06-13T22:41:16Z
add active spark context
commit 8cff5bcffa342f51364b7d80fece132d863c18df
Author: yinxusen <[email protected]>
Date: 2016-06-13T23:12:13Z
split uid
commit 6bde27fdd829a35a607c322dc967cc4cf5a52bb5
Author: yinxusen <[email protected]>
Date: 2016-06-13T23:13:03Z
fix more
commit 5fdeac88d35254be1bec405ead9005a0e55d7222
Author: yinxusen <[email protected]>
Date: 2016-06-13T23:55:16Z
start callback server
commit 32cf6e4601e37db3bf96b386ab9496922ff2f679
Author: yinxusen <[email protected]>
Date: 2016-06-13T23:58:28Z
fix order
commit 2079a031a5b28ebc914a8be75728d3d07ad67e18
Author: yinxusen <[email protected]>
Date: 2016-06-14T00:04:14Z
fix order
commit e27da4737535fa04361ab5b123f17b93ca403822
Author: yinxusen <[email protected]>
Date: 2016-06-14T23:32:07Z
Merge branch 'master' into SPARK-15574
commit d499d0e0d34464afc638ab5be1d8ebefed4e8177
Author: yinxusen <[email protected]>
Date: 2016-06-15T00:03:13Z
add pipeline as prototype
commit 371485a9e9f4210845b0dc18e9d412c396f1a0fa
Author: yinxusen <[email protected]>
Date: 2016-06-15T00:35:37Z
add support for pure python transformer
commit 0e026e53b7abaf7c392808e3de963430b9af0864
Author: yinxusen <[email protected]>
Date: 2016-06-15T02:03:05Z
fix error of classmethod
commit 3655ab38a3d2bdc29087b546060a4a8d667d08a1
Author: yinxusen <[email protected]>
Date: 2016-06-15T02:18:00Z
add pure_pipeline as test
commit 7c7f6843ae9dbb90645e487ae83898226d63f8ea
Author: yinxusen <[email protected]>
Date: 2016-06-15T02:36:16Z
add debug info
commit 96f736113bf1ba43c3edd6ca6e14eb107716a29f
Author: yinxusen <[email protected]>
Date: 2016-06-15T21:02:59Z
add transformSchema
commit 954bcb8f7bd4ddfc9cf9f48e05a2bb632e92594a
Author: yinxusen <[email protected]>
Date: 2016-06-15T21:12:49Z
add docstring
commit c41d4dcdbc4c88b8a4eff4f499fa03e338366667
Author: yinxusen <[email protected]>
Date: 2016-06-15T21:38:15Z
convert json to string
commit c289be4324f2039875f30c737370092ea8aac306
Author: yinxusen <[email protected]>
Date: 2016-06-15T22:31:33Z
change API
commit 72facb2390fef8219ad712caaff6ef44b8bc5529
Author: yinxusen <[email protected]>
Date: 2016-06-15T22:53:47Z
add another pure python transformer
commit 57a9bd1e16c5dcfc07f8c4889e8519b9a251f7e9
Author: yinxusen <[email protected]>
Date: 2016-06-15T22:59:40Z
add to pipeline
commit 353650365ec5062e98bcc766262d712228757608
Author: yinxusen <[email protected]>
Date: 2016-06-15T23:10:56Z
fix fromJava
commit fe42e4f731c84e4e3ecceebc8fa040b54134b108
Author: yinxusen <[email protected]>
Date: 2016-06-15T23:48:14Z
fix bugs
commit 9f779005aa9bfda16d1ec9e24c0a07c5c006893f
Author: yinxusen <[email protected]>
Date: 2016-06-16T00:10:30Z
add getTransformer
commit 0047cf24c7d80c2815f3089229a38dc0bce39b58
Author: yinxusen <[email protected]>
Date: 2016-06-16T00:46:18Z
add method into java wrapper
commit 752570d76fc34efa20a125536dd3329f0759a7d0
Author: yinxusen <[email protected]>
Date: 2016-06-16T22:03:51Z
add ser/de for transformer
----