GitHub user icexelloss opened a pull request:
https://github.com/apache/spark/pull/21650
[SPARK-24624] Support mixture of Python UDF and Scalar Pandas UDF
## What changes were proposed in this pull request?
This PR adds support for mixing Python UDFs and Scalar Pandas UDFs in
the following two cases:
(1)
```
from pyspark.sql.functions import udf, pandas_udf

f1 = udf(lambda x: x + 1, 'int')
f2 = pandas_udf(lambda x: x + 2, 'int')
df = ...
df = df.withColumn('foo', f1(df['v']))
df = df.withColumn('bar', f2(df['v']))
```
(2)
```
from pyspark.sql.functions import udf, pandas_udf

f1 = udf(lambda x: x + 1, 'int')
f2 = pandas_udf(lambda x: x + 2, 'int')
df = ...
df = df.withColumn('foo', f2(f1(df['v'])))
```
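To make the expected results of the two cases concrete, here is a minimal plain-Python sketch. It does not use Spark or pandas and says nothing about how this PR evaluates the UDFs internally. The helper names (`f1_row`, `f2_batch`, `eval_row_udf`, `eval_batch_udf`) and the `batch_size` parameter are hypothetical stand-ins, with a list playing the role of a column.

```python
# Plain-Python stand-ins only; no Spark or pandas involved.
# All names below are hypothetical and chosen for illustration.

def f1_row(x):
    """Row-at-a-time function, analogous to the Python UDF f1."""
    return x + 1

def f2_batch(values):
    """Batch function, analogous to the scalar Pandas UDF f2.

    Takes a chunk of a column and returns a chunk of the same length.
    """
    return [x + 2 for x in values]

def eval_row_udf(func, column):
    """Apply a row-at-a-time function to each value in the column."""
    return [func(x) for x in column]

def eval_batch_udf(func, column, batch_size=2):
    """Apply a batch function chunk by chunk and concatenate the results."""
    out = []
    for start in range(0, len(column), batch_size):
        out.extend(func(column[start:start + batch_size]))
    return out

v = [0, 1, 2, 3, 4]

# Case (1): both functions read the same input column independently.
foo = eval_row_udf(f1_row, v)
bar = eval_batch_udf(f2_batch, v)

# Case (2): the batch function consumes the row-at-a-time function's output.
nested = eval_batch_udf(f2_batch, eval_row_udf(f1_row, v))
```

In this sketch, case (2) gives each value of `v` plus 3, which is the result one would expect from `f2(f1(df['v']))` on a non-null integer column.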
## How was this patch tested?
New tests are added to BatchEvalPythonExecSuite and ScalarPandasUDFTests.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/icexelloss/spark SPARK-24624-mix-udf
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/21650.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #21650
----
commit 48ae822bcdf6df40b181f86379d275d602c580c9
Author: Li Jin <ice.xelloss@...>
Date: 2018-06-22T18:35:34Z
wip
commit 68e665ec981c1a7cae46398bc2ea8a4880e95331
Author: Li Jin <ice.xelloss@...>
Date: 2018-06-27T22:31:25Z
Test passes
commit 6b47b69305257e9ee9f5135968913a4f92731ef5
Author: Li Jin <ice.xelloss@...>
Date: 2018-06-27T22:34:28Z
Remove white spaces
----