GitHub user ueshin opened a pull request:
https://github.com/apache/spark/pull/19517
[SPARK-20396][SQL][PySpark][FOLLOW-UP] groupby().apply() with pandas udf
## What changes were proposed in this pull request?
This is a follow-up of #18732.
This PR modifies the `GroupedData.apply()` method so that it implicitly
converts a pandas UDF into a grouped UDF.
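As a rough illustration of what "implicit conversion" could mean here, the following is a toy model in plain Python. It is not Spark's implementation and does not use PySpark or pandas. The names `UdfType`, `Udf`, the simplified `pandas_udf` decorator, and this `GroupedData` class are all hypothetical stand-ins, and groups are modelled as lists of dict rows rather than `pandas.DataFrame` objects. The only point it shows is that the caller declares an ordinary pandas UDF and `apply()` re-tags it as a grouped UDF internally.

```python
from enum import Enum


class UdfType(Enum):
    """Hypothetical stand-in for the PythonUdfType named in the commit log."""
    PANDAS = "pandas"
    GROUPED = "grouped"


class Udf:
    """Wraps a user function together with its UDF type tag."""

    def __init__(self, func, udf_type):
        self.func = func
        self.udf_type = udf_type


def pandas_udf(func):
    """Toy decorator (assumption): tags the function as a plain pandas UDF."""
    return Udf(func, UdfType.PANDAS)


class GroupedData:
    """Toy grouped dataset: a list of dict rows plus a grouping key."""

    def __init__(self, rows, key):
        self.rows = rows
        self.key = key

    def apply(self, udf):
        # Only a plain pandas UDF is accepted; anything else is rejected.
        if not isinstance(udf, Udf) or udf.udf_type is not UdfType.PANDAS:
            raise ValueError("apply() expects a pandas udf")
        # Implicit conversion: the caller never builds a grouped UDF directly.
        grouped = Udf(udf.func, UdfType.GROUPED)
        # Partition rows by key, preserving first-seen group order.
        groups = {}
        for row in self.rows:
            groups.setdefault(row[self.key], []).append(row)
        # Call the function once per group and concatenate the outputs.
        result = []
        for group_rows in groups.values():
            result.extend(grouped.func(group_rows))
        return result


@pandas_udf
def subtract_mean(rows):
    """Takes one argument (the whole group) and returns the transformed group."""
    mean = sum(r["v"] for r in rows) / len(rows)
    return [{"id": r["id"], "v": r["v"] - mean} for r in rows]


rows = [{"id": 1, "v": 1.0}, {"id": 1, "v": 3.0}, {"id": 2, "v": 5.0}]
result = GroupedData(rows, "id").apply(subtract_mean)
```

In this sketch the decorated function keeps its `PANDAS` tag, and the grouped variant exists only inside `apply()`, which mirrors the direction of the last commit in the log (dropping a separate `@pandas_grouped_udf` decorator in favour of implicit conversion).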
## How was this patch tested?
Existing tests.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/ueshin/apache-spark issues/SPARK-20396/fup2
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/19517.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #19517
----
commit 4d2bd959e1eeabb4f72cfbb52a374ce721030507
Author: Takuya UESHIN <[email protected]>
Date: 2017-10-16T06:45:55Z
Introduce `@pandas_grouped_udf` decorator for grouped vectorized UDF.
commit f0968702038e11c9c9a8f305c61f72d3f9e00f9a
Author: Takuya UESHIN <[email protected]>
Date: 2017-10-16T08:03:30Z
Use PythonUdfType instead of vectorized and grouped.
commit 639af2cee77456271d5f2f536d4712ab8e01a89d
Author: Takuya UESHIN <[email protected]>
Date: 2017-10-16T13:42:58Z
Update an error message.
commit 10512a64a9560eee6d3f65802abd042dedf0cafb
Author: Takuya UESHIN <[email protected]>
Date: 2017-10-16T13:43:51Z
Add a test to use data type string.
commit 789e642763ab4f59e14137fcc75b514223bc7aae
Author: Takuya UESHIN <[email protected]>
Date: 2017-10-16T14:13:43Z
Restrict the number of arguments for grouped udf to only 1.
commit 122a7bccaff11def2c12cfccdd00244394ed3478
Author: Takuya UESHIN <[email protected]>
Date: 2017-10-16T16:24:03Z
Restrict checking the number of arguments.
commit fdafb3561d44ca2583380b7aeaf7843ce5285b1e
Author: Takuya UESHIN <[email protected]>
Date: 2017-10-16T16:54:23Z
Revert "Restrict checking the number of arguments."
This reverts commit 122a7bccaff11def2c12cfccdd00244394ed3478.
commit 94d05f4f8d5c663319ec12668dbd1206ffa2e83a
Author: Takuya UESHIN <[email protected]>
Date: 2017-10-16T18:10:50Z
Address comments.
commit 733296951b45d760aa0a8465eb0189077ea67372
Author: Takuya UESHIN <[email protected]>
Date: 2017-10-16T18:33:08Z
Add tests for unsupported type.
commit 85f250d0eda56606a599c5fb15046ef0fd63a3c4
Author: Takuya UESHIN <[email protected]>
Date: 2017-10-17T04:59:34Z
Address a comment.
commit 7b386c4be48c0a2e8de6f04cf341de13e8e98444
Author: Takuya UESHIN <[email protected]>
Date: 2017-10-17T14:12:37Z
Remove `@pandas_grouped_udf` and convert implicitly.
----