GitHub user icexelloss opened a pull request:
https://github.com/apache/spark/pull/20142
[SPARK-22930][PYTHON][SQL] Improve the description of Vectorized UDFs for non-deterministic cases
## What changes were proposed in this pull request?
Add tests for using non-deterministic UDFs in aggregations.
Update the pandas_udf docstring with respect to determinism.
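
For readers wondering why the docstring needs to address determinism, the following is a minimal plain-Python sketch, not code from this patch and not Spark itself. It assumes a query engine may invoke the same expression more than once; the names `add_noise`, `double`, and `evaluate_twice` are hypothetical and exist only for illustration.

```python
import random


def double(values):
    """Deterministic: the same input always yields the same output."""
    return [2 * v for v in values]


def add_noise(values, rng):
    """Non-deterministic: every call draws fresh random numbers."""
    return [v + rng.random() for v in values]


def evaluate_twice(func, values):
    """Hypothetical stand-in for an engine that invokes one expression twice."""
    return func(values), func(values)


rng = random.Random(0)  # seeded only so the sketch is repeatable
data = [1.0, 2.0, 3.0]

det_first, det_second = evaluate_twice(double, data)
rand_first, rand_second = evaluate_twice(lambda v: add_noise(v, rng), data)

print(det_first == det_second)    # deterministic function: both evaluations agree
print(rand_first == rand_second)  # non-deterministic function: evaluations differ
```

In PySpark, the usual way to signal this to the optimizer is to call `asNondeterministic()` on the UDF object, so that the function is not assumed to be safe to re-evaluate or deduplicate.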
## How was this patch tested?
test_nondeterministic_udf_in_aggregate
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/icexelloss/spark SPARK-22930-pandas-udf-deterministic
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/20142.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #20142
----
commit 1f4183fe12c5d1d2f43cefb7909f4f5cd423ee72
Author: Li Jin <ice.xelloss@...>
Date: 2018-01-03T21:57:05Z
Add test for using non deterministic udf in aggregate; Fix docstring of
pandas_udf w.r.t determinism
----