Github user holdenk commented on a diff in the pull request:
https://github.com/apache/spark/pull/20908#discussion_r178374957
--- Diff: python/pyspark/sql/functions.py ---
@@ -2208,7 +2208,8 @@ def pandas_udf(f=None, returnType=None, functionType=None):
     1. SCALAR
        A scalar UDF defines a transformation: One or more `pandas.Series` -> A `pandas.Series`.
-       The returnType should be a primitive data type, e.g., :class:`DoubleType`.
+       The returnType should be a primitive data type, e.g., :class:`DoubleType` or
+       arrays of a primitive data type (e.g. :class:`ArrayType`).
--- End diff --
I checked, and nested arrays do not currently work. I'll add an explicit test
to check that this fails, and when the test starts passing we can update the
documentation.
---