GitHub user ueshin commented on a diff in the pull request:
https://github.com/apache/spark/pull/19872#discussion_r162886239
--- Diff: python/pyspark/sql/functions.py ---
@@ -2221,6 +2223,35 @@ def pandas_udf(f=None, returnType=None, functionType=None):
.. seealso:: :meth:`pyspark.sql.GroupedData.apply`
+ 3. GROUP_AGG
+
+ A group aggregate UDF defines a transformation: One or more `pandas.Series` -> A scalar
+ The `returnType` should be a primitive data type, e.g, :class:`DoubleType`.
--- End diff --
very small nit: `e.g.` instead of `e.g`.
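
For readers skimming the hunk, the shape of the transformation the docstring describes (one or more `pandas.Series` -> a scalar, per group) can be sketched without Spark. This is only an illustration under stated assumptions: plain Python lists stand in for `pandas.Series`, and `group_agg`, `rows`, `id`, and `v` are hypothetical names, not part of the PySpark API or of this PR.

```python
from collections import defaultdict
from statistics import mean


def group_agg(rows, key, columns, func):
    """Group `rows` (dicts) by `key`, then reduce each group to one scalar.

    Hypothetical helper: each list passed to `func` stands in for one
    `pandas.Series` holding that column's values within a single group.
    """
    groups = defaultdict(lambda: [[] for _ in columns])
    for row in rows:
        for bucket, col in zip(groups[row[key]], columns):
            bucket.append(row[col])
    # One scalar per group: func(series_1, ..., series_n) -> scalar
    return {k: func(*cols) for k, cols in groups.items()}


# Made-up sample data, two groups keyed by "id".
rows = [
    {"id": 1, "v": 1.0},
    {"id": 1, "v": 2.0},
    {"id": 2, "v": 3.0},
    {"id": 2, "v": 5.0},
    {"id": 2, "v": 10.0},
]

result = group_agg(rows, "id", ["v"], mean)
print(result)  # {1: 1.5, 2: 6.0}
```

Because each group collapses to a single float here, the result corresponds to the "primitive data type" return (for example `DoubleType`) that the docstring asks for.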
---