GitHub user ueshin commented on a diff in the pull request:
https://github.com/apache/spark/pull/19630#discussion_r151042155
--- Diff: python/pyspark/sql/functions.py ---
@@ -2247,16 +2142,20 @@ def pandas_udf(f=None, returnType=StringType()):
| 8| JOHN DOE| 22|
+----------+--------------+------------+
- 2. A `pandas.DataFrame` -> A `pandas.DataFrame`
+ 2. GROUP_MAP
- This udf is only used with :meth:`pyspark.sql.GroupedData.apply`.
+ A group map UDF defines transformation: A `pandas.DataFrame` -> A
`pandas.DataFrame`
The returnType should be a :class:`StructType` describing the
schema of the returned
`pandas.DataFrame`.
+ The length of the returned `pandas.DataFrame` can arbitrary.
--- End diff --
nit: `can arbitrary` -> `can be arbitrary`?
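
For context on what "arbitrary length" means in the docstring above, here is a minimal pandas-only sketch. It is an illustration and not code from the pull request. The function names `keep_adults` and `duplicate_rows` and the `name`/`age` columns are hypothetical. Each function has the `pandas.DataFrame` -> `pandas.DataFrame` shape that a group map UDF describes, and the number of rows it returns need not match the number it received.

```python
import pandas as pd


def keep_adults(pdf: pd.DataFrame) -> pd.DataFrame:
    # Hypothetical per-group function: may return fewer rows than it
    # received (possibly zero), while keeping the same columns.
    return pdf[pdf["age"] >= 18].reset_index(drop=True)


def duplicate_rows(pdf: pd.DataFrame) -> pd.DataFrame:
    # Hypothetical per-group function: returns twice as many rows as it
    # received, again with the same columns.
    return pd.concat([pdf, pdf], ignore_index=True)


# Stand-in for the rows of a single group (made-up data).
group = pd.DataFrame({"name": ["a", "b", "c"], "age": [12, 22, 35]})

shorter = keep_adults(group)
longer = duplicate_rows(group)

print(len(group), len(shorter), len(longer))
```

In both cases the returned columns stay the same and only the row count changes, which is consistent with the docstring requiring the returnType to describe the schema while leaving the length open.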