Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/18732#discussion_r142584934
--- Diff: python/pyspark/sql/group.py ---
@@ -192,7 +193,66 @@ def pivot(self, pivot_col, values=None):
jgd = self._jgd.pivot(pivot_col)
else:
jgd = self._jgd.pivot(pivot_col, values)
- return GroupedData(jgd, self.sql_ctx)
+ return GroupedData(jgd, self)
+
+ def apply(self, udf):
+ """
+ Maps each group of the current :class:`DataFrame` using a pandas
udf and returns the result
+ as a :class:`DataFrame`.
+
+ The user-function should take a `pandas.DataFrame` and return
another `pandas.DataFrame`.
--- End diff --
little nit: `user-function` -> `user-defined function`?
---
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]