Github user BryanCutler commented on a diff in the pull request:
https://github.com/apache/spark/pull/18732#discussion_r143802019
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/RelationalGroupedDataset.scala ---
@@ -435,6 +435,35 @@ class RelationalGroupedDataset protected[sql](
df.logicalPlan.output,
df.logicalPlan))
}
+
+ /**
+ * Applies a vectorized python user-defined function to each group of data.
+ * The user-defined function defines a transformation: `Pandas.DataFrame` -> `Pandas.DataFrame`.
+ * For each group, all elements in the group are passed as a `Pandas.DataFrame` and the results
+ * for all groups are combined into a new `DataFrame`.
+ *
+ * This function does not support partial aggregation, and requires shuffling all the data in
+ * the `DataFrame`.
--- End diff --
I believe that in Scaladoc the Spark DataFrame should be enclosed in double
square brackets, so that it renders as a link to the class rather than as
plain code -> [[DataFrame]]
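As a rough sketch of the suggestion (not the actual change in the PR), the doc comment might read as follows. The `DataFrame` and `GroupedData` classes and the `applyToGroups` method are hypothetical stand-ins so that the snippet compiles on its own; they are not Spark's real definitions.

```scala
// Hypothetical stand-in for org.apache.spark.sql.DataFrame, declared here
// only so the snippet is self-contained.
class DataFrame

// Hypothetical stand-in for the grouped dataset class under review.
class GroupedData {

  /**
   * Applies a user-defined function to each group of data. The results
   * for all groups are combined into a new [[DataFrame]].
   *
   * This function does not support partial aggregation, and requires
   * shuffling all the data in the [[DataFrame]].
   */
  def applyToGroups(): DataFrame = new DataFrame // placeholder body
}
```

In Scaladoc, `[[DataFrame]]` is wiki-style link syntax that resolves to the named entity, whereas backticks only render the name in a monospace font.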