Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/21699#discussion_r208479881
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/RelationalGroupedDataset.scala ---
@@ -371,25 +426,14 @@ class RelationalGroupedDataset protected[sql](
/**
* (Java-specific) Pivots a column of the current `DataFrame` and
performs the specified
- * aggregation.
- *
- * There are two versions of pivot function: one that requires the
caller to specify the list
- * of distinct values to pivot on, and one that does not. The latter is
more concise but less
- * efficient, because Spark needs to first compute the list of distinct
values internally.
+ * aggregation. This is an overloaded version of the `pivot` method with
`pivotColumn` of
+ * the `String` type.
*
- * {{{
- * // Compute the sum of earnings for each year by course with each
course as a separate column
- * df.groupBy("year").pivot("course", Arrays.<Object>asList("dotNET",
"Java")).sum("earnings");
- *
- * // Or without specifying column values (less efficient)
- * df.groupBy("year").pivot("course").sum("earnings");
- * }}}
- *
- * @param pivotColumn Name of the column to pivot.
+ * @param pivotColumn the column to pivot.
* @param values List of values that will be translated to columns in
the output DataFrame.
- * @since 1.6.0
+ * @since 2.4.0
*/
- def pivot(pivotColumn: String, values: java.util.List[Any]):
RelationalGroupedDataset = {
+ def pivot(pivotColumn: Column, values: java.util.List[Any]):
RelationalGroupedDataset = {
--- End diff --
If there's a plan for auditing it in 3.0.0, I am okay with going ahead with
`Column` but thing is, we should deprecate them first.
For the current status, I think the problem here is, this is an overloaded
version of pivot and wouldn't necessarily make the differences.
I used `pivot` heavily in the previous company before and I am pretty sure
`pivot(String, actual values)` has been a common pattern and usage so far.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]