Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/21699#discussion_r208479881 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/RelationalGroupedDataset.scala --- @@ -371,25 +426,14 @@ class RelationalGroupedDataset protected[sql]( /** * (Java-specific) Pivots a column of the current `DataFrame` and performs the specified - * aggregation. - * - * There are two versions of pivot function: one that requires the caller to specify the list - * of distinct values to pivot on, and one that does not. The latter is more concise but less - * efficient, because Spark needs to first compute the list of distinct values internally. + * aggregation. This is an overloaded version of the `pivot` method with `pivotColumn` of + * the `String` type. * - * {{{ - * // Compute the sum of earnings for each year by course with each course as a separate column - * df.groupBy("year").pivot("course", Arrays.<Object>asList("dotNET", "Java")).sum("earnings"); - * - * // Or without specifying column values (less efficient) - * df.groupBy("year").pivot("course").sum("earnings"); - * }}} - * - * @param pivotColumn Name of the column to pivot. + * @param pivotColumn the column to pivot. * @param values List of values that will be translated to columns in the output DataFrame. - * @since 1.6.0 + * @since 2.4.0 */ - def pivot(pivotColumn: String, values: java.util.List[Any]): RelationalGroupedDataset = { + def pivot(pivotColumn: Column, values: java.util.List[Any]): RelationalGroupedDataset = { --- End diff -- If there's a plan for auditing it in 3.0.0, I am okay with going ahead with `Column` but thing is, we should deprecate them first. For the current status, I think the problem here is, this is an overloaded version of pivot and wouldn't necessarily make the differences. I used `pivot` heavily in the previous company before and I am pretty sure `pivot(String, actual values)` has been a common pattern and usage so far.
--- --------------------------------------------------------------------- To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org