GitHub user maropu commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21699#discussion_r199742580
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/RelationalGroupedDataset.scala ---
    @@ -340,36 +340,52 @@ class RelationalGroupedDataset protected[sql](
     
       /**
        * Pivots a column of the current `DataFrame` and performs the specified aggregation.
    -   * There are two versions of pivot function: one that requires the caller to specify the list
    -   * of distinct values to pivot on, and one that does not. The latter is more concise but less
    -   * efficient, because Spark needs to first compute the list of distinct values internally.
        *
        * {{{
        *   // Compute the sum of earnings for each year by course with each course as a separate column
    -   *   df.groupBy("year").pivot("course", Seq("dotNET", "Java")).sum("earnings")
    -   *
    -   *   // Or without specifying column values (less efficient)
    -   *   df.groupBy("year").pivot("course").sum("earnings")
    +   *   df.groupBy($"year").pivot($"course", Seq("dotNET", "Java")).sum($"earnings")
        * }}}
        *
    -   * @param pivotColumn Name of the column to pivot.
    +   * @param pivotColumn the column to pivot.
        * @param values List of values that will be translated to columns in the output DataFrame.
    -   * @since 1.6.0
    +   * @since 2.4.0
        */
    -  def pivot(pivotColumn: String, values: Seq[Any]): RelationalGroupedDataset = {
    +  def pivot(pivotColumn: Column, values: Seq[Any]): RelationalGroupedDataset = {
    --- End diff ---
    
    To make diffs smaller, can you move this under the signature `def pivot(pivotColumn: String, values: Seq[Any])`?
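    For context, the arrangement being suggested is the usual overload pattern: keep the existing
    `String`-based signature where it was, and have it forward to the new `Column`-based overload.
    The sketch below is illustrative only (it assumes Spark's `functions.col` helper and elides the
    pivot body); it is not the PR's actual implementation:

    {{{
      import org.apache.spark.sql.{Column, functions}

      // Existing String-based signature stays in place, so the diff stays
      // small; it simply delegates to the new Column-based overload.
      def pivot(pivotColumn: String, values: Seq[Any]): RelationalGroupedDataset =
        pivot(functions.col(pivotColumn), values)

      // New Column-based overload introduced by this PR (body elided).
      def pivot(pivotColumn: Column, values: Seq[Any]): RelationalGroupedDataset = {
        // ... pivot logic ...
      }
    }}}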


---

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
