[GitHub] spark pull request #21699: [SPARK-24722][SQL] pivot() with Column type argum...

HyukjinKwon Wed, 01 Aug 2018 19:52:33 -0700

Github user HyukjinKwon commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21699#discussion_r207088444
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/RelationalGroupedDataset.scala ---
    @@ -339,29 +400,30 @@ class RelationalGroupedDataset protected[sql](
     
       /**
        * Pivots a column of the current `DataFrame` and performs the specified aggregation.
    -   * There are two versions of pivot function: one that requires the caller to specify the list
    -   * of distinct values to pivot on, and one that does not. The latter is more concise but less
    -   * efficient, because Spark needs to first compute the list of distinct values internally.
    +   * This is an overloaded version of the `pivot` method with `pivotColumn` of the `String` type.
        *
        * {{{
        *   // Compute the sum of earnings for each year by course with each course as a separate column
    -   *   df.groupBy("year").pivot("course", Seq("dotNET", "Java")).sum("earnings")
    -   *
    -   *   // Or without specifying column values (less efficient)
    -   *   df.groupBy("year").pivot("course").sum("earnings")
    +   *   df.groupBy($"year").pivot($"course", Seq("dotNET", "Java")).sum($"earnings")
        * }}}
        *
    -   * @param pivotColumn Name of the column to pivot.
    +   * @param pivotColumn the column to pivot.
        * @param values List of values that will be translated to columns in the output DataFrame.
    -   * @since 1.6.0
    +   * @since 2.4.0
        */
    -  def pivot(pivotColumn: String, values: Seq[Any]): RelationalGroupedDataset = {
    +  def pivot(pivotColumn: Column, values: Seq[Any]): RelationalGroupedDataset = {
    +    import org.apache.spark.sql.functions.struct
         groupType match {
           case RelationalGroupedDataset.GroupByType =>
    +        val pivotValues = values.map {
    --- End diff --
    
    I hope the last commit gets reverted and we pursue that change separately from this PR, if @maryannxue is happy with that too.
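
For readers following the diff, here is a minimal sketch of how the `pivot(pivotColumn: Column, values: Seq[Any])` overload shown above might be called. It assumes Spark 2.4.0 or later on the classpath (the version named in the `@since` tag). The object name, the helper `earningsByCourse`, and the sample rows are hypothetical and are not part of the pull request. The aggregation is written as `agg(sum(col("earnings")))`, since `RelationalGroupedDataset.sum` takes column names as strings.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, sum}

// Hypothetical illustration only; names and data are not from the PR.
object PivotWithColumnSketch {

  // Pivot on a Column (not a String) with an explicit list of pivot values.
  // The result has the columns: year, dotNET, Java.
  def earningsByCourse(df: DataFrame): DataFrame =
    df.groupBy(col("year"))
      .pivot(col("course"), Seq("dotNET", "Java"))
      .agg(sum(col("earnings")))

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("pivot-with-column-sketch")
      .getOrCreate()
    import spark.implicits._

    // Made-up sample rows: (year, course, earnings).
    val df = Seq(
      (2012, "dotNET", 10000),
      (2012, "Java", 20000),
      (2013, "dotNET", 48000),
      (2013, "Java", 30000)
    ).toDF("year", "course", "earnings")

    earningsByCourse(df).orderBy(col("year")).show()
    spark.stop()
  }
}
```

Passing the pivot values explicitly follows the reasoning in the removed doc lines: when the values are omitted, Spark first has to compute the distinct values of the pivot column internally, which is less efficient.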



