[GitHub] spark pull request #21611: [SPARK-24569][SQL] Aggregator with output type Op...

viirya Wed, 27 Jun 2018 07:11:42 -0700

Github user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21611#discussion_r198509965
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/DatasetAggregatorSuite.scala ---
    @@ -333,4 +406,28 @@ class DatasetAggregatorSuite extends QueryTest with SharedSQLContext {
           df.groupBy($"i").agg(VeryComplexResultAgg.toColumn),
           Row(1, Row(Row(1, "a"), Row(1, "a"))) :: Row(2, Row(Row(2, "bc"), Row(2, "bc"))) :: Nil)
       }
    +
    +  test("SPARK-24569: Aggregator with output type Option[Boolean] creates column of type Row") {
    +    val df = Seq(
    +      OptionBooleanData("bob", Some(true)),
    +      OptionBooleanData("bob", Some(false)),
    +      OptionBooleanData("bob", None)).toDF()
    +    val group = df
    +      .groupBy("name")
    --- End diff --
    
    Yes, if you use a similar `Aggregator` with `groupByKey`, you get a struct too:
    
    ```scala
    val df = Seq(
      OptionBooleanData("bob", Some(true)),
      OptionBooleanData("bob", Some(false)),
      OptionBooleanData("bob", None)).toDF()
    val df2 = df.groupByKey((r: Row) => r.getString(0))
      .agg(OptionBooleanAggregator("isGood").toColumn)
    df2.printSchema
    ```
    ```
    root
     |-- value: string (nullable = true)
     |-- OptionBooleanAggregator(org.apache.spark.sql.Row): struct (nullable = true)
     |    |-- value: boolean (nullable = true)
    ```
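
The snippet above relies on `OptionBooleanData` and `OptionBooleanAggregator`, which are defined in the test suite and not shown in this message. For readers who want to try the `groupByKey` case on their own, here is a minimal sketch of what such definitions might look like. Everything beyond the two names and the `name`/`isGood` columns is an assumption: the OR-style merge logic, the use of `ExpressionEncoder` for both encoders, and the `StructOutputDemo` wrapper are illustrative and may differ from what the pull request actually contains. The printed schema depends on the Spark version, so no output is claimed here.

```scala
import org.apache.spark.sql.{Encoder, Row, SparkSession}
import org.apache.spark.sql.catalyst.encoders.ExpressionEncoder
import org.apache.spark.sql.expressions.Aggregator

// Assumed shape of the test data; only the type name and the "isGood"
// column name appear in the comment above.
case class OptionBooleanData(name: String, isGood: Option[Boolean])

// Hypothetical aggregator: logical OR over the non-null values of `colName`,
// yielding None when every input value is null.
case class OptionBooleanAggregator(colName: String)
    extends Aggregator[Row, Option[Boolean], Option[Boolean]] {

  override def zero: Option[Boolean] = None

  override def reduce(buffer: Option[Boolean], row: Row): Option[Boolean] = {
    val index = row.fieldIndex(colName)
    val value: Option[Boolean] =
      if (row.isNullAt(index)) None else Some(row.getBoolean(index))
    merge(buffer, value)
  }

  override def merge(b1: Option[Boolean], b2: Option[Boolean]): Option[Boolean] =
    (b1, b2) match {
      case (Some(x), Some(y)) => Some(x || y)
      case _                  => b1.orElse(b2)
    }

  override def finish(reduction: Option[Boolean]): Option[Boolean] = reduction

  // ExpressionEncoder lives in an internal catalyst package; using it here
  // is an assumption that holds for the Spark versions that expose it.
  override def bufferEncoder: Encoder[Option[Boolean]] =
    ExpressionEncoder[Option[Boolean]]()
  override def outputEncoder: Encoder[Option[Boolean]] =
    ExpressionEncoder[Option[Boolean]]()
}

object StructOutputDemo {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("spark-24569-sketch")
      .getOrCreate()
    import spark.implicits._

    val df = Seq(
      OptionBooleanData("bob", Some(true)),
      OptionBooleanData("bob", Some(false)),
      OptionBooleanData("bob", None)).toDF()

    // Typed aggregation keyed on the "name" column, as in the comment above.
    val df2 = df.groupByKey((r: Row) => r.getString(0))
      .agg(OptionBooleanAggregator("isGood").toColumn)

    // Inspect whether the aggregate column is a struct or a plain boolean.
    df2.printSchema()
    spark.stop()
  }
}
```

Running `StructOutputDemo.main` against a given Spark build and comparing the printed schema with the one quoted above is one way to check whether that build still wraps the `Option[Boolean]` result in a struct.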


