Github user maropu commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20174#discussion_r160040256
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/DataFrameAggregateSuite.scala ---
    @@ -666,4 +666,16 @@ class DataFrameAggregateSuite extends QueryTest with SharedSQLContext {
           assert(exchangePlans.length == 1)
         }
       }
    +
    +  test("SPARK-22951: aggregation on empty data frame should only return initial values") {
    +    // non code gen
    +    withSQLConf(SQLConf.WHOLESTAGE_CODEGEN_ENABLED.key -> "false") {
    +      assert(spark.emptyDataFrame.dropDuplicates.count == 0)
    +    }
    +
    +    // code gen
    +    withSQLConf(SQLConf.WHOLESTAGE_CODEGEN_ENABLED.key -> "true") {
    +      assert(spark.emptyDataFrame.dropDuplicates.count == 0)
    +    }
    +  }
    --- End diff --
    
    ```
    Seq("true", "false").foreach { codegen =>
      withSQLConf(SQLConf.WHOLESTAGE_CODEGEN_ENABLED.key -> codegen) {
        assert(spark.emptyDataFrame.dropDuplicates.count == 0)
      }
    }
    ```
    BTW, I think checking both the codegen and non-codegen paths is a common pattern, so it might be better to add a helper function to a test utility class, like:
    ```
    checkExecution {
      assert(spark.emptyDataFrame.dropDuplicates.count == 0)
    }
    ```
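
    A minimal sketch of what such a helper might look like, assuming it lives in a trait mixed into suites that already extend `SQLTestUtils`. The trait name `CodegenTestHelper` and its package are hypothetical; only `withSQLConf` and `SQLConf.WHOLESTAGE_CODEGEN_ENABLED` come from the existing test code:
    ```scala
    package org.apache.spark.sql.test

    import org.apache.spark.sql.internal.SQLConf

    // Hypothetical helper trait; the name and location are assumptions, not existing Spark API.
    trait CodegenTestHelper extends SQLTestUtils {

      // Runs `body` twice: once with whole-stage codegen enabled, once with it disabled.
      protected def checkExecution(body: => Unit): Unit = {
        Seq("true", "false").foreach { codegen =>
          withSQLConf(SQLConf.WHOLESTAGE_CODEGEN_ENABLED.key -> codegen) {
            body
          }
        }
      }
    }
    ```
    Because `body` is a by-name parameter, it is re-evaluated under each configuration, so the assertion really runs on both execution paths.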

