Github user dongjoon-hyun commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22597#discussion_r225295336
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/orc/OrcFilterSuite.scala ---
    @@ -383,4 +385,17 @@ class OrcFilterSuite extends OrcTest with SharedSQLContext {
           )).get.toString
         }
       }
    +
    +  test("SPARK-25579 ORC PPD should support column names with dot") {
    +    import testImplicits._
    +
    +    withSQLConf(SQLConf.ORC_FILTER_PUSHDOWN_ENABLED.key -> "true") {
    +      withTempDir { dir =>
    +        val path = new File(dir, "orc").getCanonicalPath
    +        Seq((1, 2), (3, 4)).toDF("col.dot.1", "col.dot.2").write.orc(path)
    --- End diff ---
    
    We are already using the default parallelism from `TestSparkSession`, so even these two rows are written to [separate output files](https://github.com/apache/spark/pull/22597#discussion_r225004937).
    
    If you are concerned about possible flakiness, we can increase the number of rows to `10`, call `repartition(10)`, and check `assert(actual < 10)` as you did [before](https://github.com/apache/spark/blob/5d726b865948f993911fd5b9730b25cfa94e16c7/sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetFilterSuite.scala#L1016-L1040). Do you want that?
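    
    For reference, a minimal sketch of that variant (assuming the `stripSparkFilter` helper from `SQLTestUtils`; the test name and the exact predicate are illustrative, and the column names reuse the ones from this test):
    
    ```scala
    test("SPARK-25579 ORC PPD should support column names with dot (many files)") {
      import testImplicits._
    
      withSQLConf(SQLConf.ORC_FILTER_PUSHDOWN_ENABLED.key -> "true") {
        withTempDir { dir =>
          val path = new File(dir, "orc").getCanonicalPath
          // Write 10 rows into 10 separate ORC files so that pushdown can skip most of them.
          (1 to 10).map(i => (i, i * 2)).toDF("col.dot.1", "col.dot.2")
            .repartition(10).write.orc(path)
    
          val df = spark.read.orc(path).where("`col.dot.1` = 1")
          // stripSparkFilter removes the Spark-side Filter, so the count reflects
          // only the rows the ORC reader returned after predicate pushdown.
          val actual = stripSparkFilter(df).count()
          assert(actual < 10)
          assert(df.count() === 1)
        }
      }
    }
    ```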

