[GitHub] spark pull request #20525: SPARK-23271 Parquet output contains only _SUCCESS...

gatorsmile Tue, 06 Feb 2018 23:58:00 -0800

Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20525#discussion_r166540368
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/FileFormatWriterSuite.scala ---
    @@ -32,6 +33,24 @@ class FileFormatWriterSuite extends QueryTest with SharedSQLContext {
         }
       }
     
    +  test("SPARK-23271 empty dataframe when saved in parquet should write a metadata only file") {
    +    withTempDir { inputPath =>
    +      withTempPath { outputPath =>
    +        val anySchema = StructType(StructField("anyName", StringType) :: Nil)
    +        val df = spark.read.schema(anySchema).csv(inputPath.toString)
    +        df.write.parquet(outputPath.toString)
    +        val partFiles = outputPath.listFiles()
    +          .filter(f => f.isFile && !f.getName.startsWith(".") && !f.getName.startsWith("_"))
    +        assert(partFiles.length === 1)
    +
    +        // Now read the file.
    +        val  df1 = spark.read.parquet(outputPath.toString)
    --- End diff --
    
    Nit: there is an extra space between `val` and `df1`.


