GitHub user eatoncys opened a pull request:

    https://github.com/apache/spark/pull/23010

    [SPARK-26012][SQL]Null and '' values should not cause dynamic partition failure of string types

    ## What changes were proposed in this pull request?
    
    A dynamic partition write fails when both '' and null occur as values of the same string partition column in a single write.
    For example, the test below fails before this PR:
    
      test("Null and '' values should not cause dynamic partition failure of string types") {
        withTable("t1", "t2") {
          spark.range(3).write.saveAsTable("t1")
          spark.sql("select id, cast(case when id = 1 then '' else null end as string) as p" +
            " from t1").write.partitionBy("p").saveAsTable("t2")
          checkAnswer(spark.table("t2").sort("id"), Seq(Row(0, null), Row(1, null), Row(2, null)))
        }
      }
    
    The error is 'org.apache.hadoop.fs.FileAlreadyExistsException: File already exists'.
    This PR guards against such file conflicts by renaming the file when a conflict occurs.
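
The message does not spell out why the two values conflict. One plausible reading, consistent with the test expecting '' to come back as null, is that both null and '' are written to the same default partition directory, so two distinct partition values end up targeting one output location. The sketch below is a simplified model of that idea only. `PartitionPathSketch` and `partitionDir` are hypothetical names, and the assumption that both values map to `__HIVE_DEFAULT_PARTITION__` is not taken from this PR.

```scala
// Simplified model, NOT Spark's actual implementation.
// Assumption: null and '' both map to the default partition directory name.
object PartitionPathSketch {
  // Assumed default partition directory name.
  val DefaultPartitionName = "__HIVE_DEFAULT_PARTITION__"

  // Hypothetical helper: directory name for one string partition value.
  def partitionDir(column: String, value: String): String = {
    val shown = if (value == null || value.isEmpty) DefaultPartitionName else value
    s"$column=$shown"
  }

  def main(args: Array[String]): Unit = {
    // Partition values produced by the test query for id = 0, 1, 2.
    val values = Seq[String](null, "", null)
    val dirs = values.map(partitionDir("p", _))

    // Two distinct partition values, but only one distinct directory.
    println(s"distinct values: ${values.distinct.size}")
    println(s"distinct directories: ${dirs.distinct.size}")
  }
}
```

Under this model, a writer that opens one output file per distinct partition value would try to create a file in the same directory twice, which would match the reported exception.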
    
    
    
    ## How was this patch tested?
    Newly added test.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/eatoncys/spark dynamicPartition

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/23010.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #23010
    
----
commit 1f18e2786a26eb64c52925d8ecff2d6a2295ca16
Author: 10129659 <chen.yanshan@...>
Date:   2018-11-12T04:41:53Z

    Null and '' values should not cause dynamic partition failure of string types

----


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org
