GitHub user gatorsmile opened a pull request:

    https://github.com/apache/spark/pull/16583

    [SPARK-19129] [SQL] SessionCatalog: Disallow empty part col values in partition spec

    ### What changes were proposed in this pull request?
    Empty partition column values are not valid in a partition specification. Before this PR, Spark accepted them, and the Hive metastore neither detects nor rejects them. As a result, users hit the following unexpected behavior.
    
    ```Scala
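    // Create a table partitioned by partCol1.
    val df = spark.createDataFrame(Seq((0, "a"), (1, "b"))).toDF("partCol1", "name")
    df.write.mode("overwrite").partitionBy("partCol1").saveAsTable("partitionedTable")
    // Drop a partition using an empty value for the partition column.
    spark.sql("alter table partitionedTable drop partition(partCol1='')")
    // Before this PR, the whole table has been dropped by the statement above.
    spark.table("partitionedTable").show()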
    ```
    
    In the above example, the WHOLE table is DROPPED when the partition spec contains a single partition column with an empty value.
    
    When the table has more than one partition column, the Hive metastore APIs simply ignore the columns with empty values and treat the spec as a partial spec. This is also unexpected and does not match actual Hive behavior. This PR disallows such invalid partition specs in the `SessionCatalog` APIs.
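
    The check described above can be sketched in plain Scala, without any Spark dependency. This is only an illustration of the idea, not the code in this PR: the object name `PartitionSpecCheck`, the `validate` method, and the error message are all hypothetical, and a partition spec is assumed to be a map from column name to value.

    ```scala
    // Hypothetical helper, not part of Spark: rejects partition specs
    // that contain a null or empty value for any partition column.
    object PartitionSpecCheck {
      // Assumed shape of a partition spec: column name -> value.
      type PartitionSpec = Map[String, String]

      /** Returns Right(spec) if every value is non-empty, otherwise Left(error message). */
      def validate(spec: PartitionSpec): Either[String, PartitionSpec] = {
        // Collect the names of all columns whose value is null or empty.
        val emptyCols = spec.collect {
          case (col, value) if value == null || value.isEmpty => col
        }.toSeq.sorted

        if (emptyCols.isEmpty) Right(spec)
        else Left(s"Invalid partition spec: empty value for column(s) ${emptyCols.mkString(", ")}")
      }
    }
    ```

    Under these assumptions, `PartitionSpecCheck.validate(Map("partCol1" -> ""))` yields a `Left`, so the spec would be rejected before it ever reaches the metastore, while `Map("partCol1" -> "0")` passes through unchanged.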
    
    ### How was this patch tested?
    Added test cases

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/gatorsmile/spark disallowEmptyPartColValue

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/16583.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #16583
    
----
commit c1cdcad23dfacc11cb97489e468c74266cc7d57e
Author: gatorsmile <[email protected]>
Date:   2017-01-14T00:24:49Z

    fix.

commit 4d32864bc29534027d3d04df58ff445e89cd0d2f
Author: gatorsmile <[email protected]>
Date:   2017-01-14T00:32:00Z

    fix message.

----


