GitHub user gatorsmile opened a pull request:
https://github.com/apache/spark/pull/16583
[SPARK-19129] [SQL] SessionCatalog: Disallow empty part col values in
partition spec
### What changes were proposed in this pull request?
Empty partition column values are not valid in a partition specification.
Before this PR, Spark accepted them, and the Hive metastore neither detects
nor rejects them. As a result, users hit the following surprising behavior.
```Scala
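// Assumes a spark-shell session, where `spark` is the active SparkSession.
val df = spark.createDataFrame(Seq((0, "a"), (1, "b"))).toDF("partCol1", "name")
df.write.mode("overwrite").partitionBy("partCol1").saveAsTable("partitionedTable")
// The partition spec below has an empty value for its only partition column.
spark.sql("alter table partitionedTable drop partition(partCol1='')")
spark.table("partitionedTable").show()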
```
In the above example, the WHOLE table is DROPPED when users specify a
partition spec containing only one partition column with an empty value.
When the partition spec contains more than one partition column, the Hive
metastore APIs simply ignore the columns with empty values and treat the spec
as a partial spec. This is also unexpected, and it does not match actual Hive
behavior. This PR disallows such invalid partition specs in the
`SessionCatalog` APIs.
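For illustration only, the kind of check described above can be sketched as follows. This is a minimal standalone sketch, not the code in this PR: the object `PartitionSpecCheck` and the method `requireNonEmptyValues` are hypothetical names, the partition spec is assumed to be a plain `Map[String, String]`, and the sketch throws `IllegalArgumentException` via `require`, which may differ from the exception type the patch actually uses.

```Scala
// Hypothetical helper, for illustration; not the actual SessionCatalog code.
object PartitionSpecCheck {
  // Assumption: a partition spec maps partition column names to string values.
  type TablePartitionSpec = Map[String, String]

  /** Returns the spec unchanged, or throws if any column has a null or empty value. */
  def requireNonEmptyValues(spec: TablePartitionSpec): TablePartitionSpec = {
    // Collect the names of all partition columns whose value is null or empty.
    val emptyCols = spec.collect {
      case (col, value) if value == null || value.isEmpty => col
    }
    require(
      emptyCols.isEmpty,
      s"Partition spec is invalid: empty value for column(s) ${emptyCols.mkString(", ")}")
    spec
  }
}
```

With a check like this applied before the spec reaches the metastore, a spec such as `Map("partCol1" -> "")` is rejected up front instead of being interpreted as a partial (or empty) spec.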
### How was this patch tested?
Added test cases
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/gatorsmile/spark disallowEmptyPartColValue
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/16583.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #16583
----
commit c1cdcad23dfacc11cb97489e468c74266cc7d57e
Author: gatorsmile <[email protected]>
Date: 2017-01-14T00:24:49Z
fix.
commit 4d32864bc29534027d3d04df58ff445e89cd0d2f
Author: gatorsmile <[email protected]>
Date: 2017-01-14T00:32:00Z
fix message.
----