GitHub user ericl opened a pull request:
https://github.com/apache/spark/pull/16088
[SPARK-18659] [SQL] Incorrect behaviors in overwrite table for datasource
tables
## What changes were proposed in this pull request?
Two bugs are addressed here:
1. INSERT OVERWRITE TABLE sometimes crashed when catalog partition
management was enabled. When dropping partitions after an overwrite
operation, the Hive client attempts to delete the partition files. If the
entire partition directory had already been removed, this deletion failed.
The PR fixes this by adding a flag to control whether the Hive client should
attempt to delete files.
2. The static partition spec for OVERWRITE TABLE was not correctly resolved
to the case-sensitive original partition names. This resulted in the entire
table being overwritten if the partition names in your query were not
capitalized exactly as in the table definition.
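
To illustrate the second fix, here is a minimal sketch of resolving a
user-supplied static partition spec to the table's original partition column
names. It is not the code in this PR: the function name
`resolve_static_partition_spec`, its parameters, and the example column names
are hypothetical, and it assumes partition names are matched
case-insensitively.

```python
def resolve_static_partition_spec(spec, partition_columns):
    """Map user-supplied partition keys onto the original column names.

    spec: dict of partition key -> value as written in the query
          (hypothetical input shape).
    partition_columns: partition column names as stored in the catalog.
    """
    # Index the catalog names by their lowercase form for lookup.
    by_lower = {name.lower(): name for name in partition_columns}
    resolved = {}
    for key, value in spec.items():
        original = by_lower.get(key.lower())
        if original is None:
            # Fail loudly instead of silently ignoring an unknown key.
            raise ValueError(
                f"{key!r} is not a partition column of this table"
            )
        resolved[original] = value
    return resolved


# The query wrote "partcol", the table was defined with "partCol".
print(resolve_static_partition_spec({"partcol": "1"}, ["partCol", "region"]))
```

Raising on an unknown key is a design choice of this sketch: a key that is
silently dropped leaves a less restrictive spec, which is the kind of outcome
that can widen an overwrite beyond the intended partition.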
cc @yhuai @cloud-fan
## How was this patch tested?
Unit tests. Surprisingly, the existing overwrite table tests did not catch
these edge cases.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/ericl/spark spark-18659
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/16088.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #16088
----
commit f3875ae3f3f034a31ce934ed88b8ed0036e75b8b
Author: Eric Liang <[email protected]>
Date: 2016-11-30T21:59:17Z
Wed Nov 30 13:59:17 PST 2016
----