GitHub user cloud-fan opened a pull request:
[SPARK-18675][SQL] CTAS for hive serde table should work for all hive versions
## What changes were proposed in this pull request?
Before Hive 1.1, when inserting into a table, Hive creates the staging
directory under a common scratch directory. After the write finishes, Hive
simply empties the table directory and moves the staging directory into it.
Starting with Hive 1.1, Hive creates the staging directory under the table
directory. When moving the staging directory to the table directory, Hive
still empties the table directory, but excludes the staging directory there.
In `InsertIntoHiveTable`, we simply copied the code from Hive 1.2, which
means we always create the staging directory under the table directory, no
matter what the Hive version is. This causes problems if the Hive version is
prior to 1.1, because Hive removes the staging directory along with everything
else when it empties the table directory.
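The failure can be reproduced with a small filesystem model. The sketch below is not Spark or Hive code: `empty_dir`, `load_table`, `run`, and the `.hive-staging_demo` directory name are hypothetical, and the `exclude_staging` flag only stands in for the difference between Hive before 1.1 and Hive 1.1 or later as described above.

```python
import shutil
import tempfile
from pathlib import Path


def empty_dir(directory: Path, keep: frozenset = frozenset()) -> None:
    """Delete every child of `directory` except the names listed in `keep`."""
    for child in directory.iterdir():
        if child.name in keep:
            continue
        if child.is_dir():
            shutil.rmtree(child)
        else:
            child.unlink()


def load_table(table_dir: Path, staging_dir: Path, exclude_staging: bool) -> bool:
    """Model the final load step: empty the table directory, then move staged files in.

    exclude_staging=False models Hive before 1.1 (assumption for illustration),
    exclude_staging=True models Hive 1.1 and later.
    Returns False if the staged data was deleted before it could be moved.
    """
    staged_inside_table = staging_dir.parent == table_dir
    keep = frozenset({staging_dir.name}) if exclude_staging and staged_inside_table else frozenset()
    empty_dir(table_dir, keep)
    if not staging_dir.exists():
        return False  # the staging directory was wiped together with the old data
    for staged_file in staging_dir.iterdir():
        shutil.move(str(staged_file), str(table_dir / staged_file.name))
    staging_dir.rmdir()
    return True


def run(staging_under_table: bool, exclude_staging: bool) -> bool:
    """Set up a throwaway table and staging directory, then run the modelled load."""
    with tempfile.TemporaryDirectory() as root:
        table_dir = Path(root) / "warehouse" / "t"
        table_dir.mkdir(parents=True)
        (table_dir / "old_data").write_text("stale")
        parent = table_dir if staging_under_table else Path(root) / "scratch"
        staging_dir = parent / ".hive-staging_demo"  # hypothetical name
        staging_dir.mkdir(parents=True)
        (staging_dir / "part-00000").write_text("new")
        ok = load_table(table_dir, staging_dir, exclude_staging)
        return ok and (table_dir / "part-00000").read_text() == "new"
```

In this model, `run(staging_under_table=True, exclude_staging=False)` is the problematic combination: new-style staging placement paired with old-style emptying of the table directory.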
This PR copies the code from Hive 0.13, so that we have two branches for
creating the staging directory. If the Hive version is prior to 1.1, we take
the old-style branch (i.e. create the staging directory under a common scratch
directory); otherwise, we take the new-style branch (i.e. create the staging
directory under the table directory).
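The version check described above can be sketched roughly as follows. This is an illustration of the branching only, not the code in this PR: `staging_dir_for`, its parameters, and the `.hive-staging_` prefix are hypothetical, and the Hive version is assumed to be available as a `(major, minor)` tuple.

```python
from pathlib import PurePosixPath
from typing import Tuple


def staging_dir_for(
    hive_version: Tuple[int, int],
    table_dir: str,
    scratch_dir: str,
    job_id: str,
) -> str:
    """Pick the staging directory location based on the Hive version.

    All names here are illustrative assumptions, not Spark's actual API.
    """
    if hive_version < (1, 1):
        # Old-style branch: stage under the common scratch directory, so that
        # emptying the table directory cannot delete the staged files.
        return str(PurePosixPath(scratch_dir) / job_id)
    # New-style branch: stage under the table directory itself.
    return str(PurePosixPath(table_dir) / f".hive-staging_{job_id}")
```

Comparing tuples keeps the check readable and puts Hive 1.1 itself in the new-style branch, matching the "prior to 1.1" wording.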
## How was this patch tested?
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/cloud-fan/spark hive-0.13
Alternatively you can review and apply these changes as the patch at:
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #16104
Author: Wenchen Fan <wenc...@databricks.com>
CTAS for hive serde table should work for all hive versions