GitHub user cloud-fan opened a pull request:
https://github.com/apache/spark/pull/20527
[SPARK-23348][SQL] append data using saveAsTable should adjust the data types
## What changes were proposed in this pull request?
When inserting or appending data into an existing table, Spark should adjust
the data types of the input query to match the table schema, or fail fast if
the types cannot be cast.
There are three ways to insert/append data: the SQL API,
`DataFrameWriter.insertInto`, and `DataFrameWriter.saveAsTable`. The first two
create an `InsertIntoTable` plan, while the last creates a `CreateTable` plan.
However, we only adjust the input query's data types for `InsertIntoTable`, so
users may hit confusing errors when appending data using `saveAsTable`. See the
JIRA ticket for the error case, and the sketch below.
This PR fixes this bug by adjusting data types for `CreateTable` too.
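A rough illustration of the three code paths, as a spark-shell style sketch. The table name `t`, the column `i`, and the INT/LONG type mismatch here are illustrative assumptions; the actual reproduction is in the JIRA ticket:

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().master("local[*]").getOrCreate()
import spark.implicits._

// Create a table whose column `i` has type INT.
Seq(1, 2, 3).toDF("i").write.saveAsTable("t")

// SQL API -> InsertIntoTable plan: the LONG value is cast to INT.
spark.sql("INSERT INTO t SELECT 4L")

// DataFrameWriter.insertInto -> InsertIntoTable plan: also adjusted.
Seq(5L).toDF("i").write.insertInto("t")

// DataFrameWriter.saveAsTable in append mode -> CreateTable plan.
// Before this fix, the input's LONG column was not adjusted to the table's
// INT column, producing a confusing error; with the fix it is cast like the
// other two paths, or the write fails fast if the cast is not allowed.
Seq(6L).toDF("i").write.mode("append").saveAsTable("t")
```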
## How was this patch tested?
A new test was added; a rough sketch of what it could look like is below.
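A minimal sketch only, assuming a `QueryTest`-style suite where the usual `withTable` and `checkAnswer` helpers and the `Row` import are available; the test name and body are assumptions, not copied from the PR:

```scala
test("SPARK-23348: append data using saveAsTable should adjust data types") {
  withTable("t") {
    // Table column `i` has type INT.
    Seq(1, 2, 3).toDF("i").write.saveAsTable("t")
    // The LONG input column should be cast down to the table's INT column.
    Seq(4L).toDF("i").write.mode("append").saveAsTable("t")
    checkAnswer(spark.table("t"), Seq(1, 2, 3, 4).map(Row(_)))
  }
}
```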
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/cloud-fan/spark saveAsTable
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/20527.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #20527
----
commit ad19125ab54439979eb4aa2bebf2cb8c9c85551e
Author: Wenchen Fan <wenchen@...>
Date: 2018-02-07T08:34:40Z
append data using saveAsTable should adjust the data types
----