GitHub user misutoth opened a pull request:
https://github.com/apache/spark/pull/22532
[SPARK-20845][SQL] Support specification of column names in INSERT INTO
command.
## What changes were proposed in this pull request?
One can specify a list of columns for an INSERT INTO command. The columns
are listed in parentheses immediately after the table name, and the query's
output columns are then matched to the table columns in that order.
```
scala> sql("CREATE TABLE t (s string, i int)")
scala> sql("INSERT INTO t values ('first', 1)")
scala> sql("INSERT INTO t (i, s) values (2, 'second')")
scala> sql("SELECT * FROM t").show
+------+---+
| s| i|
+------+---+
| first| 1|
|second| 2|
+------+---+
scala>
```
In the above example the _second_ insertion uses the new functionality.
The number and its associated string are given in reverse order, `(2, 'second')`,
matching the column list specified for the table, `(i, s)`. The result can
be seen at the end of the command list. Intermediate output of the commands is
omitted for brevity.
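The core of the feature is a reordering step: values supplied in the order of the user-given column list are rearranged into the table's declared column order before insertion. A minimal sketch of that step in plain Python (independent of Spark; the function name and error handling here are hypothetical, not taken from the patch):

```python
def reorder_values(table_columns, insert_columns, values):
    """Map values given in insert_columns order back to table_columns order."""
    if len(insert_columns) != len(values):
        raise ValueError("column list and VALUES clause differ in length")
    by_name = dict(zip(insert_columns, values))
    missing = [c for c in table_columns if c not in by_name]
    if missing:
        raise ValueError(f"no value supplied for columns: {missing}")
    return [by_name[c] for c in table_columns]

# Table t (s string, i int); INSERT INTO t (i, s) VALUES (2, 'second')
row = reorder_values(["s", "i"], ["i", "s"], [2, "second"])
print(row)  # ['second', 2]
```

This sketch assumes every table column must receive a value; how the actual patch handles omitted columns (e.g. filling defaults or nulls) is not shown in the description above.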
## How was this patch tested?
InsertSuite (in both the source and hive sub-packages) were extended with
tests exercising the specification of column names in INSERT INTO commands.
Also ran the above sample, and ran the tests in `sql`.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/misutoth/spark insert-into-columns
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/22532.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #22532
----
commit 1dda672d336b906ecc133f468435b4cf38859e2d
Author: Mihaly Toth <misutoth@...>
Date: 2018-03-20T06:13:01Z
[SPARK-20845][SQL] Support specification of column names in INSERT INTO
command.
----