[
https://issues.apache.org/jira/browse/HIVE-9481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lefty Leverenz updated HIVE-9481:
---------------------------------
Labels: (was: TODOC1.2)
> allow column list specification in INSERT statement
> ---------------------------------------------------
>
> Key: HIVE-9481
> URL: https://issues.apache.org/jira/browse/HIVE-9481
> Project: Hive
> Issue Type: Bug
> Components: Parser, Query Processor, SQL
> Affects Versions: 0.14.0
> Reporter: Eugene Koifman
> Assignee: Eugene Koifman
> Fix For: 1.2.0
>
> Attachments: HIVE-9481.2.patch, HIVE-9481.4.patch, HIVE-9481.5.patch,
> HIVE-9481.6.patch, HIVE-9481.patch
>
>
> Given a table FOO(a int, b int, c int), ANSI SQL supports insert into
> FOO(c,b) select x,y from T. The expectation is that 'x' is written to column
> 'c' and 'y' is written column 'b' and 'a' is set to NULL, assuming column 'a'
> is NULLABLE.
> Hive does not support this. In Hive one has to ensure that the data
> producing statement has a schema that matches target table schema.
> Since Hive doesn't support DEFAULT value for columns in CREATE TABLE, when
> target schema is explicitly provided, missing columns will be set to NULL if
> they are NULLABLE, otherwise an error will be raised.
> If/when DEFAULT clause is supported, this can be enhanced to set default
> value rather than NULL.
> Thus, given {noformat}
> create table source (a int, b int);
> create table target (x int, y int, z int);
> create table target2 (x int, y int, z int);
> {noformat}
> {noformat}insert into target(y,z) select * from source;{noformat}
> will mean
> {noformat}insert into target select null as x, a, b from source;{noformat}
> and
> {noformat}insert into target(z,y) select * from source;{noformat}
> will meant
> {noformat}insert into target select null as x, b, a from source;{noformat}
> Also,
> {noformat}
> from source
> insert into target(y,z) select null as x, *
> insert into target2(y,z) select null as x, source.*;
> {noformat}
> and for partitioned tables, given
> {noformat}
> Given:
> CREATE TABLE pageviews (userid VARCHAR(64), link STRING, "from" STRING)
> PARTITIONED BY (datestamp STRING) CLUSTERED BY (userid) INTO 256 BUCKETS
> STORED AS ORC;
> INSERT INTO TABLE pageviews PARTITION (datestamp = '2014-09-23')(userid,link)
>
> VALUES ('jsmith', 'mail.com');
> {noformat}
> And dynamic partitioning
> {noformat}
> INSERT INTO TABLE pageviews PARTITION (datestamp)(userid,datestamp,link)
> VALUES ('jsmith', '2014-09-23', 'mail.com');
> {noformat}
> In all cases, the schema specification contains columns of the target table
> which are matched by position to the values produced by VALUES clause/SELECT
> statement. If the producer side provides values for a dynamic partition
> column, the column should be in the specified schema. Static partition
> values are part of the partition spec and thus are not produced by the
> producer and should not be part of the schema specification.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)