[jira] [Commented] (HIVE-9481) allow column list specification in INSERT statement

Alan Gates (JIRA) Mon, 09 Feb 2015 13:47:34 -0800

    [ 
https://issues.apache.org/jira/browse/HIVE-9481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14312941#comment-14312941
 ]


Alan Gates commented on HIVE-9481:
----------------------------------

In SemanticAnalyzer.doPhase1 you have a comment "for partitioned tables, should 
we use partition rather than table here?"  No, you're correct to use table, as 
you cannot always know which partition to use (dynamic partitioning) or you may 
be inserting into multiple partitions.

If I read the patch correctly it does not yet support dynamic partitions.  
That's fine, but I didn't see any guards to throw an error if the user attempts 
to use both together.  

You should add some .q file tests to make sure this works end-to-end.

> allow column list specification in INSERT statement
> ---------------------------------------------------
>
>                 Key: HIVE-9481
>                 URL: https://issues.apache.org/jira/browse/HIVE-9481
>             Project: Hive
>          Issue Type: Bug
>          Components: Parser, Query Processor, SQL
>    Affects Versions: 0.14.0
>            Reporter: Eugene Koifman
>            Assignee: Eugene Koifman
>         Attachments: HIVE-9481.2.patch, HIVE-9481.patch
>
>
> Given a table FOO(a int, b int, c int), ANSI SQL supports insert into 
> FOO(c,b) select x,y from T.  The expectation is that 'x' is written to column 
> 'c' and 'y' is written column 'b' and 'a' is set to NULL, assuming column 'a' 
> is NULLABLE.
> Hive does not support this.  In Hive one has to ensure that the data 
> producing statement has a schema that matches target table schema.
> Since Hive doesn't support DEFAULT value for columns in CREATE TABLE, when 
> target schema is explicitly provided, missing columns will be set to NULL if 
> they are NULLABLE, otherwise an error will be raised.
> If/when DEFAULT clause is supported, this can be enhanced to set default 
> value rather than NULL.
> Thus, given {noformat}
> create table source (a int, b int);
> create table target (x int, y int, z int);
> create table target2 (x int, y int, z int);
> {noformat}
> {noformat}insert into target(y,z) select * from source;{noformat}
> will mean 
> {noformat}insert into target select null as x, a, b from source;{noformat}
> and 
> {noformat}insert into target(z,y) select * from source;{noformat}
> will meant 
> {noformat}insert into target select null as x, b, a from source;{noformat}
> Also,
> {noformat}
> from source insert into target(y,z) select null as x, * insert into 
> target2(y,z) select null as x, source.*;
> {noformat}
> and for partitioned tables, given
> {noformat}
> Given:
> CREATE TABLE pageviews (userid VARCHAR(64), link STRING, "from" STRING)
>   PARTITIONED BY (datestamp STRING) CLUSTERED BY (userid) INTO 256 BUCKETS 
> STORED AS ORC;
> INSERT INTO TABLE pageviews PARTITION (datestamp = '2014-09-23')(userid,link) 
>  VALUES ('jsmith', 'mail.com');
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HIVE-9481) allow column list specification in INSERT statement

Reply via email to