Thomas Tauber-Marshall created IMPALA-6710:
----------------------------------------------

             Summary: Docs around INSERT into partitioned tables are misleading
                 Key: IMPALA-6710
                 URL: https://issues.apache.org/jira/browse/IMPALA-6710
             Project: IMPALA
          Issue Type: Bug
          Components: Docs
    Affects Versions: Impala 2.12.0
            Reporter: Thomas Tauber-Marshall


Impala's INSERT statement has an optional "partition" clause where partition 
columns can be specified.

This clause must be used for static partitioning, i.e. where the partition 
value is specified after the column:
{noformat}
> insert into t1 partition(x=10, y='a') select c1 from some_other_table;
{noformat}

But it is not required for dynamic partitioning, e.g. the following inserts are 
equivalent:
{noformat}
> create table foo (c string) partitioned by (p int);
> insert into foo (p, c) values (0, 'c');
> insert into foo (c) partition(p) values ('c', 0);
> insert into foo partition(p) values ('c', 0);
{noformat}
and note:
- values are assigned to columns in the order the columns appear in the SQL, 
hence the order of 'c' and 0 being flipped between the first two examples
- when a partition clause is specified but the other columns are omitted, as 
in the third example, the other columns are treated as though they had all been 
specified before the partition clause in the SQL
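
To make the column mapping in the two notes above easier to see, here is a small sketch with distinct values. The table name foo2 and the values are made up for illustration, and the statements have not been re-run against a cluster. If the mapping works as described, all three inserts should add the same row (c = 'a', p = 7):
{noformat}
> create table foo2 (c string) partitioned by (p int);
-- column list names p first, so the values are given as (p, c)
> insert into foo2 (p, c) values (7, 'a');
-- column list (c) followed by partition(p), so the values are given as (c, p)
> insert into foo2 (c) partition(p) values ('a', 7);
-- no column list: the non-partition columns come first, then the partition column
> insert into foo2 partition(p) values ('a', 7);
-- check which column each value landed in
> select c, p from foo2;
{noformat}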

Confusingly, though, the partition columns are required to be mentioned in the 
query in some form, e.g.:
{noformat}
> insert into foo values ('c', 1);
{noformat}
would be valid for a non-partitioned table, so long as the number and types of 
its columns match the values clause, but can never be valid for a partitioned 
table.
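
A side-by-side sketch of that contrast, assuming two hypothetical tables t_plain and t_part that differ only in whether p is a partition column (names are illustrative only, and the statements have not been re-run):
{noformat}
> create table t_plain (c string, p int);
-- expected to be accepted: two values for a non-partitioned two-column table
> insert into t_plain values ('c', 1);
> create table t_part (c string) partitioned by (p int);
-- expected to be rejected, per the description above: p is not mentioned anywhere
> insert into t_part values ('c', 1);
-- expected to be accepted: p is mentioned in a column list or a partition clause
> insert into t_part (c, p) values ('c', 1);
> insert into t_part partition(p) values ('c', 1);
{noformat}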

The docs around this are not very clear:
http://impala.apache.org/docs/build/html/topics/impala_insert.html
and seem to indicate that partition columns must be specified in the 
"partition" clause, e.g. the sentence:
{noformat}
Inserting data into partitioned tables requires slightly different syntax that 
divides the partitioning columns from the others: 
{noformat}
and the examples that follow it.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
