CTAS with PARTITION BY fails when selecting from view

John Russell Fri, 26 Jan 2018 17:06:57 -0800

Hi all,

I wanted to experiment with a few different partitioning layouts, so I ran CTAS 
statements with PARTITIONED BY clauses against various tables and views in my 
schema.  One CTAS failed with a “duplicate column” error:


[localhost:21000] > desc report_categories;
+----------------+-----------+---------+
| name           | type      | comment |
+----------------+-----------+---------+
| ip             | string    |         |
| f2             | string    |         |
| f3             | string    |         |
| the_date       | timestamp |         |
| method         | string    |         |
| path           | string    |         |
| status         | smallint  |         |
| size           | bigint    |         |
| referer        | string    |         |
| agent          | string    |         |
| is_search_term | boolean   |         |
| search_term    | string    |         |
| is_doc_page    | boolean   |         |
| doc_page       | string    |         |
| category       | string    |         |
| version        | string    |         |
| format         | string    |         |
| yy             | int       |         |
| mm             | int       |         |
| dd             | int       |         |
+----------------+-----------+---------+
[localhost:21000] > create table report_categories_by_status_format_yy_mm
partitioned by (`status`, `format`, yy, mm) stored as parquet
as
select ip, f2, f3, the_date, method, path, size, referer, agent, is_search_term,
search_term, is_doc_page, doc_page, category, version, dd, `status`, `format`, 
yy, mm
from report_categories;
ERROR: AnalysisException: Duplicate column name: status

The interesting thing is that REPORT_CATEGORIES is a view that selects a bunch 
of columns (including one named `STATUS`) via SELECT * from an underlying 
table.  If I make a real table with the same column definitions as the 
REPORT_CATEGORIES view, then the above CTAS works when selecting from the real 
table.
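
A minimal sketch of that workaround, for anyone who wants to reproduce it. The 
table name report_categories_tbl is hypothetical, and the first statement 
assumes that a plain, unpartitioned CTAS from the view succeeds; neither detail 
comes from the session above.

```sql
-- Step 1 (assumption): materialize the view as a real table with the same
-- columns. report_categories_tbl is an illustrative name only.
create table report_categories_tbl stored as parquet
as
select * from report_categories;

-- Step 2: the same partitioned CTAS as above, reading from the real table
-- instead of the view. The partition key columns stay last in the select list.
create table report_categories_by_status_format_yy_mm
partitioned by (`status`, `format`, yy, mm) stored as parquet
as
select ip, f2, f3, the_date, method, path, size, referer, agent, is_search_term,
search_term, is_doc_page, doc_page, category, version, dd, `status`, `format`,
yy, mm
from report_categories_tbl;
```

The extra copy costs storage and time, so this is only a way to get unblocked, 
not an explanation of why the view triggers the error.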

Is it to be expected that the use of a column X in a view definition would 
prevent a CTAS from creating a partitioned table with X as one of the partition 
key columns?  (This is on Impala 2.7 BTW.)

Thanks,
John
