[
https://issues.apache.org/jira/browse/HIVE-22371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17059292#comment-17059292
]
Sungwoo edited comment on HIVE-22371 at 3/14/20, 9:49 AM:
----------------------------------------------------------
A workaround is to explicitly specify table properties, e.g.,
{code:sql}
create table call_center
stored as orc
TBLPROPERTIES('transactional'='true', 'transactional_properties'='default')
as select * from SOURCE.call_center;
{code}
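To check whether the workaround took effect, something like the following may help. This is only a sketch: it assumes the CTAS above completed and that call_center is in the current database, and the exact output depends on the Hive build.
{code:sql}
-- Lists the table properties; 'transactional'='true' is expected to appear
-- if the TBLPROPERTIES clause was applied.
show tblproperties call_center;
-- The Location and Table Type fields show which warehouse directory backs the table.
describe formatted call_center;
-- Expected to be non-zero when the source table has rows and the data
-- landed in the table's own directory.
select count(*) from call_center;
{code}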
was (Author: glapark):
A workaround is to explicitly specify table properties, e.g.,
{code:sql}
create table call_center
stored as ${FILE}
TBLPROPERTIES('transactional'='true', 'transactional_properties'='default')
as select * from ${SOURCE}.call_center;
{code}
> CTAS not working with non-ACID managed tables
> ---------------------------------------------
>
> Key: HIVE-22371
> URL: https://issues.apache.org/jira/browse/HIVE-22371
> Project: Hive
> Issue Type: Bug
> Components: Query Planning
> Affects Versions: 4.0.0
> Reporter: Jaechang Kim
> Priority: Major
>
> I used Hive commit HIVE-21344 (f16509a5c9187f592c48c253ee001fc3a5e0d508) in
> the master branch, which was committed on 12 Oct.
> When I submitted the query below, it finished without any errors.
> {code:sql}
> create table call_center
> stored as orc
> as select * from tpcds_text_2.call_center;
> {code}
> However, "select count( * ) from call_center" returned 0, and the data in HDFS
> looks strange:
> * Two tables were created, one in the warehouse directory and another in the
> external warehouse directory.
> * Table `call_center` in the external warehouse is empty.
> {code:java}
> > hdfs dfs -du -h $WAREHOUSE_PATH
> 5.0 K 14.9 K $WAREHOUSE_PATH/call_center
> 0 0 $WAREHOUSE_PATH/tpcds_text_2.db
> > hdfs dfs -du -h $EXTERNAL_WAREHOUSE_PATH
> 2.1 G 2.1 G $EXTERNAL_WAREHOUSE_PATH/2
> 0 0 $EXTERNAL_WAREHOUSE_PATH/call_center
> {code}
> After a few hours of digging, I guess this bug was introduced in HIVE-22158,
> which creates every non-ACID managed table in the external warehouse
> directory by default. In the example above, call_center is intended as a
> managed table but is not explicitly specified as ACID. Hence, it should be
> created in the external warehouse directory.
> However, the table call_center created in the external warehouse directory is
> empty, while another non-empty table of the same name is created in the
> warehouse directory. This is because in the current implementation, the
> (buggy) compiled query plan proceeds as follows:
> 1. Write data to a temporary directory
> 2. Move the data to the warehouse directory ($WAREHOUSE_PATH/call_center)
> 3. Create a table using data in the warehouse directory
> Without the bug, step 2 would move the data to the external warehouse
> directory, and step 3 would create a table using the data in the external
> warehouse directory. The crux of the problem is that the query compiler
> checks only whether the query includes the keyword "external".
> In other words, the query compiler should also be made aware of the changes
> made in HIVE-22158 and be updated accordingly.