[
https://issues.apache.org/jira/browse/IMPALA-2005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17119928#comment-17119928
]
Tim Armstrong commented on IMPALA-2005:
---------------------------------------
Updated the title. I think the original problem statement and implementation
were flawed. If you look at Hive's behaviour, the table is not visible until
the query has finished executing, so there is nothing to clean up after
creating it.
E.g. try this in hive:
{noformat}
create table test_ctas as select * from alltypestiny where assert_true(id = 1)
{noformat}
I'm not sure if it's strictly transactional.
As far as the original patch goes, it acted too late in the query lifecycle to
be workable - the table should have been deleted immediately after the error in
query execution, not when the client closed the query. With the current code I
think this would be a lot cleaner to implement. Probably
CatalogOpExecutor.updateCatalog() or nearby is the right place to do this.
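The Hive-style behaviour described above can be sketched in a few lines. This is a simplified illustration, not Impala's actual catalog API - the `Catalog` class, its methods, and the table names here are hypothetical. The point is that the table is registered (made visible) only after the CTAS insert succeeds, so a failed insert leaves nothing to clean up:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical, simplified catalog used only to illustrate the idea of
// deferring table registration until the CTAS insert succeeds.
class Catalog {
    private final Map<String, String> tables = new HashMap<>();

    boolean exists(String name) { return tables.containsKey(name); }

    // CTAS: run the insert first; register the table only on success.
    void createTableAsSelect(String name, Supplier<String> insert) {
        String data = insert.get();  // may throw if query execution fails
        tables.put(name, data);      // table becomes visible only now
    }
}

public class CtasSketch {
    public static void main(String[] args) {
        Catalog catalog = new Catalog();
        try {
            catalog.createTableAsSelect("test_ctas", () -> {
                throw new RuntimeException("insert failed");
            });
        } catch (RuntimeException e) {
            // Failure during execution: no table was ever created,
            // so there is nothing to drop or delete.
        }
        System.out.println(catalog.exists("test_ctas")); // false
        catalog.createTableAsSelect("ok_table", () -> "rows");
        System.out.println(catalog.exists("ok_table"));  // true
    }
}
```

The alternative (create first, delete on failure) requires the cleanup path that the original patch placed too late in the query lifecycle.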
> CTAS should not create table until insert succeeds
> --------------------------------------------------
>
> Key: IMPALA-2005
> URL: https://issues.apache.org/jira/browse/IMPALA-2005
> Project: IMPALA
> Issue Type: Bug
> Components: Catalog
> Affects Versions: Impala 1.4, Impala 1.4.1, Impala 2.0, Impala 2.0.1,
> Impala 2.1, Impala 2.1.1, Impala 2.2
> Reporter: Tomoaki Yano
> Priority: Major
> Labels: correctness, downgraded, ramp-up
>
> I found that a CREATE TABLE like the following fails
> (parquet_file_size=10g is just a sample setting used to make the CREATE
> TABLE fail):
> sudo -u impala impala-shell -q "set parquet_file_size=10g; create table
> 59915_test stored as parquet as select * from sample_08;"
> But the table and its folder are still created, as shown below.
> A failed DDL should clean up the table and its directory.
> Terminal Log
> --
> [root@nightly-2 ~]# sudo -u impala impala-shell -q "set
> parquet_file_size=10g; create table 59915_test stored as parquet as select *
> from sample_08;"
> Starting Impala Shell without Kerberos authentication
> Connected to nightly-2.ent.cloudera.com:21000
> Server version: impalad version 2.3.0-cdh5-INTERNAL RELEASE (build
> 03b1ea79d41c49617be46313ffe0f5f05b1b54f5)
> PARQUET_FILE_SIZE set to 10g
> Query: create table 59915_test stored as parquet as select * from sample_08
> WARNINGS: Failed to open HDFS file for writing:
> hdfs://nightly-1.ent.cloudera.com:8020/user/hive/warehouse/59915_test/_impala_insert_staging/d945ce65feede9a4_355731418cd694b9//.d945ce65feede9a4-355731418cd694ba_1653921013_dir/d945ce65feede9a4-355731418cd694ba_1805782816_data.0.parq
> Error(255): Unknown error 255
> :
> [nightly-2.ent.cloudera.com:21000] > select * from 59915_test;
> Query: select * from 59915_test
> Fetched 0 row(s) in 2.04s
> --
> *Workaround*
> The workaround is to manually delete the extraneous table and the
> corresponding HDFS directory.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)