Zoltan Borok-Nagy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16446 )

Change subject: IMPALA-10164: Supporting HadoopCatalog for Iceberg table
......................................................................


Patch Set 13:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/16446/13/testdata/workloads/functional-query/queries/QueryTest/iceberg-negative.test
File 
testdata/workloads/functional-query/queries/QueryTest/iceberg-negative.test:

http://gerrit.cloudera.org:8080/#/c/16446/13/testdata/workloads/functional-query/queries/QueryTest/iceberg-negative.test@80
PS13, Line 80: row_regex:.*CAUSED BY: NoSuchTableException: Table does not 
exist: $DATABASE.iceberg_test7*
Since the test only drops 'iceberg_test6' I think the expected behavior is that 
'iceberg_test7' still exists.

I'm OK with doing this in a separate patch and Jira, but in that case we should 
prohibit dropping Iceberg tables in this patch because this behavior is 
unexpected and can cause potential data loss.

We should probably use the dropTable 
(https://github.com/apache/iceberg/blob/master/core/src/main/java/org/apache/iceberg/hadoop/HadoopCatalog.java#L183)
 method to drop the table from the Iceberg catalog, then drop the table from 
HMS, e.g. like here: 
https://github.com/apache/impala/blob/master/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java#L1918-L1920

But for IcebergTables we need to set the param 'purge' to false (Because the 
data files are already been purged by Iceberg. Also, the location parameter 
might points to the catalog location instead of the table location, and we 
don't want to remove the whole catalog).

So I think in the end you might want to add a dropTable() method to 
IcebergCatalogOpExecutor that drops Iceberg tables properly, both from Iceberg 
and HMS.


http://gerrit.cloudera.org:8080/#/c/16446/13/testdata/workloads/functional-query/queries/QueryTest/iceberg-query.test
File testdata/workloads/functional-query/queries/QueryTest/iceberg-query.test:

http://gerrit.cloudera.org:8080/#/c/16446/13/testdata/workloads/functional-query/queries/QueryTest/iceberg-query.test@305
PS13, Line 305: 'Location:           
','$NAMENODE/test-warehouse/iceberg_test/hadoop_catalog/hadoop_catalog_test','NULL'
Can we show the actual table location 
(https://github.com/apache/iceberg/blob/ada047be14205a88396e0dd4444f8b0650a9f44c/api/src/main/java/org/apache/iceberg/Table.java#L94),
 not the HadoopCatalog location here?

So I think in the end we want the following behavior:

* DROP TABLE only drops the referred table. If the table is truly external (so 
it doesn't have 'external.table.purge' property), we only drop the table from 
HMS, keeping the data files.

* SHOW CREATE TABLE shows the CREATE TABLE statement that created the table, so 
for location it shows the hadoop catalog location

* DESCRIBE FORMATTED shows the actual table location



--
To view, visit http://gerrit.cloudera.org:8080/16446
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic1893c50a633ca22d4bca6726c9937b026f5d5ef
Gerrit-Change-Number: 16446
Gerrit-PatchSet: 13
Gerrit-Owner: wangsheng <sky...@163.com>
Gerrit-Reviewer: Gabor Kaszab <gaborkas...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <borokna...@cloudera.com>
Gerrit-Reviewer: wangsheng <sky...@163.com>
Gerrit-Comment-Date: Thu, 24 Sep 2020 17:24:27 +0000
Gerrit-HasComments: Yes

Reply via email to