Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/16446 )
Change subject: IMPALA-10164: Supporting HadoopCatalog for Iceberg table ...................................................................... Patch Set 13: (2 comments) http://gerrit.cloudera.org:8080/#/c/16446/13/testdata/workloads/functional-query/queries/QueryTest/iceberg-negative.test File testdata/workloads/functional-query/queries/QueryTest/iceberg-negative.test: http://gerrit.cloudera.org:8080/#/c/16446/13/testdata/workloads/functional-query/queries/QueryTest/iceberg-negative.test@80 PS13, Line 80: row_regex:.*CAUSED BY: NoSuchTableException: Table does not exist: $DATABASE.iceberg_test7* Since the test only drops 'iceberg_test6' I think the expected behavior is that 'iceberg_test7' still exists. I'm OK with doing this in a separate patch and Jira, but in that case we should prohibit dropping Iceberg tables in this patch because this behavior is unexpected and can cause potential data loss. We should probably use the dropTable (https://github.com/apache/iceberg/blob/master/core/src/main/java/org/apache/iceberg/hadoop/HadoopCatalog.java#L183) method to drop the table from the Iceberg catalog, then drop the table from HMS, e.g. like here: https://github.com/apache/impala/blob/master/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java#L1918-L1920 But for IcebergTables we need to set the param 'purge' to false (Because the data files are already been purged by Iceberg. Also, the location parameter might points to the catalog location instead of the table location, and we don't want to remove the whole catalog). So I think in the end you might want to add a dropTable() method to IcebergCatalogOpExecutor that drops Iceberg tables properly, both from Iceberg and HMS. http://gerrit.cloudera.org:8080/#/c/16446/13/testdata/workloads/functional-query/queries/QueryTest/iceberg-query.test File testdata/workloads/functional-query/queries/QueryTest/iceberg-query.test: http://gerrit.cloudera.org:8080/#/c/16446/13/testdata/workloads/functional-query/queries/QueryTest/iceberg-query.test@305 PS13, Line 305: 'Location: ','$NAMENODE/test-warehouse/iceberg_test/hadoop_catalog/hadoop_catalog_test','NULL' Can we show the actual table location (https://github.com/apache/iceberg/blob/ada047be14205a88396e0dd4444f8b0650a9f44c/api/src/main/java/org/apache/iceberg/Table.java#L94), not the HadoopCatalog location here? So I think in the end we want the following behavior: * DROP TABLE only drops the referred table. If the table is truly external (so it doesn't have 'external.table.purge' property), we only drop the table from HMS, keeping the data files. * SHOW CREATE TABLE shows the CREATE TABLE statement that created the table, so for location it shows the hadoop catalog location * DESCRIBE FORMATTED shows the actual table location -- To view, visit http://gerrit.cloudera.org:8080/16446 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic1893c50a633ca22d4bca6726c9937b026f5d5ef Gerrit-Change-Number: 16446 Gerrit-PatchSet: 13 Gerrit-Owner: wangsheng <sky...@163.com> Gerrit-Reviewer: Gabor Kaszab <gaborkas...@cloudera.com> Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com> Gerrit-Reviewer: Zoltan Borok-Nagy <borokna...@cloudera.com> Gerrit-Reviewer: wangsheng <sky...@163.com> Gerrit-Comment-Date: Thu, 24 Sep 2020 17:24:27 +0000 Gerrit-HasComments: Yes