lipeng...@apache.org has uploaded a new patch set (#5). ( http://gerrit.cloudera.org:8080/19397 )
Change subject: IMPALA-11013 (part 1): Support 'MIGRATE TABLE' for external Hdfs tables
......................................................................

IMPALA-11013 (part 1): Support 'MIGRATE TABLE' for external Hdfs tables

This patch implements the migration from Hdfs tables to Iceberg
tables. The target Iceberg table inherits the location of the
original Hdfs table. For Hdfs tables with many partitions, the
'metadata.generator.threads' property can be used to increase the
number of threads that build the Iceberg metadata from the data
files in the Hdfs table.

The migration can be run with the following statements:
- MIGRATE TABLE <hdfs_tbl> TO ICEBERG;
- MIGRATE TABLE <hdfs_tbl> TO ICEBERG TBLPROPERTIES(
  'iceberg.catalog' = 'hadoop.catalog');
- MIGRATE TABLE <hdfs_tbl> TO ICEBERG TBLPROPERTIES(
  'metadata.generator.threads' = '10');

Hdfs tables must meet these requirements:
- external tables
- not transactional tables
- InputFormat must be one of PARQUET, ORC, or AVRO

Process of migration:
- Child query 1: Ensure that the Hdfs table is a pure external table.
- Child query 2: Rename the Hdfs table to a temporary table name.
- Create an external Iceberg table through the Iceberg API using the
  data of the Hdfs table.
- Child query 3: Create an Iceberg table (Hadoop Catalog) that
  inherits the Hdfs table location.
- Child query 4: Drop the temporary Hdfs table.
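
The statement forms and table requirements above can be sketched in a
few lines of Python. This is only an illustration under stated
assumptions: build_migrate_stmt and check_migratable are hypothetical
helpers that are not part of this patch, and the real checks are done
during analysis in the frontend.

```python
# Hypothetical helpers for illustration only; not code from the patch.

# File formats the commit message lists as eligible for migration.
SUPPORTED_FORMATS = {"PARQUET", "ORC", "AVRO"}


def check_migratable(is_external, is_transactional, file_format):
    """Return the reasons a table cannot be migrated (empty list if none)."""
    problems = []
    if not is_external:
        problems.append("table must be external")
    if is_transactional:
        problems.append("table must not be transactional")
    if file_format.upper() not in SUPPORTED_FORMATS:
        problems.append("unsupported file format: " + file_format)
    return problems


def build_migrate_stmt(table, tblproperties=None):
    """Build a MIGRATE TABLE statement, optionally with TBLPROPERTIES."""
    stmt = "MIGRATE TABLE {} TO ICEBERG".format(table)
    if tblproperties:
        props = ", ".join(
            "'{}' = '{}'".format(key, value)
            for key, value in tblproperties.items()
        )
        stmt += " TBLPROPERTIES({})".format(props)
    return stmt + ";"


if __name__ == "__main__":
    # Example: request ten metadata generator threads for a large table.
    print(build_migrate_stmt("db.tbl", {"metadata.generator.threads": "10"}))
    print(check_migratable(True, False, "parquet"))
```
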

Testing:
- Add e2e tests
- Add fe UTs

Change-Id: I91e6a9cfe099c263f17b5506d6db459b79ad31a5
---
M be/src/service/client-request-state.cc
M be/src/service/client-request-state.h
M be/src/service/frontend.cc
M be/src/service/frontend.h
M common/thrift/Frontend.thrift
M common/thrift/Types.thrift
M fe/src/main/cup/sql-parser.cup
M fe/src/main/java/org/apache/impala/analysis/AlterTableSetTblProperties.java
M fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java
M fe/src/main/java/org/apache/impala/analysis/LoadDataStmt.java
A fe/src/main/java/org/apache/impala/analysis/MigrateStmt.java
M fe/src/main/java/org/apache/impala/analysis/QueryStringBuilder.java
M fe/src/main/java/org/apache/impala/catalog/IcebergTable.java
M fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergCatalogs.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
M fe/src/main/java/org/apache/impala/service/JniFrontend.java
M fe/src/main/java/org/apache/impala/util/IcebergSchemaConverter.java
M fe/src/main/java/org/apache/impala/util/IcebergUtil.java
A fe/src/main/java/org/apache/impala/util/MigrateTableUtil.java
M fe/src/main/jflex/sql-scanner.flex
M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java
A testdata/workloads/functional-query/queries/QueryTest/iceberg-migrate-from-external-hdfs-tables.test
M tests/query_test/test_iceberg.py
23 files changed, 1,005 insertions(+), 55 deletions(-)

  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/97/19397/5
--
To view, visit http://gerrit.cloudera.org:8080/19397
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I91e6a9cfe099c263f17b5506d6db459b79ad31a5
Gerrit-Change-Number: 19397
Gerrit-PatchSet: 5
Gerrit-Owner: Anonymous Coward <lipeng...@apache.org>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Tamas Mate <tma...@apache.org>
Gerrit-Reviewer: Zoltan Borok-Nagy <borokna...@cloudera.com>