[Impala-ASF-CR] IMPALA-10222: CREATE TABLE AS SELECT for Iceberg tables
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17130 ) Change subject: IMPALA-10222: CREATE TABLE AS SELECT for Iceberg tables .. Patch Set 4: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/17130 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I81d2084e401b9fa74d5ad161b51fd3e2aa3fcc67 Gerrit-Change-Number: 17130 Gerrit-PatchSet: 4 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Reviewer: wangsheng Gerrit-Comment-Date: Fri, 12 Mar 2021 19:28:17 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10222: CREATE TABLE AS SELECT for Iceberg tables
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/17130 ) Change subject: IMPALA-10222: CREATE TABLE AS SELECT for Iceberg tables .. IMPALA-10222: CREATE TABLE AS SELECT for Iceberg tables This patch adds support for CREATE TABLE AS SELECT statements for Iceberg tables. CTAS statements work like the following in Impala: 1. Analysis of the whole CTAS statement 2. Divide CTAS to CREATE stmt and INSERT stmt 3. Create temporary in-memory target table from the CREATE stmt 4. Analyse the INSERT statement by using the temporary target table 5. If everything is OK so far, create the target table 6. Execute the INSERT query For Iceberg tables the non-trivial thing was to create the temporary target table without actually creating it via Iceberg API. I've created a new class 'IcebergCtasTarget' that mimics an FeIceberg table. It can be used with catalog V1 and V2 as well. Testing * e2e CTAS tests in iceberg-ctas.test * SHOW CREATE TABLE stmts in show-create-table.test Change-Id: I81d2084e401b9fa74d5ad161b51fd3e2aa3fcc67 Reviewed-on: http://gerrit.cloudera.org:8080/17130 Reviewed-by: Impala Public Jenkins Tested-by: Impala Public Jenkins --- M fe/src/main/cup/sql-parser.cup M fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java M fe/src/main/java/org/apache/impala/analysis/InsertStmt.java A fe/src/main/java/org/apache/impala/catalog/CtasTargetTable.java M fe/src/main/java/org/apache/impala/catalog/FeIcebergTable.java M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java A fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergCtasTarget.java M fe/src/main/java/org/apache/impala/catalog/local/LocalDb.java M fe/src/main/java/org/apache/impala/catalog/local/LocalFsTable.java M fe/src/main/java/org/apache/impala/service/IcebergCatalogOpExecutor.java M fe/src/main/java/org/apache/impala/util/IcebergSchemaConverter.java M fe/src/main/java/org/apache/impala/util/IcebergUtil.java M fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java A testdata/workloads/functional-query/queries/QueryTest/iceberg-ctas.test M testdata/workloads/functional-query/queries/QueryTest/iceberg-negative.test M testdata/workloads/functional-query/queries/QueryTest/show-create-table.test M tests/metadata/test_show_create_table.py M tests/query_test/test_iceberg.py 18 files changed, 686 insertions(+), 46 deletions(-) Approvals: Impala Public Jenkins: Looks good to me, approved; Verified -- To view, visit http://gerrit.cloudera.org:8080/17130 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I81d2084e401b9fa74d5ad161b51fd3e2aa3fcc67 Gerrit-Change-Number: 17130 Gerrit-PatchSet: 5 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Reviewer: wangsheng
[Impala-ASF-CR] IMPALA-10222: CREATE TABLE AS SELECT for Iceberg tables
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17130 ) Change subject: IMPALA-10222: CREATE TABLE AS SELECT for Iceberg tables .. Patch Set 4: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/17130 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I81d2084e401b9fa74d5ad161b51fd3e2aa3fcc67 Gerrit-Change-Number: 17130 Gerrit-PatchSet: 4 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Reviewer: wangsheng Gerrit-Comment-Date: Fri, 12 Mar 2021 13:45:20 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10222: CREATE TABLE AS SELECT for Iceberg tables
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17130 ) Change subject: IMPALA-10222: CREATE TABLE AS SELECT for Iceberg tables .. Patch Set 4: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6960/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/17130 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I81d2084e401b9fa74d5ad161b51fd3e2aa3fcc67 Gerrit-Change-Number: 17130 Gerrit-PatchSet: 4 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Reviewer: wangsheng Gerrit-Comment-Date: Fri, 12 Mar 2021 13:45:21 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10222: CREATE TABLE AS SELECT for Iceberg tables
Gabor Kaszab has posted comments on this change. ( http://gerrit.cloudera.org:8080/17130 ) Change subject: IMPALA-10222: CREATE TABLE AS SELECT for Iceberg tables .. Patch Set 3: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/17130 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I81d2084e401b9fa74d5ad161b51fd3e2aa3fcc67 Gerrit-Change-Number: 17130 Gerrit-PatchSet: 3 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Reviewer: wangsheng Gerrit-Comment-Date: Fri, 12 Mar 2021 10:14:29 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10222: CREATE TABLE AS SELECT for Iceberg tables
wangsheng has posted comments on this change. ( http://gerrit.cloudera.org:8080/17130 ) Change subject: IMPALA-10222: CREATE TABLE AS SELECT for Iceberg tables .. Patch Set 3: Code-Review+1 Thanks for change, this patch LGTM! -- To view, visit http://gerrit.cloudera.org:8080/17130 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I81d2084e401b9fa74d5ad161b51fd3e2aa3fcc67 Gerrit-Change-Number: 17130 Gerrit-PatchSet: 3 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Reviewer: wangsheng Gerrit-Comment-Date: Fri, 12 Mar 2021 07:33:22 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10222: CREATE TABLE AS SELECT for Iceberg tables
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17130 ) Change subject: IMPALA-10222: CREATE TABLE AS SELECT for Iceberg tables .. Patch Set 3: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/8330/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/17130 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I81d2084e401b9fa74d5ad161b51fd3e2aa3fcc67 Gerrit-Change-Number: 17130 Gerrit-PatchSet: 3 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Reviewer: wangsheng Gerrit-Comment-Date: Wed, 10 Mar 2021 13:31:14 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10222: CREATE TABLE AS SELECT for Iceberg tables
Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/17130 ) Change subject: IMPALA-10222: CREATE TABLE AS SELECT for Iceberg tables .. Patch Set 2: (3 comments) Thanks for the comments! http://gerrit.cloudera.org:8080/#/c/17130/1/fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java File fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java: http://gerrit.cloudera.org:8080/#/c/17130/1/fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java@233 PS1, Line 233: Preconditions.checkState(createStmt_.getIcebergPartitionSpecs().size() == 1); > Maybe we can add a comment here for this check. Done http://gerrit.cloudera.org:8080/#/c/17130/2/fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergCtasTarget.java File fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergCtasTarget.java: http://gerrit.cloudera.org:8080/#/c/17130/2/fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergCtasTarget.java@243 PS2, Line 243: transfromToTHdfsTable > typo: transform Done http://gerrit.cloudera.org:8080/#/c/17130/2/fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergCtasTarget.java@262 PS2, Line 262: THdfsTable hdfsTable = new THdfsTable(localFsTable.getHdfsBaseDir(), : getColumnNames(), localFsTable.getNullPartitionKeyValue(), : FeFsTable.DEFAULT_NULL_COLUMN_VALUE, idToPartition, tPrototypePartition); : return hdfsTable; > Maybe we can return this new THdfsTable directly. Done -- To view, visit http://gerrit.cloudera.org:8080/17130 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I81d2084e401b9fa74d5ad161b51fd3e2aa3fcc67 Gerrit-Change-Number: 17130 Gerrit-PatchSet: 2 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Reviewer: wangsheng Gerrit-Comment-Date: Wed, 10 Mar 2021 13:11:01 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10222: CREATE TABLE AS SELECT for Iceberg tables
Hello Gabor Kaszab, wangsheng, Csaba Ringhofer, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/17130 to look at the new patch set (#3). Change subject: IMPALA-10222: CREATE TABLE AS SELECT for Iceberg tables .. IMPALA-10222: CREATE TABLE AS SELECT for Iceberg tables This patch adds support for CREATE TABLE AS SELECT statements for Iceberg tables. CTAS statements work like the following in Impala: 1. Analysis of the whole CTAS statement 2. Divide CTAS to CREATE stmt and INSERT stmt 3. Create temporary in-memory target table from the CREATE stmt 4. Analyse the INSERT statement by using the temporary target table 5. If everything is OK so far, create the target table 6. Execute the INSERT query For Iceberg tables the non-trivial thing was to create the temporary target table without actually creating it via Iceberg API. I've created a new class 'IcebergCtasTarget' that mimics an FeIceberg table. It can be used with catalog V1 and V2 as well. Testing * e2e CTAS tests in iceberg-ctas.test * SHOW CREATE TABLE stmts in show-create-table.test Change-Id: I81d2084e401b9fa74d5ad161b51fd3e2aa3fcc67 --- M fe/src/main/cup/sql-parser.cup M fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java M fe/src/main/java/org/apache/impala/analysis/InsertStmt.java A fe/src/main/java/org/apache/impala/catalog/CtasTargetTable.java M fe/src/main/java/org/apache/impala/catalog/FeIcebergTable.java M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java A fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergCtasTarget.java M fe/src/main/java/org/apache/impala/catalog/local/LocalDb.java M fe/src/main/java/org/apache/impala/catalog/local/LocalFsTable.java M fe/src/main/java/org/apache/impala/service/IcebergCatalogOpExecutor.java M fe/src/main/java/org/apache/impala/util/IcebergSchemaConverter.java M fe/src/main/java/org/apache/impala/util/IcebergUtil.java M fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java A testdata/workloads/functional-query/queries/QueryTest/iceberg-ctas.test M testdata/workloads/functional-query/queries/QueryTest/iceberg-negative.test M testdata/workloads/functional-query/queries/QueryTest/show-create-table.test M tests/metadata/test_show_create_table.py M tests/query_test/test_iceberg.py 18 files changed, 686 insertions(+), 46 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/30/17130/3 -- To view, visit http://gerrit.cloudera.org:8080/17130 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I81d2084e401b9fa74d5ad161b51fd3e2aa3fcc67 Gerrit-Change-Number: 17130 Gerrit-PatchSet: 3 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Reviewer: wangsheng
[Impala-ASF-CR] IMPALA-10222: CREATE TABLE AS SELECT for Iceberg tables
wangsheng has posted comments on this change. ( http://gerrit.cloudera.org:8080/17130 ) Change subject: IMPALA-10222: CREATE TABLE AS SELECT for Iceberg tables .. Patch Set 2: (3 comments) Sorry for late reply. Thanks for new feature, only few nits. http://gerrit.cloudera.org:8080/#/c/17130/1/fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java File fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java: http://gerrit.cloudera.org:8080/#/c/17130/1/fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java@233 PS1, Line 233: Preconditions.checkState(createStmt_.getIcebergPartitionSpecs().size() == 1); Maybe we can add a comment here for this check. http://gerrit.cloudera.org:8080/#/c/17130/2/fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergCtasTarget.java File fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergCtasTarget.java: http://gerrit.cloudera.org:8080/#/c/17130/2/fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergCtasTarget.java@243 PS2, Line 243: transfromToTHdfsTable typo: transform http://gerrit.cloudera.org:8080/#/c/17130/2/fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergCtasTarget.java@262 PS2, Line 262: THdfsTable hdfsTable = new THdfsTable(localFsTable.getHdfsBaseDir(), : getColumnNames(), localFsTable.getNullPartitionKeyValue(), : FeFsTable.DEFAULT_NULL_COLUMN_VALUE, idToPartition, tPrototypePartition); : return hdfsTable; Maybe we can return this new THdfsTable directly. -- To view, visit http://gerrit.cloudera.org:8080/17130 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I81d2084e401b9fa74d5ad161b51fd3e2aa3fcc67 Gerrit-Change-Number: 17130 Gerrit-PatchSet: 2 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Reviewer: wangsheng Gerrit-Comment-Date: Wed, 10 Mar 2021 06:39:25 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10222: CREATE TABLE AS SELECT for Iceberg tables
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17130 ) Change subject: IMPALA-10222: CREATE TABLE AS SELECT for Iceberg tables .. Patch Set 2: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/8320/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/17130 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I81d2084e401b9fa74d5ad161b51fd3e2aa3fcc67 Gerrit-Change-Number: 17130 Gerrit-PatchSet: 2 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Reviewer: wangsheng Gerrit-Comment-Date: Tue, 09 Mar 2021 18:45:23 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10222: CREATE TABLE AS SELECT for Iceberg tables
Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/17130 ) Change subject: IMPALA-10222: CREATE TABLE AS SELECT for Iceberg tables .. Patch Set 1: (7 comments) Thanks for the comments! http://gerrit.cloudera.org:8080/#/c/17130/1/be/src/exec/hdfs-table-sink.cc File be/src/exec/hdfs-table-sink.cc: http://gerrit.cloudera.org:8080/#/c/17130/1/be/src/exec/hdfs-table-sink.cc@545 PS1, Line 545: new HdfsParquetTableWriter( > I'm a bit confused here. This is for Iceberg with Parquet as file format, b This is only the write side. But now that I'm setting the file format for the prototype partition in IcebergCtasTarget this line is no longer needed. http://gerrit.cloudera.org:8080/#/c/17130/1/fe/src/main/java/org/apache/impala/analysis/InsertStmt.java File fe/src/main/java/org/apache/impala/analysis/InsertStmt.java: http://gerrit.cloudera.org:8080/#/c/17130/1/fe/src/main/java/org/apache/impala/analysis/InsertStmt.java@502 PS1, Line 502: IcebergPartitionSpec partSpec = ((FeIcebergTable)table_).getDefaultPartitionSpec(); > line too long (93 > 90) Done http://gerrit.cloudera.org:8080/#/c/17130/1/fe/src/main/java/org/apache/impala/analysis/InsertStmt.java@504 PS1, Line 504: throw new AnalysisException("PARTITION clause is only valid for INSERT into " + > line too long (91 > 90) Done http://gerrit.cloudera.org:8080/#/c/17130/1/fe/src/main/java/org/apache/impala/analysis/InsertStmt.java@514 PS1, Line 514: if (isStaticPartitioning) { : throw new AnalysisException("Static partitioning is not supported for " + : "Iceberg tables."); : } > You can move this check inside the for loop above. Done http://gerrit.cloudera.org:8080/#/c/17130/1/fe/src/main/java/org/apache/impala/catalog/CtasTargetTable.java File fe/src/main/java/org/apache/impala/catalog/CtasTargetTable.java: http://gerrit.cloudera.org:8080/#/c/17130/1/fe/src/main/java/org/apache/impala/catalog/CtasTargetTable.java@48 PS1, Line 48: // colsByPos[i] refers to the ith column in the table. The first numClusteringCols are > nit: indentation Done http://gerrit.cloudera.org:8080/#/c/17130/1/fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergCtasTarget.java File fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergCtasTarget.java: http://gerrit.cloudera.org:8080/#/c/17130/1/fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergCtasTarget.java@74 PS1, Line 74: private final FeDb db_; : private final org.apache.hadoop.hive.metastore.api.Table msTable_; > CtasTargetTble already has members with the same name, abs as I see they ge Thanks, removed them from this class. http://gerrit.cloudera.org:8080/#/c/17130/1/fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergCtasTarget.java@222 PS1, Line 222: public void addColumn(Column col) { > Can't you use IcebergColumn type for the param? You could get rid of the DC Done -- To view, visit http://gerrit.cloudera.org:8080/17130 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I81d2084e401b9fa74d5ad161b51fd3e2aa3fcc67 Gerrit-Change-Number: 17130 Gerrit-PatchSet: 1 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Reviewer: wangsheng Gerrit-Comment-Date: Tue, 09 Mar 2021 18:27:53 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10222: CREATE TABLE AS SELECT for Iceberg tables
Hello Gabor Kaszab, wangsheng, Csaba Ringhofer, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/17130 to look at the new patch set (#2). Change subject: IMPALA-10222: CREATE TABLE AS SELECT for Iceberg tables .. IMPALA-10222: CREATE TABLE AS SELECT for Iceberg tables This patch adds support for CREATE TABLE AS SELECT statements for Iceberg tables. CTAS statements work like the following in Impala: 1. Analysis of the whole CTAS statement 2. Divide CTAS to CREATE stmt and INSERT stmt 3. Create temporary in-memory target table from the CREATE stmt 4. Analyse the INSERT statement by using the temporary target table 5. If everything is OK so far, create the target table 6. Execute the INSERT query For Iceberg tables the non-trivial thing was to create the temporary target table without actually creating it via Iceberg API. I've created a new class 'IcebergCtasTarget' that mimics an FeIceberg table. It can be used with catalog V1 and V2 as well. Testing * e2e CTAS tests in iceberg-ctas.test * SHOW CREATE TABLE stmts in show-create-table.test Change-Id: I81d2084e401b9fa74d5ad161b51fd3e2aa3fcc67 --- M fe/src/main/cup/sql-parser.cup M fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java M fe/src/main/java/org/apache/impala/analysis/InsertStmt.java A fe/src/main/java/org/apache/impala/catalog/CtasTargetTable.java M fe/src/main/java/org/apache/impala/catalog/FeIcebergTable.java M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java A fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergCtasTarget.java M fe/src/main/java/org/apache/impala/catalog/local/LocalDb.java M fe/src/main/java/org/apache/impala/catalog/local/LocalFsTable.java M fe/src/main/java/org/apache/impala/service/IcebergCatalogOpExecutor.java M fe/src/main/java/org/apache/impala/util/IcebergSchemaConverter.java M fe/src/main/java/org/apache/impala/util/IcebergUtil.java M fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java A testdata/workloads/functional-query/queries/QueryTest/iceberg-ctas.test M testdata/workloads/functional-query/queries/QueryTest/iceberg-negative.test M testdata/workloads/functional-query/queries/QueryTest/show-create-table.test M tests/metadata/test_show_create_table.py M tests/query_test/test_iceberg.py 18 files changed, 685 insertions(+), 46 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/30/17130/2 -- To view, visit http://gerrit.cloudera.org:8080/17130 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I81d2084e401b9fa74d5ad161b51fd3e2aa3fcc67 Gerrit-Change-Number: 17130 Gerrit-PatchSet: 2 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: wangsheng
[Impala-ASF-CR] IMPALA-10222: CREATE TABLE AS SELECT for Iceberg tables
Gabor Kaszab has posted comments on this change. ( http://gerrit.cloudera.org:8080/17130 ) Change subject: IMPALA-10222: CREATE TABLE AS SELECT for Iceberg tables .. Patch Set 1: (5 comments) Nice work, Zoltan! I left some comments, nothing too serious. http://gerrit.cloudera.org:8080/#/c/17130/1/be/src/exec/hdfs-table-sink.cc File be/src/exec/hdfs-table-sink.cc: http://gerrit.cloudera.org:8080/#/c/17130/1/be/src/exec/hdfs-table-sink.cc@545 PS1, Line 545: new HdfsParquetTableWriter( I'm a bit confused here. This is for Iceberg with Parquet as file format, but don't we support other formats such as ORC? http://gerrit.cloudera.org:8080/#/c/17130/1/fe/src/main/java/org/apache/impala/analysis/InsertStmt.java File fe/src/main/java/org/apache/impala/analysis/InsertStmt.java: http://gerrit.cloudera.org:8080/#/c/17130/1/fe/src/main/java/org/apache/impala/analysis/InsertStmt.java@514 PS1, Line 514: if (isStaticPartitioning) { : throw new AnalysisException("Static partitioning is not supported for " + : "Iceberg tables."); : } You can move this check inside the for loop above. http://gerrit.cloudera.org:8080/#/c/17130/1/fe/src/main/java/org/apache/impala/catalog/CtasTargetTable.java File fe/src/main/java/org/apache/impala/catalog/CtasTargetTable.java: http://gerrit.cloudera.org:8080/#/c/17130/1/fe/src/main/java/org/apache/impala/catalog/CtasTargetTable.java@48 PS1, Line 48: // colsByPos[i] refers to the ith column in the table. The first numClusteringCols are nit: indentation http://gerrit.cloudera.org:8080/#/c/17130/1/fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergCtasTarget.java File fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergCtasTarget.java: http://gerrit.cloudera.org:8080/#/c/17130/1/fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergCtasTarget.java@74 PS1, Line 74: private final FeDb db_; : private final org.apache.hadoop.hive.metastore.api.Table msTable_; CtasTargetTble already has members with the same name, abs as I see they get the same value as well. Are these needed here? http://gerrit.cloudera.org:8080/#/c/17130/1/fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergCtasTarget.java@222 PS1, Line 222: public void addColumn(Column col) { Can't you use IcebergColumn type for the param? You could get rid of the DCHECK in L223 then. -- To view, visit http://gerrit.cloudera.org:8080/17130 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I81d2084e401b9fa74d5ad161b51fd3e2aa3fcc67 Gerrit-Change-Number: 17130 Gerrit-PatchSet: 1 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: wangsheng Gerrit-Comment-Date: Tue, 02 Mar 2021 15:19:43 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10222: CREATE TABLE AS SELECT for Iceberg tables
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17130 ) Change subject: IMPALA-10222: CREATE TABLE AS SELECT for Iceberg tables .. Patch Set 1: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/8253/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/17130 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I81d2084e401b9fa74d5ad161b51fd3e2aa3fcc67 Gerrit-Change-Number: 17130 Gerrit-PatchSet: 1 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: wangsheng Gerrit-Comment-Date: Fri, 26 Feb 2021 17:26:52 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10222: CREATE TABLE AS SELECT for Iceberg tables
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17130 ) Change subject: IMPALA-10222: CREATE TABLE AS SELECT for Iceberg tables .. Patch Set 1: (2 comments) http://gerrit.cloudera.org:8080/#/c/17130/1/fe/src/main/java/org/apache/impala/analysis/InsertStmt.java File fe/src/main/java/org/apache/impala/analysis/InsertStmt.java: http://gerrit.cloudera.org:8080/#/c/17130/1/fe/src/main/java/org/apache/impala/analysis/InsertStmt.java@502 PS1, Line 502: IcebergPartitionSpec partSpec = ((FeIcebergTable)table_).getDefaultPartitionSpec(); line too long (93 > 90) http://gerrit.cloudera.org:8080/#/c/17130/1/fe/src/main/java/org/apache/impala/analysis/InsertStmt.java@504 PS1, Line 504: throw new AnalysisException("PARTITION clause is only valid for INSERT into " + line too long (91 > 90) -- To view, visit http://gerrit.cloudera.org:8080/17130 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I81d2084e401b9fa74d5ad161b51fd3e2aa3fcc67 Gerrit-Change-Number: 17130 Gerrit-PatchSet: 1 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: wangsheng Gerrit-Comment-Date: Fri, 26 Feb 2021 17:08:17 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10222: CREATE TABLE AS SELECT for Iceberg tables
Zoltan Borok-Nagy has uploaded this change for review. ( http://gerrit.cloudera.org:8080/17130 Change subject: IMPALA-10222: CREATE TABLE AS SELECT for Iceberg tables .. IMPALA-10222: CREATE TABLE AS SELECT for Iceberg tables This patch adds support for CREATE TABLE AS SELECT statements for Iceberg tables. CTAS statements work like the following in Impala: 1. Analysis of the whole CTAS statement 2. Divide CTAS to CREATE stmt and INSERT stmt 3. Create temporary in-memory target table from the CREATE stmt 4. Analyse the INSERT statement by using the temporary target table 5. If everything is OK so far, create the target table 6. Execute the INSERT query For Iceberg tables the non-trivial thing was to create the temporary target table without actually creating it via Iceberg API. I've created a new class 'IcebergCtasTarget' that mimics an FeIceberg table. It can be used with catalog V1 and V2 as well. Testing * e2e CTAS tests in iceberg-ctas.test * SHOW CREATE TABLE stmts in show-create-table.test Change-Id: I81d2084e401b9fa74d5ad161b51fd3e2aa3fcc67 --- M be/src/exec/hdfs-table-sink.cc M fe/src/main/cup/sql-parser.cup M fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java M fe/src/main/java/org/apache/impala/analysis/InsertStmt.java A fe/src/main/java/org/apache/impala/catalog/CtasTargetTable.java M fe/src/main/java/org/apache/impala/catalog/FeIcebergTable.java M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java A fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergCtasTarget.java M fe/src/main/java/org/apache/impala/catalog/local/LocalDb.java M fe/src/main/java/org/apache/impala/catalog/local/LocalFsTable.java M fe/src/main/java/org/apache/impala/service/IcebergCatalogOpExecutor.java M fe/src/main/java/org/apache/impala/util/IcebergSchemaConverter.java M fe/src/main/java/org/apache/impala/util/IcebergUtil.java A testdata/workloads/functional-query/queries/QueryTest/iceberg-ctas.test M testdata/workloads/functional-query/queries/QueryTest/iceberg-negative.test M testdata/workloads/functional-query/queries/QueryTest/show-create-table.test M tests/metadata/test_show_create_table.py M tests/query_test/test_iceberg.py 18 files changed, 698 insertions(+), 44 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/30/17130/1 -- To view, visit http://gerrit.cloudera.org:8080/17130 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I81d2084e401b9fa74d5ad161b51fd3e2aa3fcc67 Gerrit-Change-Number: 17130 Gerrit-PatchSet: 1 Gerrit-Owner: Zoltan Borok-Nagy