Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/16143 )
Change subject: IMPALA-9741: Supported query icebreg table by impala ...................................................................... Patch Set 4: (10 comments) Thanks for working on this, it will be a really great addition to Impala! http://gerrit.cloudera.org:8080/#/c/16143/4//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/16143/4//COMMIT_MSG@7 PS4, Line 7: icebreg Iceberg http://gerrit.cloudera.org:8080/#/c/16143/4//COMMIT_MSG@26 PS4, Line 26: PARQUT PARQUET http://gerrit.cloudera.org:8080/#/c/16143/4//COMMIT_MSG@27 PS4, Line 27: Please add a high-level description about what this patch does. http://gerrit.cloudera.org:8080/#/c/16143/4/common/thrift/CatalogObjects.thrift File common/thrift/CatalogObjects.thrift: http://gerrit.cloudera.org:8080/#/c/16143/4/common/thrift/CatalogObjects.thrift@510 PS4, Line 510: source_cols_map please add some comment about the fields http://gerrit.cloudera.org:8080/#/c/16143/4/fe/src/main/java/org/apache/impala/analysis/ShowStatsStmt.java File fe/src/main/java/org/apache/impala/analysis/ShowStatsStmt.java: http://gerrit.cloudera.org:8080/#/c/16143/4/fe/src/main/java/org/apache/impala/analysis/ShowStatsStmt.java@86 PS4, Line 86: boolean flag = op_ == TShowStatsOp.PARTITIONS ? table_ instanceof FeIcebergTable ? : ((FeIcebergTable) table_).getPartitionColToSourceIdMap().isEmpty() : : table_.getNumClusteringCols() == 0 : false; nit: for readability, please use if statements instead of nested ternary operators http://gerrit.cloudera.org:8080/#/c/16143/4/fe/src/main/java/org/apache/impala/catalog/FeIcebergTable.java File fe/src/main/java/org/apache/impala/catalog/FeIcebergTable.java: http://gerrit.cloudera.org:8080/#/c/16143/4/fe/src/main/java/org/apache/impala/catalog/FeIcebergTable.java@69 PS4, Line 69: transfromed transformed http://gerrit.cloudera.org:8080/#/c/16143/4/fe/src/main/java/org/apache/impala/catalog/FeIcebergTable.java@71 PS4, Line 71: getFeFsTable Maybe rename to getHdfsTable() ? http://gerrit.cloudera.org:8080/#/c/16143/4/fe/src/main/java/org/apache/impala/util/IcebergUtil.java File fe/src/main/java/org/apache/impala/util/IcebergUtil.java: http://gerrit.cloudera.org:8080/#/c/16143/4/fe/src/main/java/org/apache/impala/util/IcebergUtil.java@115 PS4, Line 115: toUpperCase(); : if (format.equalsIgnoreCase nit: toUpperCase() or equalsIgnoreCase() is not needed. http://gerrit.cloudera.org:8080/#/c/16143/4/fe/src/main/java/org/apache/impala/util/IcebergUtil.java@246 PS4, Line 246: List<DataFile> dataFileList = new ArrayList<>(); : for (FileScanTask task : scan.planFiles()) { : dataFileList.add(task.file()); : } : return dataFileList; nit: return Lists.newArrayList(scan.planFiles()); http://gerrit.cloudera.org:8080/#/c/16143/4/fe/src/test/java/org/apache/impala/customcluster/IcebergTableQueryTest.java File fe/src/test/java/org/apache/impala/customcluster/IcebergTableQueryTest.java: http://gerrit.cloudera.org:8080/#/c/16143/4/fe/src/test/java/org/apache/impala/customcluster/IcebergTableQueryTest.java@42 PS4, Line 42: /** : * Test impala query iceberg table : * impala not supported insert into iceberg table now, so we construct iceberg : * table by iceberg api : */ Instead of writing the Iceberg table each time, can we just check it into the repository then copy it to the HDFS warehouse directory during data loading? We did something similar with Apache Hudi: https://gerrit.cloudera.org/c/14711/ After that you could create end-to-end tests in Python and in ".test" files. E.g.: https://github.com/apache/impala/blob/65722d3e9051d6a08cb1e69fd36a06684745c226/tests/query_test/test_scanners.py#L326-L340 https://github.com/apache/impala/blob/master/testdata/workloads/functional-query/queries/QueryTest/hudi-parquet.test -- To view, visit http://gerrit.cloudera.org:8080/16143 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I856cfee4f3397d1a89cf17650e8d4fbfe1f2b006 Gerrit-Change-Number: 16143 Gerrit-PatchSet: 4 Gerrit-Owner: wangsheng <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Zoltan Borok-Nagy <[email protected]> Gerrit-Comment-Date: Fri, 10 Jul 2020 14:09:12 +0000 Gerrit-HasComments: Yes
