wangsheng has posted comments on this change. ( http://gerrit.cloudera.org:8080/16143 )
Change subject: IMPALA-9741: Support querying Iceberg table by impala ...................................................................... Patch Set 20: (9 comments) > (10 comments) > > One thing I didn't understand about the patch is the support for > partition pruning. I don't see enough code here to handle all the > edge cases - partition evolution, different partition transforms, > etc. > > I'm OK with deferring optimizations to a later patch, but want to > make sure that this patch is correct and that we document what the > limitations are. Hi Tim, thanks for your patient review. This patch pruning partition mainly by pushdown predicates to Iceberg to filter data files. I've already add more comment about predicates pushdown. And all test cases passed: https://jenkins.impala.io/job/ubuntu-16.04-from-scratch/11658/ http://gerrit.cloudera.org:8080/#/c/16143/18/fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java File fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java: http://gerrit.cloudera.org:8080/#/c/16143/18/fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java@611 PS18, Line 611: // Iceberg table only has one partition spec now > What does this mean? We don't support partition evolution yet? When we create iceberg table with several partition columns, such as 'id identity', 'register_time day', these partition columns are treated as partition fields, all in a partition spec which is different from hdfs table. So I add a comment here: Iceberg table only has one partition. If we alter iceberg table's partitions, iceberg will generated snapshot with new partition spec, we can select different snapshot to query by iceberg api which is not supported in this patch. We only get the latest partition spec for latest data now, so we think iceberg table only has one partition spec in impala(latest partition spec) http://gerrit.cloudera.org:8080/#/c/16143/18/fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java@615 PS18, Line 615: fieldName > nit: fieldName Done http://gerrit.cloudera.org:8080/#/c/16143/18/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java: http://gerrit.cloudera.org:8080/#/c/16143/18/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@1889 PS18, Line 1889: protec > Can you make this protected instead of public, since it's only being used w Done http://gerrit.cloudera.org:8080/#/c/16143/18/fe/src/main/java/org/apache/impala/catalog/local/LocalFsTable.java File fe/src/main/java/org/apache/impala/catalog/local/LocalFsTable.java: http://gerrit.cloudera.org:8080/#/c/16143/18/fe/src/main/java/org/apache/impala/catalog/local/LocalFsTable.java@372 PS18, Line 372: protec > I think this can be protected Done http://gerrit.cloudera.org:8080/#/c/16143/18/fe/src/main/java/org/apache/impala/planner/IcebergScanNode.java File fe/src/main/java/org/apache/impala/planner/IcebergScanNode.java: http://gerrit.cloudera.org:8080/#/c/16143/18/fe/src/main/java/org/apache/impala/planner/IcebergScanNode.java@61 PS18, Line 61: > nit: icebergConjuncts_ Done http://gerrit.cloudera.org:8080/#/c/16143/18/fe/src/main/java/org/apache/impala/planner/IcebergScanNode.java@94 PS18, Line 94: > nit: Iceberg Done http://gerrit.cloudera.org:8080/#/c/16143/18/fe/src/main/java/org/apache/impala/planner/IcebergScanNode.java@123 PS18, Line 123: ListIterator<Expr> it = conjuncts_.listIterator(); > I think this predicate pushdown needs a bit more explanation. Yes you are right, Tim. In case 1 and case 2, we need to evaluate predicates in the scan as well. After discussed with my colleague worked on Iceberg, we prepare to pushdown all predicates to Iceberg, and evaluate all predicates in the scan as well. More details you can see the comment on IcebergScanNode.extractIcebergConjuncts http://gerrit.cloudera.org:8080/#/c/16143/18/testdata/data/README File testdata/data/README: http://gerrit.cloudera.org:8080/#/c/16143/18/testdata/data/README@508 PS18, Line 508: hll_sketches_from_impala.parquet: > nit: remove this merge marker Done http://gerrit.cloudera.org:8080/#/c/16143/18/testdata/workloads/functional-query/queries/QueryTest/iceberg-query.test File testdata/workloads/functional-query/queries/QueryTest/iceberg-query.test: http://gerrit.cloudera.org:8080/#/c/16143/18/testdata/workloads/functional-query/queries/QueryTest/iceberg-query.test@3 PS18, Line 3: Iceberg > nit: Iceberg Done -- To view, visit http://gerrit.cloudera.org:8080/16143 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I856cfee4f3397d1a89cf17650e8d4fbfe1f2b006 Gerrit-Change-Number: 16143 Gerrit-PatchSet: 20 Gerrit-Owner: wangsheng <[email protected]> Gerrit-Reviewer: Anonymous Coward (606) Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Tim Armstrong <[email protected]> Gerrit-Reviewer: Zoltan Borok-Nagy <[email protected]> Gerrit-Reviewer: wangsheng <[email protected]> Gerrit-Comment-Date: Thu, 13 Aug 2020 02:13:41 +0000 Gerrit-HasComments: Yes
