wangsheng has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16143 )

Change subject: IMPALA-9741: Support querying Iceberg table by impala
......................................................................


Patch Set 20:

(9 comments)

> (10 comments)
 >
 > One thing I didn't understand about the patch is the support for
 > partition pruning. I don't see enough code here to handle all the
 > edge cases - partition evolution, different partition transforms,
 > etc.
 >
 > I'm OK with deferring optimizations to a later patch, but want to
 > make sure that this patch is correct and that we document what the
 > limitations are.

Hi Tim, thanks for your patient review. This patch pruning partition mainly by 
pushdown predicates to Iceberg to filter data files. I've already add more 
comment about predicates pushdown. And all test cases passed: 
https://jenkins.impala.io/job/ubuntu-16.04-from-scratch/11658/

http://gerrit.cloudera.org:8080/#/c/16143/18/fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java
File fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java:

http://gerrit.cloudera.org:8080/#/c/16143/18/fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java@611
PS18, Line 611:     // Iceberg table only has one partition spec now
> What does this mean? We don't support partition evolution yet?
When we create iceberg table with several partition columns, such as 'id 
identity', 'register_time day', these partition columns are treated as 
partition fields, all in a partition spec which is different from hdfs table. 
So I add a comment here: Iceberg table only has one partition.
If we alter iceberg table's partitions, iceberg will generated snapshot with 
new partition spec, we can select different snapshot to query by iceberg api 
which is not supported in this patch. We only get the latest partition spec for 
latest data now, so we think iceberg table only has one partition spec in 
impala(latest partition spec)


http://gerrit.cloudera.org:8080/#/c/16143/18/fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java@615
PS18, Line 615: fieldName
> nit: fieldName
Done


http://gerrit.cloudera.org:8080/#/c/16143/18/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

http://gerrit.cloudera.org:8080/#/c/16143/18/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@1889
PS18, Line 1889: protec
> Can you make this protected instead of public, since it's only being used w
Done


http://gerrit.cloudera.org:8080/#/c/16143/18/fe/src/main/java/org/apache/impala/catalog/local/LocalFsTable.java
File fe/src/main/java/org/apache/impala/catalog/local/LocalFsTable.java:

http://gerrit.cloudera.org:8080/#/c/16143/18/fe/src/main/java/org/apache/impala/catalog/local/LocalFsTable.java@372
PS18, Line 372: protec
> I think this can be protected
Done


http://gerrit.cloudera.org:8080/#/c/16143/18/fe/src/main/java/org/apache/impala/planner/IcebergScanNode.java
File fe/src/main/java/org/apache/impala/planner/IcebergScanNode.java:

http://gerrit.cloudera.org:8080/#/c/16143/18/fe/src/main/java/org/apache/impala/planner/IcebergScanNode.java@61
PS18, Line 61: 
> nit: icebergConjuncts_
Done


http://gerrit.cloudera.org:8080/#/c/16143/18/fe/src/main/java/org/apache/impala/planner/IcebergScanNode.java@94
PS18, Line 94:
> nit: Iceberg
Done


http://gerrit.cloudera.org:8080/#/c/16143/18/fe/src/main/java/org/apache/impala/planner/IcebergScanNode.java@123
PS18, Line 123:     ListIterator<Expr> it = conjuncts_.listIterator();
> I think this predicate pushdown needs a bit more explanation.
Yes you are right, Tim. In case 1 and case 2, we need to evaluate predicates in 
the scan as well. After discussed with my colleague worked on Iceberg, we 
prepare to pushdown all predicates to Iceberg, and evaluate all predicates in 
the scan as well. More details you can see the comment on 
IcebergScanNode.extractIcebergConjuncts


http://gerrit.cloudera.org:8080/#/c/16143/18/testdata/data/README
File testdata/data/README:

http://gerrit.cloudera.org:8080/#/c/16143/18/testdata/data/README@508
PS18, Line 508: hll_sketches_from_impala.parquet:
> nit: remove this merge marker
Done


http://gerrit.cloudera.org:8080/#/c/16143/18/testdata/workloads/functional-query/queries/QueryTest/iceberg-query.test
File testdata/workloads/functional-query/queries/QueryTest/iceberg-query.test:

http://gerrit.cloudera.org:8080/#/c/16143/18/testdata/workloads/functional-query/queries/QueryTest/iceberg-query.test@3
PS18, Line 3: Iceberg
> nit: Iceberg
Done



--
To view, visit http://gerrit.cloudera.org:8080/16143
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I856cfee4f3397d1a89cf17650e8d4fbfe1f2b006
Gerrit-Change-Number: 16143
Gerrit-PatchSet: 20
Gerrit-Owner: wangsheng <[email protected]>
Gerrit-Reviewer: Anonymous Coward (606)
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Tim Armstrong <[email protected]>
Gerrit-Reviewer: Zoltan Borok-Nagy <[email protected]>
Gerrit-Reviewer: wangsheng <[email protected]>
Gerrit-Comment-Date: Thu, 13 Aug 2020 02:13:41 +0000
Gerrit-HasComments: Yes

Reply via email to