Tamas Mate has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19547 )

Change subject: IMPALA-11950: Planner change for Iceberg metadata querying
......................................................................


Patch Set 1:

(17 comments)

Thank you for the reviews, updated the patch.

http://gerrit.cloudera.org:8080/#/c/19547/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/19547/1//COMMIT_MSG@9
PS1, Line 9: , this
> Nit: should start a new sentence here.
Done


http://gerrit.cloudera.org:8080/#/c/19547/1//COMMIT_MSG@11
PS1, Line 11: writing the backend
            : part
> Is it going to be in another patch or this one? If another one, can you ind
Done


http://gerrit.cloudera.org:8080/#/c/19547/1/fe/src/main/java/org/apache/impala/analysis/Path.java
File fe/src/main/java/org/apache/impala/analysis/Path.java:

http://gerrit.cloudera.org:8080/#/c/19547/1/fe/src/main/java/org/apache/impala/analysis/Path.java@510
PS1, Line 510:     org.apache.hadoop.hive.metastore.api.Table msTable = 
rootTable_.getMetaStoreTable();
> Do you use this 'msTable' anywhere?
Done


http://gerrit.cloudera.org:8080/#/c/19547/1/fe/src/main/java/org/apache/impala/analysis/TableName.java
File fe/src/main/java/org/apache/impala/analysis/TableName.java:

http://gerrit.cloudera.org:8080/#/c/19547/1/fe/src/main/java/org/apache/impala/analysis/TableName.java@59
PS1, Line 59:     Preconditions.checkNotNull(tbl);
> Is it allowed to be empty?
During candidate table creation this constructor is used, so empty table name 
is allowed here.


http://gerrit.cloudera.org:8080/#/c/19547/1/fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergMetadataTable.java
File 
fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergMetadataTable.java:

http://gerrit.cloudera.org:8080/#/c/19547/1/fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergMetadataTable.java@52
PS1, Line 52:     metadataTableName_ = tblRefPath.get(2);
> Can there be interference with for example CollectionTableRefs which may ha
This is only called when isIcebergMetadataTable() returns true, so only for 
iceberg tables and when the third part of the table name matches the Iceberg 
metadtat table name.


http://gerrit.cloudera.org:8080/#/c/19547/1/fe/src/main/java/org/apache/impala/planner/DistributedPlanner.java
File fe/src/main/java/org/apache/impala/planner/DistributedPlanner.java:

http://gerrit.cloudera.org:8080/#/c/19547/1/fe/src/main/java/org/apache/impala/planner/DistributedPlanner.java@332
PS1, Line 332: RANDOM
> Isn't DataPartition.UNPARTITIONED is more suitable for metadata table scan
It is, nice catch.

While I was testing I wanted to see some more exciting plans, this removes some 
exchange nodes.


http://gerrit.cloudera.org:8080/#/c/19547/1/fe/src/main/java/org/apache/impala/planner/IcebergMetadataScanNode.java
File fe/src/main/java/org/apache/impala/planner/IcebergMetadataScanNode.java:

http://gerrit.cloudera.org:8080/#/c/19547/1/fe/src/main/java/org/apache/impala/planner/IcebergMetadataScanNode.java@36
PS1, Line 36:     desc_ = desc;
> Do we allow a null desc_? If not, we could have a precondition check.
I did not see that for HdfsScanNode/HbaseScanNode so I would keep it this way. 
I assume it cannot be null.


http://gerrit.cloudera.org:8080/#/c/19547/1/fe/src/main/java/org/apache/impala/planner/IcebergMetadataScanNode.java@47
PS1, Line 47:     Preconditions.checkNotNull(desc_.getPath());
> If desc_ can be null (see comment on L36), we should also check that it is
Done


http://gerrit.cloudera.org:8080/#/c/19547/1/fe/src/main/java/org/apache/impala/planner/IcebergMetadataScanNode.java@73
PS1, Line 73:
> nit: trailing spaces
Done


http://gerrit.cloudera.org:8080/#/c/19547/1/fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java
File fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java:

http://gerrit.cloudera.org:8080/#/c/19547/1/fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java@2223
PS1, Line 2223: Metadta
> typo
Done


http://gerrit.cloudera.org:8080/#/c/19547/1/fe/src/main/java/org/apache/impala/service/Frontend.java
File fe/src/main/java/org/apache/impala/service/Frontend.java:

http://gerrit.cloudera.org:8080/#/c/19547/1/fe/src/main/java/org/apache/impala/service/Frontend.java@2217
PS1, Line 2217: t
> Nit: extra t.
Done


http://gerrit.cloudera.org:8080/#/c/19547/1/fe/src/main/java/org/apache/impala/service/Frontend.java@2220
PS1, Line 2220: t
> Nit: extra t.
Done


http://gerrit.cloudera.org:8080/#/c/19547/1/fe/src/test/java/org/apache/impala/planner/PlannerTest.java
File fe/src/test/java/org/apache/impala/planner/PlannerTest.java:

http://gerrit.cloudera.org:8080/#/c/19547/1/fe/src/test/java/org/apache/impala/planner/PlannerTest.java@1288
PS1, Line 1288: metadtat
> nit: typo
Done


http://gerrit.cloudera.org:8080/#/c/19547/1/testdata/workloads/functional-planner/queries/PlannerTest/iceberg-metadata-table-scan.test
File 
testdata/workloads/functional-planner/queries/PlannerTest/iceberg-metadata-table-scan.test:

http://gerrit.cloudera.org:8080/#/c/19547/1/testdata/workloads/functional-planner/queries/PlannerTest/iceberg-metadata-table-scan.test@1
PS1, Line 1: explain SELECT * FROM 
functional_parquet.iceberg_alltypes_part_orc.history
> Just a general comment to have coverage for most of the metadata tables her
I added coverage on parser side, do you think it would be useful to have plans 
for all the tables as well?

Also, added a plan with a nested type.


http://gerrit.cloudera.org:8080/#/c/19547/1/testdata/workloads/functional-planner/queries/PlannerTest/iceberg-metadata-table-scan.test@6
PS1, Line 6: row-size=33B
> It might be a valid use case to do some projection of the columns and check
Added a new test with projection.


http://gerrit.cloudera.org:8080/#/c/19547/1/testdata/workloads/functional-planner/queries/PlannerTest/iceberg-metadata-table-scan.test@7
PS1, Line 7: ---- DISTRIBUTEDPLAN
> This line just made me wondering if it makes any sense to run a metadata qu
Yes, the fragment will be executed on one node. I don't think it makes sense at 
this point to split the scanning.


http://gerrit.cloudera.org:8080/#/c/19547/1/testdata/workloads/functional-planner/queries/PlannerTest/iceberg-metadata-table-scan.test@29
PS1, Line 29: row-size=0B
> Is it correct that the row-size for the scans here are zero while the join
Nice catch, this is the result of the planning, missed to calculate the stats 
for the nodes. Fixed it.



--
To view, visit http://gerrit.cloudera.org:8080/19547
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3675d7a57ca570bfec306798589b5ef6aa34b5c6
Gerrit-Change-Number: 19547
Gerrit-PatchSet: 1
Gerrit-Owner: Tamas Mate <[email protected]>
Gerrit-Reviewer: Anonymous Coward <[email protected]>
Gerrit-Reviewer: Daniel Becker <[email protected]>
Gerrit-Reviewer: Gabor Kaszab <[email protected]>
Gerrit-Reviewer: Gergely Fürnstáhl <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Tamas Mate <[email protected]>
Gerrit-Reviewer: Zoltan Borok-Nagy <[email protected]>
Gerrit-Comment-Date: Fri, 10 Mar 2023 16:47:49 +0000
Gerrit-HasComments: Yes

Reply via email to