Gabor Kaszab has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19547 )

Change subject: IMPALA-11950: Planner change for Iceberg metadata querying
......................................................................


Patch Set 1:

(9 comments)

http://gerrit.cloudera.org:8080/#/c/19547/1/fe/src/main/java/org/apache/impala/analysis/Path.java
File fe/src/main/java/org/apache/impala/analysis/Path.java:

http://gerrit.cloudera.org:8080/#/c/19547/1/fe/src/main/java/org/apache/impala/analysis/Path.java@510
PS1, Line 510:     org.apache.hadoop.hive.metastore.api.Table msTable = 
rootTable_.getMetaStoreTable();
Do you use this 'msTable' anywhere?


http://gerrit.cloudera.org:8080/#/c/19547/1/fe/src/main/java/org/apache/impala/planner/DistributedPlanner.java
File fe/src/main/java/org/apache/impala/planner/DistributedPlanner.java:

http://gerrit.cloudera.org:8080/#/c/19547/1/fe/src/main/java/org/apache/impala/planner/DistributedPlanner.java@332
PS1, Line 332: RANDOM
Isn't DataPartition.UNPARTITIONED is more suitable for metadata table scan 
nodes?


http://gerrit.cloudera.org:8080/#/c/19547/1/fe/src/main/java/org/apache/impala/planner/IcebergMetadataScanNode.java
File fe/src/main/java/org/apache/impala/planner/IcebergMetadataScanNode.java:

http://gerrit.cloudera.org:8080/#/c/19547/1/fe/src/main/java/org/apache/impala/planner/IcebergMetadataScanNode.java@73
PS1, Line 73:
nit: trailing spaces


http://gerrit.cloudera.org:8080/#/c/19547/1/fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java
File fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java:

http://gerrit.cloudera.org:8080/#/c/19547/1/fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java@2223
PS1, Line 2223: Metadta
typo


http://gerrit.cloudera.org:8080/#/c/19547/1/fe/src/test/java/org/apache/impala/planner/PlannerTest.java
File fe/src/test/java/org/apache/impala/planner/PlannerTest.java:

http://gerrit.cloudera.org:8080/#/c/19547/1/fe/src/test/java/org/apache/impala/planner/PlannerTest.java@1288
PS1, Line 1288: metadtat
nit: typo


http://gerrit.cloudera.org:8080/#/c/19547/1/testdata/workloads/functional-planner/queries/PlannerTest/iceberg-metadata-table-scan.test
File 
testdata/workloads/functional-planner/queries/PlannerTest/iceberg-metadata-table-scan.test:

http://gerrit.cloudera.org:8080/#/c/19547/1/testdata/workloads/functional-planner/queries/PlannerTest/iceberg-metadata-table-scan.test@1
PS1, Line 1: explain SELECT * FROM 
functional_parquet.iceberg_alltypes_part_orc.history
Just a general comment to have coverage for most of the metadata tables here.

Additionally, experimenting with nested columns and how they are actually 
represented as tuple/slotdescriptors and how they affect the row-size would be 
also beneficial here, in my opinion.


http://gerrit.cloudera.org:8080/#/c/19547/1/testdata/workloads/functional-planner/queries/PlannerTest/iceberg-metadata-table-scan.test@6
PS1, Line 6: row-size=33B
It might be a valid use case to do some projection of the columns and check if 
the row-size decreases accordingly.

Additionally, if we add the same column multiple times in the select list, the 
row-size shouldn't grow as we can do a single read for the redundant columns.


http://gerrit.cloudera.org:8080/#/c/19547/1/testdata/workloads/functional-planner/queries/PlannerTest/iceberg-metadata-table-scan.test@7
PS1, Line 7: ---- DISTRIBUTEDPLAN
This line just made me wondering if it makes any sense to run a metadata query 
scan in a distributed way. I guess the whole query would be executed on one 
node, right?


http://gerrit.cloudera.org:8080/#/c/19547/1/testdata/workloads/functional-planner/queries/PlannerTest/iceberg-metadata-table-scan.test@29
PS1, Line 29: row-size=0B
Is it correct that the row-size for the scans here are zero while the join has 
a row-size of 66? I'd expect that the scans would have a row size of 33, 
otherwise there won't be any space in the row batch to populate the scan 
results to.
I'd check what slots are materialized on these nodes.



--
To view, visit http://gerrit.cloudera.org:8080/19547
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3675d7a57ca570bfec306798589b5ef6aa34b5c6
Gerrit-Change-Number: 19547
Gerrit-PatchSet: 1
Gerrit-Owner: Tamas Mate <[email protected]>
Gerrit-Reviewer: Anonymous Coward <[email protected]>
Gerrit-Reviewer: Daniel Becker <[email protected]>
Gerrit-Reviewer: Gabor Kaszab <[email protected]>
Gerrit-Reviewer: Gergely Fürnstáhl <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Zoltan Borok-Nagy <[email protected]>
Gerrit-Comment-Date: Thu, 02 Mar 2023 14:35:23 +0000
Gerrit-HasComments: Yes

Reply via email to