Zoltan Borok-Nagy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/18847 )

Change subject: IMPALA-11484: Create SCAN plan for Iceberg V2 position delete 
tables
......................................................................


Patch Set 5:

(14 comments)

Thanks for the comments!

http://gerrit.cloudera.org:8080/#/c/18847/4/fe/src/main/java/org/apache/impala/catalog/FeIcebergTable.java
File fe/src/main/java/org/apache/impala/catalog/FeIcebergTable.java:

http://gerrit.cloudera.org:8080/#/c/18847/4/fe/src/main/java/org/apache/impala/catalog/FeIcebergTable.java@559
PS4, Line 559: allFiles
> nit: This is a pair of List<DataFile> and Set<DeleteFile>, maybe we should
Done


http://gerrit.cloudera.org:8080/#/c/18847/4/fe/src/main/java/org/apache/impala/catalog/IcebergPositionDeleteTable.java
File fe/src/main/java/org/apache/impala/catalog/IcebergPositionDeleteTable.java:

http://gerrit.cloudera.org:8080/#/c/18847/4/fe/src/main/java/org/apache/impala/catalog/IcebergPositionDeleteTable.java@48
PS4, Line 48: /**
> nit: "table is created"
Done


http://gerrit.cloudera.org:8080/#/c/18847/4/fe/src/main/java/org/apache/impala/catalog/IcebergPositionDeleteTable.java@50
PS4, Line 50: e
> nit: "of"
Done


http://gerrit.cloudera.org:8080/#/c/18847/4/fe/src/main/java/org/apache/impala/catalog/IcebergTable.java
File fe/src/main/java/org/apache/impala/catalog/IcebergTable.java:

http://gerrit.cloudera.org:8080/#/c/18847/4/fe/src/main/java/org/apache/impala/catalog/IcebergTable.java@129
PS4, Line 129: _ID = 2147
> Just want to know: where does this number come from?
These are coming from the Iceberg spec 
https://iceberg.apache.org/spec/#position-delete-files


http://gerrit.cloudera.org:8080/#/c/18847/4/fe/src/main/java/org/apache/impala/catalog/VirtualTable.java
File fe/src/main/java/org/apache/impala/catalog/VirtualTable.java:

http://gerrit.cloudera.org:8080/#/c/18847/4/fe/src/main/java/org/apache/impala/catalog/VirtualTable.java@41
PS4, Line 41: that
> nit: "that"
Done


http://gerrit.cloudera.org:8080/#/c/18847/4/fe/src/main/java/org/apache/impala/catalog/local/LocalIcebergTable.java
File fe/src/main/java/org/apache/impala/catalog/local/LocalIcebergTable.java:

http://gerrit.cloudera.org:8080/#/c/18847/4/fe/src/main/java/org/apache/impala/catalog/local/LocalIcebergTable.java@35
PS4, Line 35: import org.apache.impala.catalog.IcebergColumn;
            : import org.apache.impala.catalog.IcebergTable;
> nit: It may be unnecessary.
Done


http://gerrit.cloudera.org:8080/#/c/18847/4/fe/src/main/java/org/apache/impala/catalog/local/LocalIcebergTable.java@39
PS4, Line 39: import org.apache.impala.catalog.Virtu
> ditto.
Done


http://gerrit.cloudera.org:8080/#/c/18847/4/fe/src/main/java/org/apache/impala/catalog/local/LocalTable.java
File fe/src/main/java/org/apache/impala/catalog/local/LocalTable.java:

http://gerrit.cloudera.org:8080/#/c/18847/4/fe/src/main/java/org/apache/impala/catalog/local/LocalTable.java@48
PS4, Line 48: import org.apache.impala.catalog.Virtu
> nit: it may be unnecessary.
Done


http://gerrit.cloudera.org:8080/#/c/18847/4/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
File fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java:

http://gerrit.cloudera.org:8080/#/c/18847/4/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@204
PS4, Line 204:           .add(HdfsFileFormat.ICEBERG)
> Doesn't iceberg tables returns the underlying file format?
At the coordinator-side, yes. But at CatalogD-side and in PlannerTests it is 
Iceberg. In old catalog mode we do another hack to get the underlying 
fileformat, i.e. during thrift conversion we substitute the file format:
https://github.com/apache/impala/blob/bf103414012ba7596563c5f3292380f8b11524e9/fe/src/main/java/org/apache/impala/catalog/IcebergTable.java#L487

It was an easier change then adding another hack to HdfsPartition. Btw, don't 
we need this change anyway to handle mixed format tables?


http://gerrit.cloudera.org:8080/#/c/18847/4/fe/src/main/java/org/apache/impala/planner/IcebergScanPlanner.java
File fe/src/main/java/org/apache/impala/planner/IcebergScanPlanner.java:

http://gerrit.cloudera.org:8080/#/c/18847/4/fe/src/main/java/org/apache/impala/planner/IcebergScanPlanner.java@93
PS4, Line 93:
> nit: should be IcebergScanPlanner.
Done


http://gerrit.cloudera.org:8080/#/c/18847/4/fe/src/main/java/org/apache/impala/planner/IcebergScanPlanner.java@125
PS4, Line 125: blRef;
> nit: I think this name is a bit confusing, as we do have an IcebergScanNode
Done. Renamed

 createIcebergScanNode -> createIcebergScanPlan
 createIcebergScanPlan -> createComplexIcebergScanPlan


http://gerrit.cloudera.org:8080/#/c/18847/4/fe/src/main/java/org/apache/impala/planner/IcebergScanPlanner.java@197
PS4, Line 197:
> magic number
I got rid of this code by calculating proper stats for the delete files and let 
Impala's planner figure out the distribution mode.


http://gerrit.cloudera.org:8080/#/c/18847/4/fe/src/main/java/org/apache/impala/planner/IcebergScanPlanner.java@202
PS4, Line 202:     JoinNode joinNode = new HashJoinNode(dataScanNode, 
deleteScanNode,
             :         /*straight_join=*/true, DistributionMode.NONE, 
JoinOperator.LEFT_ANTI_JOIN,
             :         positionJoinConjuncts, 
/*otherJoinConjuncts=*/Collections.emptyList());
             :     joinNode.setId(ctx_.getNextNodeId());
             :     joinNode.init(analyzer_);
             :     joinNode.setIsDeleteRowsJ
> As I understand this is (similar but) unrelated to ACID, we should not let
Done


http://gerrit.cloudera.org:8080/#/c/18847/4/tests/query_test/test_iceberg.py
File tests/query_test/test_iceberg.py:

http://gerrit.cloudera.org:8080/#/c/18847/4/tests/query_test/test_iceberg.py@749
PS4, Line 749:
> flake8: E302 expected 2 blank lines, found 1
Done



--
To view, visit http://gerrit.cloudera.org:8080/18847
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I672cfee18d8e131772d90378d5b12ad4d0f7dd48
Gerrit-Change-Number: 18847
Gerrit-PatchSet: 5
Gerrit-Owner: Zoltan Borok-Nagy <[email protected]>
Gerrit-Reviewer: Anonymous Coward <[email protected]>
Gerrit-Reviewer: Gabor Kaszab <[email protected]>
Gerrit-Reviewer: Gergely Fürnstáhl <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Tamas Mate <[email protected]>
Gerrit-Reviewer: Zoltan Borok-Nagy <[email protected]>
Gerrit-Comment-Date: Thu, 25 Aug 2022 14:28:03 +0000
Gerrit-HasComments: Yes

Reply via email to