Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/18847 )
Change subject: IMPALA-11484: Create SCAN plan for Iceberg V2 position delete tables ...................................................................... Patch Set 5: (14 comments) Thanks for the comments! http://gerrit.cloudera.org:8080/#/c/18847/4/fe/src/main/java/org/apache/impala/catalog/FeIcebergTable.java File fe/src/main/java/org/apache/impala/catalog/FeIcebergTable.java: http://gerrit.cloudera.org:8080/#/c/18847/4/fe/src/main/java/org/apache/impala/catalog/FeIcebergTable.java@559 PS4, Line 559: allFiles > nit: This is a pair of List<DataFile> and Set<DeleteFile>, maybe we should Done http://gerrit.cloudera.org:8080/#/c/18847/4/fe/src/main/java/org/apache/impala/catalog/IcebergPositionDeleteTable.java File fe/src/main/java/org/apache/impala/catalog/IcebergPositionDeleteTable.java: http://gerrit.cloudera.org:8080/#/c/18847/4/fe/src/main/java/org/apache/impala/catalog/IcebergPositionDeleteTable.java@48 PS4, Line 48: /** > nit: "table is created" Done http://gerrit.cloudera.org:8080/#/c/18847/4/fe/src/main/java/org/apache/impala/catalog/IcebergPositionDeleteTable.java@50 PS4, Line 50: e > nit: "of" Done http://gerrit.cloudera.org:8080/#/c/18847/4/fe/src/main/java/org/apache/impala/catalog/IcebergTable.java File fe/src/main/java/org/apache/impala/catalog/IcebergTable.java: http://gerrit.cloudera.org:8080/#/c/18847/4/fe/src/main/java/org/apache/impala/catalog/IcebergTable.java@129 PS4, Line 129: _ID = 2147 > Just want to know: where does this number come from? These are coming from the Iceberg spec https://iceberg.apache.org/spec/#position-delete-files http://gerrit.cloudera.org:8080/#/c/18847/4/fe/src/main/java/org/apache/impala/catalog/VirtualTable.java File fe/src/main/java/org/apache/impala/catalog/VirtualTable.java: http://gerrit.cloudera.org:8080/#/c/18847/4/fe/src/main/java/org/apache/impala/catalog/VirtualTable.java@41 PS4, Line 41: that > nit: "that" Done http://gerrit.cloudera.org:8080/#/c/18847/4/fe/src/main/java/org/apache/impala/catalog/local/LocalIcebergTable.java File fe/src/main/java/org/apache/impala/catalog/local/LocalIcebergTable.java: http://gerrit.cloudera.org:8080/#/c/18847/4/fe/src/main/java/org/apache/impala/catalog/local/LocalIcebergTable.java@35 PS4, Line 35: import org.apache.impala.catalog.IcebergColumn; : import org.apache.impala.catalog.IcebergTable; > nit: It may be unnecessary. Done http://gerrit.cloudera.org:8080/#/c/18847/4/fe/src/main/java/org/apache/impala/catalog/local/LocalIcebergTable.java@39 PS4, Line 39: import org.apache.impala.catalog.Virtu > ditto. Done http://gerrit.cloudera.org:8080/#/c/18847/4/fe/src/main/java/org/apache/impala/catalog/local/LocalTable.java File fe/src/main/java/org/apache/impala/catalog/local/LocalTable.java: http://gerrit.cloudera.org:8080/#/c/18847/4/fe/src/main/java/org/apache/impala/catalog/local/LocalTable.java@48 PS4, Line 48: import org.apache.impala.catalog.Virtu > nit: it may be unnecessary. Done http://gerrit.cloudera.org:8080/#/c/18847/4/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java File fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java: http://gerrit.cloudera.org:8080/#/c/18847/4/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@204 PS4, Line 204: .add(HdfsFileFormat.ICEBERG) > Doesn't iceberg tables returns the underlying file format? At the coordinator-side, yes. But at CatalogD-side and in PlannerTests it is Iceberg. In old catalog mode we do another hack to get the underlying fileformat, i.e. during thrift conversion we substitute the file format: https://github.com/apache/impala/blob/bf103414012ba7596563c5f3292380f8b11524e9/fe/src/main/java/org/apache/impala/catalog/IcebergTable.java#L487 It was an easier change then adding another hack to HdfsPartition. Btw, don't we need this change anyway to handle mixed format tables? http://gerrit.cloudera.org:8080/#/c/18847/4/fe/src/main/java/org/apache/impala/planner/IcebergScanPlanner.java File fe/src/main/java/org/apache/impala/planner/IcebergScanPlanner.java: http://gerrit.cloudera.org:8080/#/c/18847/4/fe/src/main/java/org/apache/impala/planner/IcebergScanPlanner.java@93 PS4, Line 93: > nit: should be IcebergScanPlanner. Done http://gerrit.cloudera.org:8080/#/c/18847/4/fe/src/main/java/org/apache/impala/planner/IcebergScanPlanner.java@125 PS4, Line 125: blRef; > nit: I think this name is a bit confusing, as we do have an IcebergScanNode Done. Renamed createIcebergScanNode -> createIcebergScanPlan createIcebergScanPlan -> createComplexIcebergScanPlan http://gerrit.cloudera.org:8080/#/c/18847/4/fe/src/main/java/org/apache/impala/planner/IcebergScanPlanner.java@197 PS4, Line 197: > magic number I got rid of this code by calculating proper stats for the delete files and let Impala's planner figure out the distribution mode. http://gerrit.cloudera.org:8080/#/c/18847/4/fe/src/main/java/org/apache/impala/planner/IcebergScanPlanner.java@202 PS4, Line 202: JoinNode joinNode = new HashJoinNode(dataScanNode, deleteScanNode, : /*straight_join=*/true, DistributionMode.NONE, JoinOperator.LEFT_ANTI_JOIN, : positionJoinConjuncts, /*otherJoinConjuncts=*/Collections.emptyList()); : joinNode.setId(ctx_.getNextNodeId()); : joinNode.init(analyzer_); : joinNode.setIsDeleteRowsJ > As I understand this is (similar but) unrelated to ACID, we should not let Done http://gerrit.cloudera.org:8080/#/c/18847/4/tests/query_test/test_iceberg.py File tests/query_test/test_iceberg.py: http://gerrit.cloudera.org:8080/#/c/18847/4/tests/query_test/test_iceberg.py@749 PS4, Line 749: > flake8: E302 expected 2 blank lines, found 1 Done -- To view, visit http://gerrit.cloudera.org:8080/18847 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I672cfee18d8e131772d90378d5b12ad4d0f7dd48 Gerrit-Change-Number: 18847 Gerrit-PatchSet: 5 Gerrit-Owner: Zoltan Borok-Nagy <[email protected]> Gerrit-Reviewer: Anonymous Coward <[email protected]> Gerrit-Reviewer: Gabor Kaszab <[email protected]> Gerrit-Reviewer: Gergely Fürnstáhl <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Tamas Mate <[email protected]> Gerrit-Reviewer: Zoltan Borok-Nagy <[email protected]> Gerrit-Comment-Date: Thu, 25 Aug 2022 14:28:03 +0000 Gerrit-HasComments: Yes
