Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/16939 )
Change subject: IMPALA-10432: INSERT INTO Iceberg tables with partition transforms ...................................................................... Patch Set 3: (4 comments) Thanks for the comments! http://gerrit.cloudera.org:8080/#/c/16939/2/fe/src/main/java/org/apache/impala/planner/IcebergScanNode.java File fe/src/main/java/org/apache/impala/planner/IcebergScanNode.java: http://gerrit.cloudera.org:8080/#/c/16939/2/fe/src/main/java/org/apache/impala/planner/IcebergScanNode.java@198 PS2, Line 198: if (unixMicros >= 0) { > This means that if unixMicros<0, then we won't do partition pruning, but qu Yes, I also have some test queries that have predicates with timestamps before the unix epoch. http://gerrit.cloudera.org:8080/#/c/16939/2/fe/src/main/java/org/apache/impala/service/IcebergCatalogOpExecutor.java File fe/src/main/java/org/apache/impala/service/IcebergCatalogOpExecutor.java: http://gerrit.cloudera.org:8080/#/c/16939/2/fe/src/main/java/org/apache/impala/service/IcebergCatalogOpExecutor.java@186 PS2, Line 186: partSpec.partitionType(), : feIcebergTable.getDefaultPartitionSpec(), dataFile.partitionPath() > nit: +2 indent Done http://gerrit.cloudera.org:8080/#/c/16939/2/testdata/workloads/functional-planner/queries/PlannerTest/insert.test File testdata/workloads/functional-planner/queries/PlannerTest/insert.test: http://gerrit.cloudera.org:8080/#/c/16939/2/testdata/workloads/functional-planner/queries/PlannerTest/insert.test@976 PS2, Line 976: cast(id * 3 as int), : cast(10000 - id as int > nit: +1 indent? Done http://gerrit.cloudera.org:8080/#/c/16939/2/testdata/workloads/functional-planner/queries/PlannerTest/insert.test@984 PS2, Line 984: year > A potential optimization in timestamp based partitioning expressions could This optimization could interfere with SORT BY expressions. E.g. if we want to SORT BY ZORDER (columnA, columnB), then ordering by timestampCol instead of year(timestampCol) would make no room for ZORDERing because the data would be majorly sorted by the timestampCol. If there is no SORT BY, then yes, it could probably make the INSERT more efficient, but it would also complicate the code. I'd rather not do that in this commit. But we'll definitely need thorough performance analysis once the main features are completed. -- To view, visit http://gerrit.cloudera.org:8080/16939 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I3edf02048cea78703837b248c55219c22d512b78 Gerrit-Change-Number: 16939 Gerrit-PatchSet: 3 Gerrit-Owner: Zoltan Borok-Nagy <[email protected]> Gerrit-Reviewer: Csaba Ringhofer <[email protected]> Gerrit-Reviewer: Gabor Kaszab <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Zoltan Borok-Nagy <[email protected]> Gerrit-Reviewer: wangsheng <[email protected]> Gerrit-Comment-Date: Thu, 14 Jan 2021 16:44:41 +0000 Gerrit-HasComments: Yes
