Hello Daniel Becker, Gabor Kaszab, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit
http://gerrit.cloudera.org:8080/21190
to look at the new patch set (#4).

Change subject: IMPALA-12894: (part 2) Fix optimized count(*) for Iceberg tables with dangling delete files
......................................................................

IMPALA-12894: (part 2) Fix optimized count(*) for Iceberg tables with dangling delete files

Impala can return incorrect results if a table has dangling delete
files. Dangling delete files are delete files that are part of the
snapshot but do not apply to any of the data files. Such delete files
can be left behind by Spark's rewrite_data_files action.

During analysis we check for the existence of delete files based on
the snapshot summary. But during planning in IcebergScanPlanner we do
it based on planFiles(), so dangling delete files are not counted in
the latter case. Because of this mismatch, Impala can create incorrect
plans for the count(*) optimization.

This patch fixes the FeIcebergTable.hasDeleteFiles() method so that it
ignores dangling delete files. It also introduces a new query option,
"iceberg_disable_count_star_optimization", so users can completely
disable the statistics-based count(*) optimization if necessary.

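To illustrate the mismatch described above, here is a minimal Java sketch.
It is not the code in this patch: the DeleteFile class and both helper
methods are hypothetical stand-ins, and the file names are made up. It only
shows how a summary-style check and a scan-style check can disagree when a
delete file references a data file that a rewrite has already replaced.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class DanglingDeleteFilesSketch {
  /** Hypothetical stand-in: a delete file and the data files it targets. */
  static final class DeleteFile {
    final String path;
    final Set<String> targetDataFiles;

    DeleteFile(String path, String... targets) {
      this.path = path;
      this.targetDataFiles = new HashSet<>(Arrays.asList(targets));
    }
  }

  /** Summary-style check: every delete file in the snapshot counts, dangling or not. */
  static boolean hasDeleteFilesBySummary(List<DeleteFile> snapshotDeleteFiles) {
    return !snapshotDeleteFiles.isEmpty();
  }

  /** Scan-style check: only delete files that target a live data file count. */
  static boolean hasApplicableDeleteFiles(
      Set<String> liveDataFiles, List<DeleteFile> snapshotDeleteFiles) {
    for (DeleteFile deleteFile : snapshotDeleteFiles) {
      for (String target : deleteFile.targetDataFiles) {
        if (liveDataFiles.contains(target)) return true;
      }
    }
    return false;
  }

  public static void main(String[] args) {
    // Assumed scenario: old-1.parquet was compacted into new-1.parquet,
    // but the delete file that referenced old-1.parquet is still in the snapshot.
    Set<String> liveDataFiles = new HashSet<>(Arrays.asList("new-1.parquet"));
    List<DeleteFile> deleteFiles =
        Arrays.asList(new DeleteFile("delete-1.parquet", "old-1.parquet"));

    System.out.println("summary check: " + hasDeleteFilesBySummary(deleteFiles));
    System.out.println("scan check: " + hasApplicableDeleteFiles(liveDataFiles, deleteFiles));
  }
}
```

In this scenario the summary-style check reports delete files while the
scan-style check does not, which is the kind of disagreement between
analysis and planning that the commit message describes.
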
Testing:
 * e2e tests
 * planner tests

Change-Id: Ie3aca0b0a104f9ca4589cde9643f3f341d4ff99f
---
M be/src/service/query-options.cc
M be/src/service/query-options.h
M common/thrift/ImpalaService.thrift
M common/thrift/Query.thrift
M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java
M fe/src/main/java/org/apache/impala/catalog/FeIcebergTable.java
M fe/src/main/java/org/apache/impala/planner/IcebergScanPlanner.java
M testdata/workloads/functional-planner/queries/PlannerTest/iceberg-v2-tables-hash-join.test
M testdata/workloads/functional-planner/queries/PlannerTest/iceberg-v2-tables.test
M testdata/workloads/functional-query/queries/QueryTest/iceberg-v2-read-position-deletes-orc.test
M testdata/workloads/functional-query/queries/QueryTest/iceberg-v2-read-position-deletes.test
11 files changed, 336 insertions(+), 433 deletions(-)

  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/90/21190/4
--
To view, visit http://gerrit.cloudera.org:8080/21190
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ie3aca0b0a104f9ca4589cde9643f3f341d4ff99f
Gerrit-Change-Number: 21190
Gerrit-PatchSet: 4
Gerrit-Owner: Zoltan Borok-Nagy <borokna...@cloudera.com>
Gerrit-Reviewer: Daniel Becker <daniel.bec...@cloudera.com>
Gerrit-Reviewer: Gabor Kaszab <gaborkas...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>