Zoltan Borok-Nagy has uploaded this change for review. ( http://gerrit.cloudera.org:8080/21190 )
Change subject: IMPALA-12894: Optimized count(*) for Iceberg gives wrong results after a Spark rewrite_data_files
......................................................................

IMPALA-12894: Optimized count(*) for Iceberg gives wrong results after a Spark rewrite_data_files

Impala can return incorrect results if a table has dangling delete
files. During analysis we check the existence of delete files based
on the snapshot summary. But during planning in IcebergScanPlanner
we do it based on planFiles(), i.e. dangling delete files don't count
in the latter case. Because of this Impala can create incorrect plans
for count(*) optimization.

This patch fixes the FeIcebergTable.hasDeleteFiles() method, so
it ignores dangling delete files.

TODO:
 * introduce query option so we can completely disable the
   count(*) optimization

Testing:
 * e2e tests
 * planner tests

Change-Id: Ie3aca0b0a104f9ca4589cde9643f3f341d4ff99f
---
M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java
M fe/src/main/java/org/apache/impala/catalog/FeIcebergTable.java
M fe/src/main/java/org/apache/impala/planner/IcebergScanPlanner.java
M testdata/workloads/functional-planner/queries/PlannerTest/iceberg-v2-tables-hash-join.test
M testdata/workloads/functional-planner/queries/PlannerTest/iceberg-v2-tables.test
M testdata/workloads/functional-query/queries/QueryTest/iceberg-v2-read-position-deletes-orc.test
M testdata/workloads/functional-query/queries/QueryTest/iceberg-v2-read-position-deletes.test
7 files changed, 307 insertions(+), 430 deletions(-)

  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/90/21190/1
--
To view, visit http://gerrit.cloudera.org:8080/21190
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Ie3aca0b0a104f9ca4589cde9643f3f341d4ff99f
Gerrit-Change-Number: 21190
Gerrit-PatchSet: 1
Gerrit-Owner: Zoltan Borok-Nagy <borokna...@cloudera.com>
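
The mismatch described in the commit message, between a summary-based check and a plan-based check for delete files, can be illustrated with a small standalone sketch. This is not the code from the patch: the class `DanglingDeleteSketch`, its nested `DeleteFile` type, and both helper methods are hypothetical, and the table is modelled with plain Java collections rather than Impala or Iceberg classes. The sketch assumes that a "dangling" delete file is one that refers only to data files that are no longer live, and that the summary carries a `total-delete-files` counter.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical, simplified model. None of these types are Impala or Iceberg classes.
class DanglingDeleteSketch {

  // A position delete file, reduced to the set of data file paths it refers to.
  static final class DeleteFile {
    final String path;
    final Set<String> referencedDataFiles;

    DeleteFile(String path, Set<String> referencedDataFiles) {
      this.path = path;
      this.referencedDataFiles = referencedDataFiles;
    }
  }

  // Summary-based check: counts every delete file the snapshot summary
  // reports, whether or not it still applies to a live data file.
  static boolean hasDeleteFilesFromSummary(Map<String, String> snapshotSummary) {
    String total = snapshotSummary.getOrDefault("total-delete-files", "0");
    return Long.parseLong(total) > 0;
  }

  // Plan-based check: a delete file counts only if it references at least
  // one data file that is still live in the current snapshot.
  static boolean hasDeleteFilesFromPlan(Set<String> liveDataFiles,
                                        List<DeleteFile> deleteFiles) {
    for (DeleteFile deleteFile : deleteFiles) {
      for (String dataFile : deleteFile.referencedDataFiles) {
        if (liveDataFiles.contains(dataFile)) {
          return true;
        }
      }
    }
    return false;
  }

  public static void main(String[] args) {
    // After a rewrite, only the compacted data file is live ...
    Set<String> liveDataFiles =
        new HashSet<>(Arrays.asList("data/compacted-0.parquet"));
    // ... but a delete file pointing at a replaced data file is still listed.
    List<DeleteFile> deleteFiles = Collections.singletonList(
        new DeleteFile("data/delete-0.parquet",
            Collections.singleton("data/old-0.parquet")));
    Map<String, String> summary =
        Collections.singletonMap("total-delete-files", "1");

    System.out.println("summary-based: " + hasDeleteFilesFromSummary(summary));
    System.out.println("plan-based:    " + hasDeleteFilesFromPlan(liveDataFiles, deleteFiles));
  }
}
```

In this toy scenario the two checks disagree (the summary-based one reports delete files, the plan-based one does not), which is the kind of inconsistency between analysis and planning that the commit message attributes the wrong count(*) plans to.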