Zoltan Borok-Nagy has uploaded this change for review. ( http://gerrit.cloudera.org:8080/21190 )


Change subject: IMPALA-12894: Optimized count(*) for Iceberg gives wrong 
results after a Spark rewrite_data_files
......................................................................

IMPALA-12894: Optimized count(*) for Iceberg gives wrong results after a Spark 
rewrite_data_files

Impala can return incorrect results if a table has dangling delete
files. During analysis we check for the existence of delete files
based on the snapshot summary, but during planning in IcebergScanPlanner
we check based on planFiles(), where dangling delete files are not
counted. Because of this mismatch, Impala can create incorrect plans
for the count(*) optimization.

This patch fixes the FeIcebergTable.hasDeleteFiles() method, so it
ignores dangling delete files.
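
The mismatch between the two checks can be illustrated with a minimal,
self-contained sketch. This is not the Impala or Iceberg code: the
ScanTask class, the method names, and the file names are hypothetical
stand-ins, and the "total-delete-files" summary key is assumed to be
the counter that the analysis-time check consults.

```java
import java.util.List;
import java.util.Map;

public class DanglingDeleteSketch {
  // Hypothetical stand-in for one planned scan task: a data file plus
  // the delete files that actually apply to it.
  static final class ScanTask {
    final String dataFile;
    final List<String> deleteFiles;

    ScanTask(String dataFile, List<String> deleteFiles) {
      this.dataFile = dataFile;
      this.deleteFiles = deleteFiles;
    }
  }

  // Analysis-style check: trusts the counter in the snapshot summary,
  // which still includes dangling delete files.
  static boolean hasDeleteFilesFromSummary(Map<String, String> summary) {
    // Assumption: the summary stores the count under this key.
    return Long.parseLong(summary.getOrDefault("total-delete-files", "0")) > 0;
  }

  // Planning-style check: only delete files attached to a planned data
  // file count, so dangling delete files are ignored.
  static boolean hasDeleteFilesFromPlan(List<ScanTask> tasks) {
    return tasks.stream().anyMatch(t -> !t.deleteFiles.isEmpty());
  }

  public static void main(String[] args) {
    // After a rewrite, the summary still reports one delete file ...
    Map<String, String> summary = Map.of("total-delete-files", "1");
    // ... but the only planned data file has no delete files attached.
    List<ScanTask> tasks =
        List.of(new ScanTask("rewritten-data-0.parquet", List.of()));

    System.out.println(hasDeleteFilesFromSummary(summary)); // prints "true"
    System.out.println(hasDeleteFilesFromPlan(tasks));      // prints "false"
  }
}
```

When the two checks disagree like this, analysis and planning make
different assumptions about the table, which is the situation that
hasDeleteFiles() is meant to avoid by ignoring dangling delete files.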

TODO:
 * introduce a query option so we can completely disable the count(*) optimization

Testing:
 * e2e tests
 * planner tests

Change-Id: Ie3aca0b0a104f9ca4589cde9643f3f341d4ff99f
---
M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java
M fe/src/main/java/org/apache/impala/catalog/FeIcebergTable.java
M fe/src/main/java/org/apache/impala/planner/IcebergScanPlanner.java
M testdata/workloads/functional-planner/queries/PlannerTest/iceberg-v2-tables-hash-join.test
M testdata/workloads/functional-planner/queries/PlannerTest/iceberg-v2-tables.test
M testdata/workloads/functional-query/queries/QueryTest/iceberg-v2-read-position-deletes-orc.test
M testdata/workloads/functional-query/queries/QueryTest/iceberg-v2-read-position-deletes.test
7 files changed, 307 insertions(+), 430 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/90/21190/1
--
To view, visit http://gerrit.cloudera.org:8080/21190
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Ie3aca0b0a104f9ca4589cde9643f3f341d4ff99f
Gerrit-Change-Number: 21190
Gerrit-PatchSet: 1
Gerrit-Owner: Zoltan Borok-Nagy <borokna...@cloudera.com>
