Hello Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/21190

to look at the new patch set (#3).

Change subject: IMPALA-12894: Optimized count(*) for Iceberg gives wrong results after a Spark rewrite_data_files
......................................................................

IMPALA-12894: Optimized count(*) for Iceberg gives wrong results after a Spark rewrite_data_files

Impala can return incorrect results if a table has dangling delete
files. During analysis we check for the existence of delete files
based on the snapshot summary, but during planning IcebergScanPlanner
checks it based on planFiles(), which does not report dangling delete
files. Because of this mismatch, Impala can create incorrect plans for
the count(*) optimization.

This patch fixes the FeIcebergTable.hasDeleteFiles() method so that it
ignores dangling delete files. It also introduces a new query option,
"iceberg_disable_count_star_optimization", so users can completely
disable the statistics-based count(*) optimization if necessary.
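
The mismatch described above can be illustrated with a minimal,
self-contained sketch. It is not the code from this patch: the class
DanglingDeleteSketch, the ScanTask type, and both helper methods are
hypothetical stand-ins, and the sketch assumes the snapshot summary
exposes a "total-delete-files" counter. It only models why a check based
on the summary and a check based on the planned files can disagree when
the remaining delete files are dangling.

```java
import java.util.List;
import java.util.Map;

public class DanglingDeleteSketch {
    /** Hypothetical stand-in for one planned scan task: a data file plus its applicable delete files. */
    public static final class ScanTask {
        public final String dataFile;
        public final List<String> deleteFiles;

        public ScanTask(String dataFile, List<String> deleteFiles) {
            this.dataFile = dataFile;
            this.deleteFiles = deleteFiles;
        }
    }

    /** Summary-based check: any delete file counted in the snapshot summary, dangling or not. */
    public static boolean hasDeleteFilesFromSummary(Map<String, String> summary) {
        // Assumption: the summary carries a "total-delete-files" counter.
        return Long.parseLong(summary.getOrDefault("total-delete-files", "0")) > 0;
    }

    /** Plan-based check: only delete files that are attached to some data file count. */
    public static boolean hasDeleteFilesFromPlan(List<ScanTask> tasks) {
        return tasks.stream().anyMatch(t -> !t.deleteFiles.isEmpty());
    }

    public static void main(String[] args) {
        // After a data file rewrite, the summary may still count old delete files...
        Map<String, String> summary = Map.of("total-delete-files", "2");
        // ...while no planned data file has any delete file attached to it.
        List<ScanTask> tasks = List.of(
            new ScanTask("rewritten-00001.parquet", List.of()),
            new ScanTask("rewritten-00002.parquet", List.of()));

        System.out.println("summary-based: " + hasDeleteFilesFromSummary(summary));
        System.out.println("plan-based:    " + hasDeleteFilesFromPlan(tasks));
    }
}
```

In this model the summary-based check reports delete files while the
plan-based check does not. A hasDeleteFiles() that ignores dangling
delete files has to agree with the plan-based answer.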

Testing:
 * e2e tests
 * planner tests

Change-Id: Ie3aca0b0a104f9ca4589cde9643f3f341d4ff99f
---
M be/src/service/query-options.cc
M be/src/service/query-options.h
M common/thrift/ImpalaService.thrift
M common/thrift/Query.thrift
M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java
M fe/src/main/java/org/apache/impala/catalog/FeIcebergTable.java
M fe/src/main/java/org/apache/impala/planner/IcebergScanPlanner.java
M testdata/workloads/functional-planner/queries/PlannerTest/iceberg-v2-tables-hash-join.test
M testdata/workloads/functional-planner/queries/PlannerTest/iceberg-v2-tables.test
M testdata/workloads/functional-query/queries/QueryTest/iceberg-v2-read-position-deletes-orc.test
M testdata/workloads/functional-query/queries/QueryTest/iceberg-v2-read-position-deletes.test
11 files changed, 336 insertions(+), 433 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/90/21190/3
--
To view, visit http://gerrit.cloudera.org:8080/21190
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ie3aca0b0a104f9ca4589cde9643f3f341d4ff99f
Gerrit-Change-Number: 21190
Gerrit-PatchSet: 3
Gerrit-Owner: Zoltan Borok-Nagy <borokna...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
