[
https://issues.apache.org/jira/browse/IMPALA-14993?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Zoltán Borók-Nagy resolved IMPALA-14993.
----------------------------------------
Fix Version/s: Impala 5.0.0
Resolution: Fixed
> Iceberg V2 count(*) optimization is incorrectly applied to queries without
> count(*), causing row loss
> -----------------------------------------------------------------------------------------------------
>
> Key: IMPALA-14993
> URL: https://issues.apache.org/jira/browse/IMPALA-14993
> Project: IMPALA
> Issue Type: Bug
> Components: Frontend
> Affects Versions: Impala 4.5.0
> Reporter: Dmitriy Maslov
> Assignee: Zoltán Borók-Nagy
> Priority: Major
> Labels: iceberg, impala-iceberg, impala-iceberg-active-backlog
> Fix For: Impala 5.0.0
>
>
> On Iceberg V2 tables that contain delete files, queries without {{count(*)}}
> in the select list (e.g. {{SELECT 1 FROM tbl}}) silently return fewer
> rows than they should.
> h3. Steps to reproduce
> {{CREATE TABLE ice1 (id INT, c1 INT)}}
> {{STORED AS ICEBERG TBLPROPERTIES ('format-version' = '2');}}
> {{INSERT INTO ice1 SELECT 1, 10;}}
> {{INSERT INTO ice1 SELECT 2, 20;}}
> {{DELETE FROM ice1 WHERE id = 1;}}
> {{SELECT 1 FROM ice1; -- expected: 1 row, actual: 0 rows}}
> h3. Root cause
> {{SelectStmt.optimizePlainCountStarQueryV2()}} decides to enable the
> optimization based on a loop that _rejects_ anything that is not {{count(*)}}
> or a constant - but never checks that at least one {{count(*)}} is actually
> present. For {{SELECT 1 FROM ice1}} the loop accepts the constant and falls
> through, setting {{tableRef.setOptimizeCountStarForIcebergV2(true)}}.
> h3. Proposed fix
> Guard the V2 method the same way the V1 method is guarded: add a
> {{hasCountStarFunc}} flag to {{optimizePlainCountStarQueryV2()}} in
> fe/src/main/java/org/apache/impala/analysis/SelectStmt.java, and return
> early when no {{count(*)}} was seen:
> {{boolean hasCountStarFunc = false;}}
> {{boolean alreadyOptimized = false;}}
> {{for (SelectListItem selectItem : getSelectList().getItems()) {}}
> {{ Expr expr = selectItem.getExpr();}}
> {{ if (expr == null) return;}}
> {{ if (expr.isConstant()) continue;}}
> {{ if (expr instanceof IcebergV2CountStarAccumulator) {}}
> {{ alreadyOptimized = true;}}
> {{ continue;}}
> {{ }}}
> {{ if (!FunctionCallExpr.isCountStarFunctionCallExpr(expr)) return;}}
> {{ hasCountStarFunc = true;}}
> {{}}}
> {{if (!hasCountStarFunc && !alreadyOptimized) return;}}
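The difference between the two loops can be tried in isolation with the sketch below. It is not Impala code: the class, the {{ItemKind}} enum and both method names are hypothetical stand-ins for the expression checks in {{SelectStmt}}, and it assumes the unguarded loop skips already-optimized items the same way the proposed one does.

```java
import java.util.List;

public class CountStarGuardSketch {
    // Hypothetical stand-in for the expression kinds the loop distinguishes.
    enum ItemKind { CONSTANT, COUNT_STAR, ALREADY_OPTIMIZED, OTHER }

    // Models the loop described under "Root cause": it only rejects
    // unsupported items and never requires a count(*) to be present.
    static boolean shouldOptimizeWithoutGuard(List<ItemKind> selectList) {
        for (ItemKind item : selectList) {
            if (item == ItemKind.CONSTANT) continue;
            if (item == ItemKind.ALREADY_OPTIMIZED) continue; // assumption, see lead-in
            if (item != ItemKind.COUNT_STAR) return false;
        }
        return true;
    }

    // Models the proposed fix: same loop, plus the hasCountStarFunc /
    // alreadyOptimized check after it.
    static boolean shouldOptimizeWithGuard(List<ItemKind> selectList) {
        boolean hasCountStarFunc = false;
        boolean alreadyOptimized = false;
        for (ItemKind item : selectList) {
            if (item == ItemKind.CONSTANT) continue;
            if (item == ItemKind.ALREADY_OPTIMIZED) {
                alreadyOptimized = true;
                continue;
            }
            if (item != ItemKind.COUNT_STAR) return false;
            hasCountStarFunc = true;
        }
        return hasCountStarFunc || alreadyOptimized;
    }

    public static void main(String[] args) {
        // SELECT 1 FROM ice1: a select list holding a single constant.
        List<ItemKind> selectOne = List.of(ItemKind.CONSTANT);
        System.out.println("without guard: " + shouldOptimizeWithoutGuard(selectOne));
        System.out.println("with guard:    " + shouldOptimizeWithGuard(selectOne));
    }
}
```

In this model a constant-only select list enables the optimization without the guard and leaves it disabled with the guard, while a list such as {{count(*), 1}} is accepted by both variants.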
--
This message was sent by Atlassian Jira
(v8.20.10#820010)