Riza Suminto created IMPALA-12790:
-------------------------------------
Summary: ScanNode.getInputCardinality can overestimate if LIMIT is
large.
Key: IMPALA-12790
URL: https://issues.apache.org/jira/browse/IMPALA-12790
Project: IMPALA
Issue Type: Bug
Components: Frontend
Reporter: Riza Suminto
Assignee: Riza Suminto
The bug was first noticed in
[https://gerrit.cloudera.org/c/20993/1/fe/src/main/java/org/apache/impala/planner/ScanNode.java#338]
For a simple scan query, ScanNode.getInputCardinality() can return a larger number
than it should when the query has a LIMIT larger than the table cardinality. The
bug is visible in the following test query when a low
EXEC_SINGLE_NODE_ROWS_THRESHOLD option is set:
{code:java}
Section DISTRIBUTEDPLAN of query at line 451:
select * from functional_kudu.tinytable limit 1000;
Actual does not match expected result:
PLAN-ROOT SINK
|
01:EXCHANGE [UNPARTITIONED]
^^^^^^^^^^^^^^^^^^^^^^^^^^^
| limit: 1000
|
00:SCAN KUDU [functional_kudu.tinytable]
limit: 1000
row-size=43B cardinality=3
Expected:
PLAN-ROOT SINK
|
00:SCAN KUDU [functional_kudu.tinytable]
limit: 1000
row-size=43B cardinality=3 {code}
The distributed plan should not contain an EXCHANGE, since this is a small query
(cardinality=3) that can run on the coordinator only.
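
The overestimate described above can be illustrated with a minimal, standalone sketch. It is not the actual ScanNode code: the class InputCardinalitySketch, its method names, the threshold value of 100, and the convention that -1 means "unknown" or "no limit" are all assumptions made for illustration. It only shows why taking the LIMIT alone as the input cardinality overestimates for a 3-row table, and how capping at the table cardinality would keep the estimate small.

```java
// Illustrative sketch only; names and conventions here are hypothetical
// and are not taken from the Impala code base.
final class InputCardinalitySketch {
  // Overestimating variant: when a LIMIT is present, the LIMIT alone is
  // used as the estimate, even if the table has fewer rows.
  static long limitOnlyInputCardinality(long tableCardinality, long limit) {
    return limit >= 0 ? limit : tableCardinality;
  }

  // Capped variant: the scan cannot read more rows than the table holds,
  // so take the smaller of the two when both are known.
  // Assumption: -1 means "unknown cardinality" or "no limit".
  static long cappedInputCardinality(long tableCardinality, long limit) {
    if (limit < 0) return tableCardinality;
    if (tableCardinality < 0) return limit;
    return Math.min(tableCardinality, limit);
  }

  // Hypothetical stand-in for the single-node check: the query is treated
  // as small when its estimated input rows stay within the threshold.
  static boolean fitsSingleNode(long inputCardinality, long threshold) {
    return inputCardinality >= 0 && inputCardinality <= threshold;
  }

  public static void main(String[] args) {
    long tableRows = 3;    // cardinality=3 from the plan in this report
    long limit = 1000;     // LIMIT from the test query
    long threshold = 100;  // assumed "low" threshold, not from the report

    long limitOnly = limitOnlyInputCardinality(tableRows, limit);
    long capped = cappedInputCardinality(tableRows, limit);

    System.out.println(limitOnly + " " + fitsSingleNode(limitOnly, threshold)); // 1000 false
    System.out.println(capped + " " + fitsSingleNode(capped, threshold));       // 3 true
  }
}
```

With the limit-only estimate the query looks too large for single-node execution and an EXCHANGE is planned; with the capped estimate it stays within the threshold, which matches the expected plan.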