Riza Suminto created IMPALA-12790:
-------------------------------------

             Summary: ScanNode.getInputCardinality can overestimate if LIMIT is large.
                 Key: IMPALA-12790
                 URL: https://issues.apache.org/jira/browse/IMPALA-12790
             Project: IMPALA
          Issue Type: Bug
          Components: Frontend
            Reporter: Riza Suminto
            Assignee: Riza Suminto


The bug was first found in 
[https://gerrit.cloudera.org/c/20993/1/fe/src/main/java/org/apache/impala/planner/ScanNode.java#338]

For a simple scan query, ScanNode.getInputCardinality() can return a larger 
number than it should if the query has a LIMIT larger than the table 
cardinality. The bug is visible in the following test query when a low 
EXEC_SINGLE_NODE_ROWS_THRESHOLD option is set:
{code:java}
Section DISTRIBUTEDPLAN of query at line 451:
select * from functional_kudu.tinytable limit 1000;

Actual does not match expected result:
PLAN-ROOT SINK
|
01:EXCHANGE [UNPARTITIONED]
^^^^^^^^^^^^^^^^^^^^^^^^^^^
|  limit: 1000
|
00:SCAN KUDU [functional_kudu.tinytable]
   limit: 1000
   row-size=43B cardinality=3

Expected:
PLAN-ROOT SINK
|
00:SCAN KUDU [functional_kudu.tinytable]
   limit: 1000
   row-size=43B cardinality=3 {code}
The distributed plan should not have an EXCHANGE added, since this is a small 
query (cardinality=3) that can run on the coordinator only.
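
To illustrate the expected behavior, here is a minimal, self-contained sketch. It is not Impala's actual ScanNode code: the class ScanCardinalitySketch, its methods, and the threshold value of 100 are hypothetical names and numbers chosen for illustration, and the single-node check is simplified to one comparison. The idea is that when a scan has a LIMIT and no conjuncts, the input cardinality could be capped at the smaller of the LIMIT and the table cardinality, rather than taken from the LIMIT alone.

```java
// Hypothetical sketch; not the real org.apache.impala.planner.ScanNode.
final class ScanCardinalitySketch {

    // Estimated number of rows the scan reads.
    // Assumed conventions: tableCardinality < 0 means "unknown",
    // limit < 0 means "no LIMIT clause".
    static long inputCardinality(long tableCardinality, long limit, boolean hasConjuncts) {
        boolean hasLimit = limit >= 0;
        // With conjuncts or without a LIMIT, the LIMIT cannot bound the rows read.
        if (hasConjuncts || !hasLimit) return tableCardinality;
        // Unknown table cardinality: the LIMIT is the only available bound.
        if (tableCardinality < 0) return limit;
        // Never report more rows than the table is estimated to hold.
        return Math.min(limit, tableCardinality);
    }

    // Simplified stand-in for the small-query check driven by
    // EXEC_SINGLE_NODE_ROWS_THRESHOLD.
    static boolean fitsSingleNode(long inputCardinality, long threshold) {
        return inputCardinality >= 0 && inputCardinality <= threshold;
    }

    public static void main(String[] args) {
        long threshold = 100;       // assumed "low" threshold, for illustration only
        long tableCardinality = 3;  // cardinality=3 from the plan above
        long limit = 1000;          // limit: 1000 from the plan above

        long overestimate = limit;  // what using the LIMIT alone would report
        long capped = inputCardinality(tableCardinality, limit, false);

        System.out.println("LIMIT only: " + overestimate
            + ", single node: " + fitsSingleNode(overestimate, threshold));
        System.out.println("capped:     " + capped
            + ", single node: " + fitsSingleNode(capped, threshold));
    }
}
```

Under these assumptions, the capped estimate stays at or below the threshold, so the query would qualify for coordinator-only execution, whereas the LIMIT-only estimate would not.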



