[ 
https://issues.apache.org/jira/browse/IMPALA-5602?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bikramjeet Vig resolved IMPALA-5602.
------------------------------------
       Resolution: Fixed
    Fix Version/s: Impala 2.10.0

https://github.com/cloudera/Impala/commit/730f584140f13fd7f7f748dc6779d5954270eb44

IMPALA-5602: Fix query optimization for kudu and datasource tables
Fix a bug where the following queries on Kudu and datasource tables
were incorrectly optimized as a 'small query' and therefore ran on a
single node with a single scanner thread:

(A) queries that have a limit and have all of their predicates pushed
to the underlying storage layer
(B) queries that meet the conditions in (A) and whose table stats are
missing

Testing:
Added frontend planner tests.

Change-Id: I93822d67ebda41d5d0456095c429e3915a3f40c4
Reviewed-on: http://gerrit.cloudera.org:8080/7560
Reviewed-by: Matthew Jacobs
Tested-by: Impala Public Jenkins

> All predicates pushed to Kudu with limit runs incorrectly as 'small query'
> --------------------------------------------------------------------------
>
>                 Key: IMPALA-5602
>                 URL: https://issues.apache.org/jira/browse/IMPALA-5602
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Frontend
>    Affects Versions: Impala 2.8.0
>            Reporter: Matthew Jacobs
>            Assignee: Bikramjeet Vig
>              Labels: kudu
>             Fix For: Impala 2.10.0
>
>
> When a Kudu scan has predicates pushed down to Kudu, the base implementation of
> {{ScanNode.getInputCardinality()}} returns the wrong value:
> {code}
>   @Override
>   public long getInputCardinality() {
>     if (getConjuncts().isEmpty() && hasLimit()) return getLimit();
>     return inputCardinality_;
>   }
> {code}
> {{getConjuncts()}} does not include the predicates pushed to Kudu. If such a
> query has a limit and no conjuncts are applied at the scan node itself, this
> function mistakenly returns the limit as the input cardinality.
> As a result, the query can run with the "small query" optimization when
> it should not, i.e. on a single node with a single scanner thread.
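
The behavior described in the report can be reproduced with a small standalone sketch. Everything below is hypothetical and simplified for illustration: `SmallQuerySketch`, `ScanNodeSketch`, its fields, and both method names are assumptions and are not Impala's actual classes. The "guarded" variant shows one plausible way to avoid the problem (also checking for predicates pushed to the storage layer) and is not necessarily what the referenced commit does.

```java
public class SmallQuerySketch {

    /** Hypothetical, simplified scan node; NOT Impala's actual ScanNode class. */
    static class ScanNodeSketch {
        // Number of predicates still evaluated at the scan node itself.
        private final int scanNodeConjuncts;
        // Number of predicates pushed down to the storage layer (e.g. Kudu).
        private final int pushedPredicates;
        // Row limit of the query; -1 means "no limit" (assumed convention).
        private final long limit;
        // Estimated number of rows the scan has to read.
        private final long inputCardinality;

        ScanNodeSketch(int scanNodeConjuncts, int pushedPredicates,
                       long limit, long inputCardinality) {
            this.scanNodeConjuncts = scanNodeConjuncts;
            this.pushedPredicates = pushedPredicates;
            this.limit = limit;
            this.inputCardinality = inputCardinality;
        }

        boolean hasLimit() {
            return limit > -1;
        }

        // Mirrors the logic quoted in the report: only the conjuncts kept at
        // the scan node are checked, so pushed-down predicates are invisible.
        long buggyInputCardinality() {
            if (scanNodeConjuncts == 0 && hasLimit()) return limit;
            return inputCardinality;
        }

        // Assumed correction: the limit is a valid bound on the rows read
        // only when no predicate is applied anywhere, including in storage.
        long guardedInputCardinality() {
            if (scanNodeConjuncts == 0 && pushedPredicates == 0 && hasLimit()) {
                return limit;
            }
            return inputCardinality;
        }
    }

    public static void main(String[] args) {
        // All predicates pushed to storage, LIMIT 10, large estimated scan.
        ScanNodeSketch scan = new ScanNodeSketch(0, 2, 10, 1_000_000L);
        System.out.println("buggy:   " + scan.buggyInputCardinality());
        System.out.println("guarded: " + scan.guardedInputCardinality());
    }
}
```

With two pushed-down predicates and a limit of 10, the buggy variant reports the limit as the input cardinality, which is small enough to make the query look like a candidate for single-node execution. The guarded variant keeps the scan's own estimate.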



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
