[ https://issues.apache.org/jira/browse/IMPALA-5671?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tim Armstrong updated IMPALA-5671:
----------------------------------
    Priority: Trivial  (was: Major)

> Union node may evaluate all children even if limit is reached
> -------------------------------------------------------------
>
>                 Key: IMPALA-5671
>                 URL: https://issues.apache.org/jira/browse/IMPALA-5671
>             Project: IMPALA
>          Issue Type: Improvement
>          Components: Backend
>    Affects Versions: Impala 2.9.0
>            Reporter: Henry Robinson
>            Priority: Trivial
>
> The loop inside {{UnionNode::GetNextMaterialized()}} does not break when the 
> limit has been reached. See 
> [here|https://github.com/apache/incubator-impala/blob/master/be/src/exec/union-node.cc#L193].
> The loop exits only when the children are exhausted or the current row 
> batch becomes full.
> Consider a union node with a limit of 1 and two children: the first is very 
> cheap to evaluate and returns one row, and the second is very expensive. 
> The union node tries to fill an entire row batch, so it ends up waiting on 
> the second child even though it could finish after reading one row from 
> the first child.
> The result is a query that takes much longer to complete than it should. 
> Here's an example:
> {code}
> WITH l AS (SELECT 1 FROM functional.alltypes GROUP BY month),
>      r AS (SELECT count(*) FROM lineitem a CROSS JOIN lineitem b)
> SELECT * FROM l UNION ALL (SELECT * FROM r) LIMIT 2
> {code}
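
The behavior described above can be illustrated with a minimal, self-contained sketch. It is a toy model, not Impala's actual code: Child, FillBatch, and the check_limit flag are hypothetical names introduced only for illustration, and "evaluating" a child is reduced to setting a flag. With check_limit set to false the loop stops only when the children are exhausted or the batch is full, as the issue reports. With check_limit set to true it also stops once the limit is reached.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical stand-in for a child operator. It yields `rows` rows and
// records whether the union ever asked it for data.
struct Child {
  explicit Child(int r) : rows(r) {}
  int rows;
  bool evaluated = false;
};

// Toy model of a materializing union loop that fills one row batch.
// `limit` < 0 means "no limit". Returns the number of rows the node would
// return for this batch after the limit is applied.
int FillBatch(std::vector<Child>& children, int limit, int batch_capacity,
              bool check_limit) {
  assert(batch_capacity > 0);
  int rows_in_batch = 0;
  for (Child& child : children) {
    // Exit condition present in both variants: the batch is full.
    if (rows_in_batch >= batch_capacity) break;
    // Extra exit condition (assumed fix): the limit is already satisfied.
    if (check_limit && limit >= 0 && rows_in_batch >= limit) break;

    // In a real engine this is the potentially expensive step.
    child.evaluated = true;
    rows_in_batch += std::min(child.rows, batch_capacity - rows_in_batch);
  }
  // The limit is applied to the output either way; the variants differ only
  // in how many children were evaluated to get there.
  return limit >= 0 ? std::min(rows_in_batch, limit) : rows_in_batch;
}
```

Called with two one-row children, a limit of 1, and a batch capacity of 1024, both variants return one row, but only the check_limit variant leaves the second child unevaluated.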


