[
https://issues.apache.org/jira/browse/IMPALA-5671?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tim Armstrong updated IMPALA-5671:
----------------------------------
Priority: Trivial (was: Major)
> Union node may evaluate all children even if limit is reached
> -------------------------------------------------------------
>
> Key: IMPALA-5671
> URL: https://issues.apache.org/jira/browse/IMPALA-5671
> Project: IMPALA
> Issue Type: Improvement
> Components: Backend
> Affects Versions: Impala 2.9.0
> Reporter: Henry Robinson
> Priority: Trivial
>
> The loop inside {{UnionNode::GetNextMaterialized()}} does not break if the
> limit has been reached. See
> [here|https://github.com/apache/incubator-impala/blob/master/be/src/exec/union-node.cc#L193].
> The loop exits only when the children are exhausted or the current row
> batch becomes full.
> Consider a union node with a limit of 1 and two children: the first is very
> cheap to evaluate and returns one row, the second is very expensive. The
> union node will try to fill an entire row batch and end up waiting on the
> second child, even though it could finish after reading one row from the
> first child.
> The result is a query that takes much longer to complete than it should.
> Here's an example:
> {code}
> WITH l AS (SELECT 1 FROM functional.alltypes GROUP BY month),
>      r AS (SELECT count(*) FROM lineitem a CROSS JOIN lineitem b)
> SELECT * FROM l UNION ALL (SELECT * FROM r) LIMIT 2
> {code}
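
The behaviour described in the report can be illustrated with a small stand-alone model. This is only a sketch, not the code in union-node.cc: the `Child` struct, the `FillBatch` function, and the `check_limit` flag are hypothetical names invented for this example, and rows are plain integers. With `check_limit` off, the loop stops only on child exhaustion or a full batch, so the second child is pulled even though the limit was already satisfied. With it on, the loop also stops once the limit is reached.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a child exec node (not Impala's real interface).
// Each call to next() yields one row, or returns false once exhausted.
struct Child {
    std::vector<int> rows;
    std::size_t pos = 0;
    int pulls = 0;  // number of times this child was asked for a row

    bool next(int* row) {
        ++pulls;
        if (pos == rows.size()) return false;
        *row = rows[pos++];
        return true;
    }
};

// Fills one batch from the children in order. The loop always stops when the
// children are exhausted or the batch is full. When check_limit is true it
// also stops as soon as `limit` rows have been collected.
std::vector<int> FillBatch(std::vector<Child>& children,
                           std::size_t batch_capacity,
                           std::size_t limit,
                           bool check_limit) {
    std::vector<int> batch;
    for (Child& child : children) {
        // Skip remaining children entirely once the limit is satisfied.
        if (check_limit && batch.size() >= limit) break;
        int row = 0;
        while (batch.size() < batch_capacity &&
               !(check_limit && batch.size() >= limit) &&
               child.next(&row)) {
            batch.push_back(row);
        }
        if (batch.size() == batch_capacity) break;
    }
    // The limit is applied to the output in both modes; only the amount of
    // work done on the children differs.
    if (batch.size() > limit) batch.resize(limit);
    return batch;
}
```

Calling `FillBatch` with a limit of 1 on a one-row first child and a larger second child returns the same single row in both modes. The `pulls` counter on the second child shows the difference: it stays at zero only when `check_limit` is true, which is the case where an expensive child would never be evaluated.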
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)