[ https://issues.apache.org/jira/browse/CALCITE-2909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16808443#comment-16808443 ]
Ruben Quesada Lopez commented on CALCITE-2909:
----------------------------------------------
[~sereda], thanks for taking care of the PR. I guess this ticket can now be
closed as resolved, right?
> Optimize Enumerable SemiJoin with lazy computation of innerLookup
> -----------------------------------------------------------------
>
> Key: CALCITE-2909
> URL: https://issues.apache.org/jira/browse/CALCITE-2909
> Project: Calcite
> Issue Type: Improvement
> Affects Versions: 1.18.0
> Reporter: Ruben Quesada Lopez
> Assignee: Ruben Quesada Lopez
> Priority: Major
> Labels: pull-request-available
> Time Spent: 3h
> Remaining Estimate: 0h
>
> The implementation of semiJoin in EnumerableDefaults.java is based on two
> elements: an outer enumerator and an inner lookup. The method "returns
> elements of outer for which there is a member of inner with a matching key".
> In order to achieve that, the innerLookup is always eagerly computed, even
> though in some cases it might not be necessary at all: when the outer
> enumerator returns no elements, there is no need for the innerLookup. In a
> worst-case scenario, a time-consuming innerLookup computation combined with
> an empty outer enumerator leads to an inefficient execution, which could
> have been easily avoided.
> In order to improve that, it is proposed to delay the computation of the
> innerLookup until the moment when we are sure that it will be really needed,
> i.e. when the first outer enumerator item is processed.
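
The idea proposed in the description can be illustrated with a minimal,
standalone sketch. It is not the Calcite code and does not use Calcite's
Enumerable/Lookup types: the class name LazySemiJoinSketch, the method
signature, and the use of a plain Supplier and HashSet in place of the
innerLookup are all assumptions made only for illustration. The set of inner
keys is built when the first outer element is processed, so an empty outer
input never triggers the inner computation.

```java
import java.util.HashSet;
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.Set;
import java.util.function.Function;
import java.util.function.Supplier;

/** Hypothetical sketch of a semi-join with a lazily computed inner lookup. */
public final class LazySemiJoinSketch {
  private LazySemiJoinSketch() {}

  /**
   * Returns the elements of outer for which inner has a member with a
   * matching key. The inner supplier stands in for the (possibly expensive)
   * inner computation and is invoked at most once, on the first outer item.
   */
  public static <O, I, K> Iterator<O> semiJoin(
      Iterator<O> outer,
      Supplier<? extends Iterable<I>> inner,
      Function<? super O, ? extends K> outerKey,
      Function<? super I, ? extends K> innerKey) {
    return new Iterator<O>() {
      private Set<K> innerKeys;   // null until the first outer item is seen
      private O pending;          // next matching element, if any
      private boolean hasPending;

      @Override public boolean hasNext() {
        while (!hasPending && outer.hasNext()) {
          O candidate = outer.next();
          if (innerKeys == null) {
            // Lazy step: build the lookup only now that it is really needed.
            innerKeys = new HashSet<>();
            for (I item : inner.get()) {
              innerKeys.add(innerKey.apply(item));
            }
          }
          if (innerKeys.contains(outerKey.apply(candidate))) {
            pending = candidate;
            hasPending = true;
          }
        }
        return hasPending;
      }

      @Override public O next() {
        if (!hasNext()) {
          throw new NoSuchElementException();
        }
        hasPending = false;
        O result = pending;
        pending = null;
        return result;
      }
    };
  }
}
```

With an empty outer iterator, hasNext() returns false without ever calling
inner.get(); with a non-empty outer iterator, inner.get() is called exactly
once, regardless of how many outer elements follow.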
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)