[
https://issues.apache.org/jira/browse/CALCITE-2973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16818940#comment-16818940
]
Stamatis Zampetakis commented on CALCITE-2973:
----------------------------------------------
In terms of code reuse, it seems more natural to handle only the equality
part of the condition in the join and leave the remaining condition to be
applied afterwards. As Julian mentioned, when outer joins are involved the
filter cannot be applied after the join, but I have the impression that a
projection could achieve the same result (i.e., nullify the left/right part
when a certain condition holds). An additional benefit is that if we could
break a theta join into an equi-join plus filter/projection (using a rule),
this could be exploited by more users.
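
To make the decomposition concrete, here is a minimal sketch for the inner
join case only, where applying the remaining condition as a filter after the
equi-join is safe. It is not Calcite code: the class EquiJoinThenFilter, its
methods, and the int[]{key, value} row layout are all assumptions made up for
illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

/** Hypothetical illustration, not Calcite code. Input rows are int[]{key, value}. */
public class EquiJoinThenFilter {

  /** Hash join on column 0 of both inputs; output rows are {lKey, lVal, rKey, rVal}. */
  public static List<int[]> hashEquiJoin(List<int[]> left, List<int[]> right) {
    // Build phase: index the right input by its equi-join key.
    Map<Integer, List<int[]>> index = new HashMap<>();
    for (int[] r : right) {
      index.computeIfAbsent(r[0], k -> new ArrayList<>()).add(r);
    }
    // Probe phase: look up each left row and emit one row per key match.
    List<int[]> out = new ArrayList<>();
    for (int[] l : left) {
      for (int[] r : index.getOrDefault(l[0], Collections.<int[]>emptyList())) {
        out.add(new int[] {l[0], l[1], r[0], r[1]});
      }
    }
    return out;
  }

  /** Inner theta join expressed as an equi-join followed by a filter. */
  public static List<int[]> innerThetaJoin(
      List<int[]> left, List<int[]> right, Predicate<int[]> remaining) {
    List<int[]> out = new ArrayList<>();
    for (int[] row : hashEquiJoin(left, right)) {
      if (remaining.test(row)) {
        out.add(row);
      }
    }
    return out;
  }

  public static void main(String[] args) {
    List<int[]> left =
        Arrays.asList(new int[] {1, 10}, new int[] {1, 30}, new int[] {2, 5});
    List<int[]> right =
        Arrays.asList(new int[] {1, 20}, new int[] {2, 1}, new int[] {3, 7});
    // Join condition: l.key = r.key AND l.value < r.value
    for (int[] row : innerThetaJoin(left, right, row -> row[1] < row[3])) {
      System.out.println(Arrays.toString(row));
    }
  }
}
```

With the sample data in main, the equi-join yields three rows and only
(1, 10, 1, 20) survives the filter. The same filter placed after an outer
join would wrongly drop the unmatched rows, which is the case the projection
idea is meant to address.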
In terms of semantics, having the join operator do all the work is more
intuitive and the plan is easier to understand, so in the end I haven't made
up my mind which approach is best.
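
For comparison, a minimal sketch of the alternative in which the join
operator does all the work: it hashes on the equality key and evaluates the
remaining condition while probing, so a left outer join can null-pad a left
row when none of its key matches passes. Again, this is not Calcite code: the
class HashJoinWithResidual, the method leftJoin, and the row layout are
hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiPredicate;

/** Hypothetical illustration, not Calcite code. Input rows are int[]{key, value}. */
public class HashJoinWithResidual {

  /**
   * Left outer join on l[0] == r[0] AND remaining(l, r).
   * Output rows are {lKey, lVal, rKey, rVal}, with nulls on the right side
   * when no right row satisfies the full condition.
   */
  public static List<Integer[]> leftJoin(
      List<int[]> left, List<int[]> right, BiPredicate<int[], int[]> remaining) {
    // Build phase: index the right input by its equi-join key.
    Map<Integer, List<int[]>> index = new HashMap<>();
    for (int[] r : right) {
      index.computeIfAbsent(r[0], k -> new ArrayList<>()).add(r);
    }
    List<Integer[]> out = new ArrayList<>();
    for (int[] l : left) {
      boolean matched = false;
      // Probe phase: only rows sharing the key are tested against the rest.
      for (int[] r : index.getOrDefault(l[0], Collections.<int[]>emptyList())) {
        if (remaining.test(l, r)) {
          out.add(new Integer[] {l[0], l[1], r[0], r[1]});
          matched = true;
        }
      }
      // Outer-join semantics: keep the left row even if nothing matched.
      if (!matched) {
        out.add(new Integer[] {l[0], l[1], null, null});
      }
    }
    return out;
  }

  public static void main(String[] args) {
    List<int[]> left =
        Arrays.asList(new int[] {1, 10}, new int[] {1, 30}, new int[] {2, 5});
    List<int[]> right =
        Arrays.asList(new int[] {1, 20}, new int[] {2, 1}, new int[] {3, 7});
    // Join condition: l.key = r.key AND l.value < r.value
    for (Integer[] row : leftJoin(left, right, (l, r) -> l[1] < r[1])) {
      System.out.println(Arrays.toString(row));
    }
  }
}
```

With the sample data in main, the left rows (1, 30) and (2, 5) have a key
match but fail the remaining condition, so they come out null-padded. A
filter applied after a plain left equi-join would have dropped them instead.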
> Allow theta joins that have equi conditions to be executed using a hash join
> algorithm
> --------------------------------------------------------------------------------------
>
> Key: CALCITE-2973
> URL: https://issues.apache.org/jira/browse/CALCITE-2973
> Project: Calcite
> Issue Type: New Feature
> Components: core
> Affects Versions: 1.19.0
> Reporter: Lai Zhou
> Priority: Minor
> Labels: pull-request-available
> Time Spent: 10m
> Remaining Estimate: 0h
>
> Currently the EnumerableMergeJoinRule only supports inner equi-joins.
> If users run a theta-join query on a large dataset (such as 10000*10000),
> the nested-loop join will take dozens of times longer than a sort-merge
> join.
> So if we can apply the merge-join or hash-join rule to a theta join, it
> will improve performance greatly.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)