[jira] [Commented] (CALCITE-2973) Make EnumerableMergeJoinRule to support a theta join

Julian Hyde (JIRA) Mon, 01 Apr 2019 19:52:04 -0700


    [ 
https://issues.apache.org/jira/browse/CALCITE-2973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16807372#comment-16807372
 ]


Julian Hyde commented on CALCITE-2973:
--------------------------------------

I agree. Merge join is well suited to joins with theta conditions that are 
based on ranges. (Hash joins are good with equality but terrible with ranges.) 
For example, consider the following query:

{code}select *
from orders
join shipments
on shipments.shipDate
  between orders.orderDate + interval '1' day
  and orders.orderDate + interval '2' day{code}

An efficient execution plan would sort {{orders}} on {{orderDate}} and 
{{shipments}} on {{shipDate}}, then merge each row of {{orders}} against the 
1-day range of {{shipments}} that satisfies the condition. This is a 
generalization of merge join.
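
The plan described above can be sketched roughly as follows. This is a minimal illustration in Python, not Calcite code: the function name {{range_merge_join}}, the tuple layout, and the {{lo}}/{{hi}} parameters are all hypothetical. It assumes both inputs fit in memory and that the lower and upper offsets are the same for every order, so the matching window only ever moves forward as {{orderDate}} increases.

```python
from collections import deque
from datetime import date, timedelta


def range_merge_join(orders, shipments,
                     lo=timedelta(days=1), hi=timedelta(days=2)):
    """Pair each order with shipments whose ship date lies in
    [order_date + lo, order_date + hi].

    orders:    iterable of (order_id, order_date)      -- hypothetical layout
    shipments: iterable of (shipment_id, ship_date)    -- hypothetical layout
    """
    # Sort both inputs on their join keys, as a merge join would.
    orders = sorted(orders, key=lambda o: o[1])
    shipments = sorted(shipments, key=lambda s: s[1])

    window = deque()   # shipments currently inside the sliding range
    next_idx = 0       # next shipment not yet admitted to the window
    result = []

    for order_id, order_date in orders:
        low, high = order_date + lo, order_date + hi

        # Admit shipments up to the upper bound of this order's range.
        while next_idx < len(shipments) and shipments[next_idx][1] <= high:
            window.append(shipments[next_idx])
            next_idx += 1

        # Evict shipments below the lower bound. Orders are ascending,
        # so an evicted shipment cannot match any later order.
        while window and window[0][1] < low:
            window.popleft()

        # Everything left in the window satisfies the BETWEEN condition.
        for shipment_id, _ in window:
            result.append((order_id, shipment_id))

    return result


if __name__ == "__main__":
    sample_orders = [(1, date(2019, 4, 1)), (2, date(2019, 4, 2))]
    sample_shipments = [("a", date(2019, 4, 1)), ("b", date(2019, 4, 2)),
                        ("c", date(2019, 4, 3)), ("d", date(2019, 4, 4))]
    print(range_merge_join(sample_orders, sample_shipments))
```

Each shipment enters and leaves the window at most once, so after sorting, the cost is linear in the two inputs plus the number of output rows, whereas a nested-loop join compares every pair.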

> Make EnumerableMergeJoinRule to support a theta join
> ----------------------------------------------------
>
>                 Key: CALCITE-2973
>                 URL: https://issues.apache.org/jira/browse/CALCITE-2973
>             Project: Calcite
>          Issue Type: New Feature
>          Components: core
>    Affects Versions: 1.19.0
>            Reporter: Lai Zhou
>            Priority: Minor
>
> Currently, EnumerableMergeJoinRule supports only inner equi-joins.
> If users run a theta-join query on a large dataset (such as 10000*10000), 
> the nested-loop join takes dozens of times longer than a sort-merge 
> join would.
> So if we can apply the merge-join or hash-join rule to a theta join, it will 
> improve performance greatly.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
