[ 
https://issues.apache.org/jira/browse/CALCITE-2969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16827499#comment-16827499
 ] 

Stamatis Zampetakis commented on CALCITE-2969:
----------------------------------------------

Thanks again for pushing this forward [~danny0405]!

At the moment, I have some rather high-level comments.

As far as I can see, the PR assumes that Join and Correlate are not the same, 
and I think it should remain like that for the moment. Given that they are 
different, a Join should never have correlated variables. This precondition 
should allow us to simplify the parts of the code that handle joins and 
perform various checks for correlated variables.

At first I thought that EnumerableCorrelate is (or at least should be renamed 
to) EnumerableNestedLoopJoin, but now I have second thoughts. Undeniably they 
are very close, but the fact that we are setting variables on one side of 
the join could help us classify it purely as a Correlate. We could have an 
EnumerableNestedLoopJoin which, rather than setting variables on one side of 
the join, performs the classic double for-loop, with the operator extracting 
the necessary join attributes from both sides of the join to apply the join 
condition. I have the impression that EnumerableThetaJoin is very close to the 
EnumerableNestedLoopJoin that I describe above.
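
To make the distinction concrete, here is a rough sketch of the two execution 
styles in plain Java. It does not use Calcite's classes: JoinStyles, correlate 
and nestedLoopJoin are hypothetical names made up for illustration, and rows 
are modelled as plain Integers. In the correlate style the right input is 
re-evaluated for each left row (the "variable" that is set); in the nested-loop 
style both inputs are independent and the condition is applied to each pair.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiPredicate;
import java.util.function.Function;
import java.util.stream.Collectors;

// Illustrative only: hypothetical names, not Calcite's actual operators.
final class JoinStyles {

  // Correlate style: the current left row acts as the correlation variable,
  // and the right side is re-evaluated for every left row.
  static <L, R> List<String> correlate(
      List<L> left, Function<L, List<R>> rightForLeftRow) {
    List<String> out = new ArrayList<>();
    for (L l : left) {
      for (R r : rightForLeftRow.apply(l)) {
        out.add(l + "," + r);
      }
    }
    return out;
  }

  // Nested-loop join style: a classic double for-loop over two independent
  // inputs; the operator itself applies the join condition to each pair.
  static <L, R> List<String> nestedLoopJoin(
      List<L> left, List<R> right, BiPredicate<L, R> condition) {
    List<String> out = new ArrayList<>();
    for (L l : left) {
      for (R r : right) {
        if (condition.test(l, r)) {
          out.add(l + "," + r);
        }
      }
    }
    return out;
  }

  public static void main(String[] args) {
    List<Integer> left = List.of(1, 2, 3);
    List<Integer> right = List.of(2, 3, 4);

    // Theta condition "l < r", evaluated by the join operator.
    System.out.println(nestedLoopJoin(left, right, (l, r) -> l < r));

    // Same condition pushed into the right side, which now depends on l.
    Function<Integer, List<Integer>> rightSide =
        l -> right.stream().filter(r -> l < r).collect(Collectors.toList());
    System.out.println(correlate(left, rightSide));
  }
}
```

Both calls produce the same pairs; the difference is only in where the 
dependency on the left row lives, which is the reason for keeping the two 
operators separate.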

To sum up, my suggestions are the following:
 * keep Join and Correlate separate and do not allow joins to have correlated 
variables;
 * retain EnumerableCorrelate without changes;
 * rename EnumerableThetaJoin to EnumerableNestedLoopJoin;

but let's see what the others have to say about this. I had a look at the PR 
and it seems that [~hyuan] and [~rubenql] also agree on this.

> Improve design of join-like relational expressions
> --------------------------------------------------
>
>                 Key: CALCITE-2969
>                 URL: https://issues.apache.org/jira/browse/CALCITE-2969
>             Project: Calcite
>          Issue Type: Improvement
>          Components: core
>    Affects Versions: 1.19.0
>            Reporter: Stamatis Zampetakis
>            Assignee: Danny Chan
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 5h 20m
>  Remaining Estimate: 0h
>
> The existing join-like (Join, SemiJoin, Correlate, etc.) logical and physical 
> relational expressions have a few design issues which make some parts of the 
> codebase complicated and difficult to understand.
> The goal of this ticket is to improve the design of the respective 
> expressions based on the discussion in the dev list (see thread [Join, 
> SemiJoin, 
> Correlate|https://mail-archives.apache.org/mod_mbox/calcite-dev/201903.mbox/%3C8EEA04A0-4A77-4283-BD20-B019E19AE126%40apache.org%3E]).
>   



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)