[ 
https://issues.apache.org/jira/browse/CALCITE-7371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xiong duan reassigned CALCITE-7371:
-----------------------------------

    Assignee: xiong duan

> Expand Correlate Constructor to Support Conditions and VariablesSet
> -------------------------------------------------------------------
>
>                 Key: CALCITE-7371
>                 URL: https://issues.apache.org/jira/browse/CALCITE-7371
>             Project: Calcite
>          Issue Type: Bug
>            Reporter: xiong duan
>            Assignee: xiong duan
>            Priority: Major
>
> Currently, in a Join, if there is a reference to LHS columns in the RHS, it 
> is directly converted to a Correlate.
> For example:
> {code:java}
> SELECT
>     d.deptno,
>     e.ename
> FROM dept d
> LEFT JOIN LATERAL (
>     SELECT
>         ename
>     FROM emp
>     WHERE emp.deptno = d.deptno
>       AND emp.job = 'MANAGER'
>  ) e {code}
> If there is a Correlated sub-query in the Join's Condition, it is converted 
> to a Join with variablesSet. For example:
> {code:java}
> SELECT empno
>       FROM emp AS e
>       LEFT JOIN dept AS d
>         ON d.dname = e.ename
>           AND (EXISTS (
>             SELECT e2.deptno FROM emp AS e2
>             WHERE e2.deptno = e.deptno
>             GROUP BY e2.deptno
>             HAVING SUM(e2.sal) > 1000000)); {code}
> I understand it as such, and the current code is implemented accordingly. 
> However, in the comments of the Join creation method, the definition of 
> variablesSet is consistent with the meaning of CorrelationId in Correlate 
> (Set variables that are set by the LHS and used by the RHS and are not 
> available to nodes above this Join in the tree). So I think it is a bug in 
> code comment.
> But when a query has both references to LHS columns in the RHS and additional 
> Correlated sub-queries in the condition, the current Calcite does not support 
> it.
> For example:
> {code:java}
> SELECT
>     d.deptno
> FROM dept d
> LEFT JOIN LATERAL (
>     SELECT
>         ename
>     FROM emp
>     WHERE emp.deptno = d.deptno
>       AND emp.job = 'MANAGER'
>  ) e ON EXISTS(SELECT * FROM bonus WHERE d.dname = e.ename) {code}
> After this PR, the corresponding logical plan can be represented as:
> {code:java}
> LogicalProject(DEPTNO=[$0], ENAME=[$3], SAL=[$4], DEPTNO0=[$5], JOB=[$6])
>   LogicalCorrelate(correlation=[$cor0], joinType=[left], 
> requiredColumns=[{0}], variablesSet=[$cor1], condition=[EXISTS({
> LogicalFilter(condition=[=($cor1.DNAME, CAST($cor1.ENAME):VARCHAR(14))])
>   JdbcTableScan(table=[[JDBC_SCOTT, BONUS]])
> })])
>     JdbcTableScan(table=[[JDBC_SCOTT, DEPT]])
>     LogicalProject(ENAME=[$1], SAL=[$5], DEPTNO=[$7], JOB=[$2])
>       LogicalFilter(condition=[AND(=($7, $cor0.DEPTNO), =($2, 'MANAGER'))])
>         JdbcTableScan(table=[[JDBC_SCOTT, EMP]]) {code}
> Additionally, the extended Correlate can handle references in the condition 
> well during the RelToSql process. First, construct the LeftResult, and 
> populate correlation variables based on the Context in LeftResult (these 
> variables may be referenced in RightResult; otherwise, an exception 
> indicating the variable name does not exist will be thrown). Then, construct 
> the RightResult, populate variablesSet variables based on the Context in 
> RightResult and LeftResult, and then construct the complete Correlate 
> condition, thereby constructing the Correlate.
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to