[ 
https://issues.apache.org/jira/browse/CALCITE-4617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17353540#comment-17353540
 ] 

Julian Hyde commented on CALCITE-4617:
--------------------------------------

[~rubenql], No, I meant "left" in query 2. It is a different case because of 
the uniqueness of {{emp.deptno}}.

Now consider query 0,
{code:sql}
select d.deptno, empno
    from sales.dept d
    right join sales.emp e using (deptno)
    offset 1
{code}
which is basically the query that this case is about. Suppose that {{emp}} has 
(empno, deptno) rows (1, 30), (2, 10) and {{dept}} has (deptno) rows (20), (30).

The query should return either of the rows (20, null) and (30, 1) and (skipping 
the other row because of the offset). But with this transformation, it will 
skip the {{emp}} row (1, 30), and therefore may wrongly emit a result of (30, 
null).

In short, this transformation is not valid. I do agree that the bug fix makes 
an invalid transformation slightly better. But we haven't finished the 
analysis, so we're not "done".

I could be persuaded that this fix could go into 1.27 only if we have a valid 
test case.

> Wrong offset when SortJoinTransposeRule pushes a sort node with an offset
> -------------------------------------------------------------------------
>
>                 Key: CALCITE-4617
>                 URL: https://issues.apache.org/jira/browse/CALCITE-4617
>             Project: Calcite
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 1.26.0
>            Reporter: Sylvain Crozon
>            Assignee: Ruben Q L
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 1.27.0
>
>          Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> The SortJoinTransposeRule will push a sort node past a join, and then 
> duplicate the sort node on top of the join. When the sort node has an offset, 
> we should only apply it once, otherwise we end up skipping twice as many rows 
> as we should. The sort node added on top of the join should have a null 
> offset.
>  
> For example the testSortJoinTranspose6 test checks that for this initial plan
> {code}
> LogicalProject(DEPTNO=[$0], EMPNO=[$2])
>   LogicalSort(offset=[2], fetch=[10])
>     LogicalJoin(condition=[=($0, $9)], joinType=[right])
>       LogicalTableScan(table=[[CATALOG, SALES, DEPT]])
>       LogicalTableScan(table=[[CATALOG, SALES, EMP]])
> {code}
> the SortJoinTransposeRule should convert to
> {code}
> LogicalProject(DEPTNO=[$0], EMPNO=[$2])
>   LogicalSort(offset=[2], fetch=[10])
>     LogicalJoin(condition=[=($0, $9)], joinType=[right])
>       LogicalTableScan(table=[[CATALOG, SALES, DEPT]])
>       LogicalSort(offset=[2], fetch=[10])
>         LogicalTableScan(table=[[CATALOG, SALES, EMP]])
> {code}
> Which will result in applying the offset twice. Instead the LogicalSort on 
> top of the join should just have a null offset



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Reply via email to