[ 
https://issues.apache.org/jira/browse/CALCITE-714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14639030#comment-14639030
 ] 

Aman Sinha commented on CALCITE-714:
------------------------------------

Pull request: 
https://github.com/apache/incubator-calcite/pull/110

I addressed the failures in the unit tests: the ones related to 
LogicalCorrelate still being present were addressed by having the 
FilterProjectTransposeRule check for presence of correlation in the Filter and 
if yes, not push below Project.  
The other plan changes in RelOptRulesTest were expected based on the previous 
patch and I had not updated them earlier. 
I also removed the classFiltersForCorrelation as suggested by Julian and used 
existing method. 

> Decorrelation does not push join condition into subquery
> --------------------------------------------------------
>
>                 Key: CALCITE-714
>                 URL: https://issues.apache.org/jira/browse/CALCITE-714
>             Project: Calcite
>          Issue Type: Bug
>    Affects Versions: 1.1.0-incubating
>            Reporter: Aman Sinha
>            Assignee: Aman Sinha
>             Fix For: next
>
>
> When decorrelating a scalar aggregate subquery,  we push copies of the tables 
> from the outer query block into the subquery but don't push the join 
> condition, thereby creating a cartesian join.  This seems to be a regression. 
> Query: 
> {code}
> select count(*) 
>     from dfs.`/Users/asinha/data/tpch-sf1/nation` n, 
> dfs.`/Users/asinha/data/tpch-sf1/orders` o 
>          where n.n_nationkey = o.o_orderkey 
>          and n.n_nationkey > (select avg(ps.ps_suppkey) from 
> dfs.`/Users/asinha/data/tpch-sf1/partsupp` ps
>                                where n.n_regionkey = ps.ps_partkey);
> {code}
> Here's the plan on an earlier version of Calcite (I am not sure of the 
> version number but it was before the decorrelation refactoring),  Note the 
> join between nation and orders at the leaf level has a equi-join condition.  
> {code} 
> AggregateRel(group=[{}], EXPR$0=[COUNT()])
>   ProjectRel($f0=[$0])
>     ProjectRel($f0=[0])
>       FilterRel(condition=[>($1, $5)])
>         ProjectRel(*=[$0], n_nationkey=[$1], n_regionkey=[$2], *0=[$3], 
> o_orderkey=[$4], EXPR$0=[$6])
>           JoinRel(condition=[=($2, $5)], joinType=[left])
>             JoinRel(condition=[=($1, $4)], joinType=[inner])
>               EnumerableTableAccessRel(table=[[dfs, 
> /Users/asinha/data/tpch-sf1/nation]])
>               EnumerableTableAccessRel(table=[[dfs, 
> /Users/asinha/data/tpch-sf1/orders]])
>             AggregateRel(group=[{0}], EXPR$0=[AVG($1)])
>               ProjectRel($f0=[$1], ps_suppkey=[$0])
>                 ProjectRel(ps_suppkey=[$2], $f0=[$3])
>                   FilterRel(condition=[=($3, $1)])
>                     JoinRel(condition=[true], joinType=[inner])
>                       EnumerableTableAccessRel(table=[[dfs, 
> /Users/asinha/data/tpch-sf1/partsupp]])
>                       AggregateRel(group=[{0}])
>                         ProjectRel($f0=[$2])
>                           JoinRel(condition=[=($1, $4)], joinType=[inner])
>                             EnumerableTableAccessRel(table=[[dfs, 
> /Users/asinha/data/tpch-sf1/nation]])
>                             EnumerableTableAccessRel(table=[[dfs, 
> /Users/asinha/data/tpch-sf1/orders]])
> {code}
> Here's the new plan (I am on version 1.1.0 but I think the plan has not 
> changed in the latest version).  Note the join between nation and orders at 
> the leaf level does not have any join condition. 
> {code}
> LogicalAggregate(group=[{}], EXPR$0=[COUNT()])
>   LogicalProject($f0=[$0])
>     LogicalProject($f0=[0])
>       LogicalFilter(condition=[AND(=($1, $4), >($1, $5))])
>         LogicalProject(*=[$0], n_nationkey=[$1], n_regionkey=[$2], *0=[$3], 
> o_orderkey=[$4], EXPR$0=[$6])
>           LogicalJoin(condition=[=($2, $5)], joinType=[left])
>             LogicalJoin(condition=[true], joinType=[inner])
>               EnumerableTableScan(table=[[dfs, 
> /Users/asinha/data/tpch-sf1/nation]])
>               EnumerableTableScan(table=[[dfs, 
> /Users/asinha/data/tpch-sf1/orders]])
>             LogicalAggregate(group=[{0}], EXPR$0=[AVG($1)])
>               LogicalProject(n_regionkey=[$1], ps_suppkey=[$0])
>                 LogicalProject(ps_suppkey=[$2], n_regionkey=[$3])
>                   LogicalFilter(condition=[=($3, $1)])
>                     LogicalJoin(condition=[true], joinType=[inner])
>                       EnumerableTableScan(table=[[dfs, 
> /Users/asinha/data/tpch-sf1/partsupp]])
>                       LogicalAggregate(group=[{0}])
>                         LogicalProject(n_regionkey=[$2])
>                           LogicalJoin(condition=[true], joinType=[inner])
>                             EnumerableTableScan(table=[[dfs, 
> /Users/asinha/data/tpch-sf1/nation]])
>                             EnumerableTableScan(table=[[dfs, 
> /Users/asinha/data/tpch-sf1/orders]])
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to