[ 
https://issues.apache.org/jira/browse/CALCITE-2973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16839965#comment-16839965
 ] 

Lai Zhou edited comment on CALCITE-2973 at 5/15/19 7:26 AM:
------------------------------------------------------------

[~rubenql], an inner join with a remainCondition is no longer converted to an 
inner join plus a filter; the Enumerable(Hash)Join can handle it in a generic 
way.

But some tests now fail, so removing the filter may have introduced a bug. 
I'll look into the problems.
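
To make the idea concrete, here is a minimal sketch of a hash join that applies a remaining (non-equi) condition while probing. It is illustrative only and is not the Enumerable(Hash)Join code from the pull request; the class name HashJoinWithResidual, the method join, and the sample rows are all hypothetical. It also ignores SQL null semantics for the join keys.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiPredicate;
import java.util.function.Function;

// Hypothetical sketch; not Calcite's implementation.
final class HashJoinWithResidual {
  /** Inner join on equal keys; a matched pair is kept only if it passes the residual predicate. */
  static <L, R, K> List<Map.Entry<L, R>> join(
      List<L> left, List<R> right,
      Function<L, K> leftKey, Function<R, K> rightKey,
      BiPredicate<L, R> residual) {
    // Build phase: index the right input by its equi-join key.
    Map<K, List<R>> index = new HashMap<>();
    for (R r : right) {
      index.computeIfAbsent(rightKey.apply(r), k -> new ArrayList<>()).add(r);
    }
    // Probe phase: look up candidates by key, then test the non-equi condition.
    List<Map.Entry<L, R>> out = new ArrayList<>();
    for (L l : left) {
      for (R r : index.getOrDefault(leftKey.apply(l), List.of())) {
        if (residual.test(l, r)) {
          out.add(Map.entry(l, r));
        }
      }
    }
    return out;
  }

  public static void main(String[] args) {
    // Rows are {key, value}; the data is made up for illustration.
    List<int[]> left = List.of(new int[] {1, 10}, new int[] {1, 30}, new int[] {2, 50});
    List<int[]> right = List.of(new int[] {1, 20}, new int[] {2, 40}, new int[] {3, 5});
    // Equi condition: left.key = right.key; remaining condition: left.value > right.value.
    List<Map.Entry<int[], int[]>> rows =
        join(left, right, l -> l[0], r -> r[0], (l, r) -> l[1] > r[1]);
    System.out.println(rows.size());
  }
}
```

The point of the sketch is that the non-equi condition is only evaluated for pairs that already matched on the hash key, so no separate filter is needed above the join.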

[~zabetak],[~rubenql]

I found a problem with the inner join. Please check out the current codebase 
and see the test case JdbcAdapterTest.testScalarSubQuery, which runs this 
non-equi join SQL:
{code:java}
CalciteAssert.model(JdbcTest.SCOTT_MODEL)
    .query("SELECT COUNT(ename) AS cEname FROM \"SCOTT\".\"EMP\" "
        + "WHERE DEPTNO > (SELECT deptno FROM \"SCOTT\".\"DEPT\" "
        + "WHERE dname = 'ACCOUNTING')")
    .enable(CalciteAssert.DB == CalciteAssert.DatabaseInstance.HSQLDB)
    .returns("CENAME=11\n");
{code}
Before, the generated plan was:

 
{code:java}
EnumerableAggregate(group=[{}], CENAME=[COUNT($1)])
  EnumerableCalc(expr#0..2=[{inputs}], expr#3=[>($t2, $t0)], proj#0..2=[{exprs}], $condition=[$t3])
    EnumerableJoin(condition=[true], joinType=[inner], remainCondition=[>($2, $0)])
      EnumerableAggregate(group=[{}], agg#0=[SINGLE_VALUE($0)])
        JdbcToEnumerableConverter
          JdbcFilter(condition=[=($1, 'ACCOUNTING')])
            JdbcTableScan(table=[[SCOTT, DEPT]])
      JdbcToEnumerableConverter
        JdbcProject(ENAME=[$1], DEPTNO=[$7])
          JdbcTableScan(table=[[SCOTT, EMP]])
{code}
 

After replacing the filter with a remainCondition, the planner picks a 
semi-join based plan as the best plan, but it is a bad plan:

 
{code:java}
EnumerableAggregate(group=[{}], CENAME=[COUNT($0)])
  EnumerableSemiJoin(condition=[true], joinType=[inner])
    JdbcToEnumerableConverter
      JdbcProject(ENAME=[$1], DEPTNO=[$7])
        JdbcTableScan(table=[[SCOTT, EMP]])
    JdbcToEnumerableConverter
      JdbcFilter(condition=[=($1, 'ACCOUNTING')])
        JdbcTableScan(table=[[SCOTT, DEPT]])
{code}
The condition of this logical SemiJoin is true, so the non-equi predicate is 
no longer applied anywhere in the plan.

Could you help me identify the cause of this problem?
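
To show why a semi-join whose condition is true cannot stand in for the original comparison, here is a small self-contained sketch. It only models the semantics with in-memory lists and makes no claim about which planner rule produced the plan; the class SemiJoinSketch, its semiJoin method, and the DEPTNO values are hypothetical.

```java
import java.util.List;
import java.util.function.BiPredicate;
import java.util.stream.Collectors;

// Hypothetical sketch of semi-join semantics; not Calcite code.
final class SemiJoinSketch {
  /** Keeps each left row for which at least one right row satisfies the condition. */
  static <L, R> List<L> semiJoin(List<L> left, List<R> right, BiPredicate<L, R> condition) {
    return left.stream()
        .filter(l -> right.stream().anyMatch(r -> condition.test(l, r)))
        .collect(Collectors.toList());
  }

  public static void main(String[] args) {
    // Made-up DEPTNO values for EMP rows, and a single-row scalar sub-query result.
    List<Integer> empDeptnos = List.of(10, 20, 20, 30);
    List<Integer> subQuery = List.of(10);

    // With the comparison EMP.DEPTNO > sub-query value: the row with 10 is filtered out.
    System.out.println(semiJoin(empDeptnos, subQuery, (e, d) -> e > d).size()); // prints 3

    // With condition=true: every EMP row survives as long as the right side is non-empty.
    System.out.println(semiJoin(empDeptnos, subQuery, (e, d) -> true).size()); // prints 4
  }
}
```

With a true condition the semi-join keeps every left row whenever the right input is non-empty, so the COUNT above it would be computed over unfiltered rows.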

 


was (Author: hhlai1990):
[~rubenql], an inner join with a remainCondition is no longer converted to an 
inner join plus a filter; the Enumerable(Hash)Join can handle it in a generic 
way.

But some tests now fail, so dropping the filter may have introduced a bug. 
I'll look into the problems.

> Allow theta joins that have equi conditions to be executed using a hash join 
> algorithm
> --------------------------------------------------------------------------------------
>
>                 Key: CALCITE-2973
>                 URL: https://issues.apache.org/jira/browse/CALCITE-2973
>             Project: Calcite
>          Issue Type: New Feature
>          Components: core
>    Affects Versions: 1.19.0
>            Reporter: Lai Zhou
>            Priority: Minor
>              Labels: pull-request-available
>             Fix For: 1.20.0
>
>          Time Spent: 3h 50m
>  Remaining Estimate: 0h
>
> Currently, EnumerableMergeJoinRule only supports inner equi-joins.
> If users run a theta-join query on a large dataset (such as 10000*10000), 
> the nested-loop join takes dozens of times longer than a sort-merge join.
> So if we can apply the merge-join or hash-join rule to a theta join, it will 
> improve performance greatly.



