[
https://issues.apache.org/jira/browse/CALCITE-2973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16839965#comment-16839965
]
Lai Zhou edited comment on CALCITE-2973 at 5/15/19 7:26 AM:
------------------------------------------------------------
[~rubenql], an inner join with a remainCondition is no longer converted to an
inner join plus a filter; the Enumerable(Hash)Join can handle it in a generic
way.
However, some tests now fail, so removing the filter may have introduced a bug.
I'll look into the failures.
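For illustration only, here is a minimal sketch of what "handling it in a generic way" could look like: a hash join on the equi keys that applies the remaining (non-equi) predicate to each candidate pair, so no separate filter is needed on top. The class name HashJoinSketch and the parameter names are hypothetical; this is not Calcite's actual Enumerable(Hash)Join code, and it ignores SQL NULL semantics for keys.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiPredicate;
import java.util.function.Function;

/** Hypothetical sketch; not the Calcite implementation. */
final class HashJoinSketch {
  private HashJoinSketch() {}

  /**
   * Inner hash join on an equi key. The "remaining" predicate plays the
   * role of the remainCondition: it is checked for every pair whose equi
   * keys already match, instead of in a filter above the join.
   */
  static <L, R, K> List<Map.Entry<L, R>> join(
      List<L> left, List<R> right,
      Function<L, K> leftKey, Function<R, K> rightKey,
      BiPredicate<L, R> remaining) {
    // Build phase: index the right input by its equi-join key.
    Map<K, List<R>> index = new HashMap<>();
    for (R r : right) {
      index.computeIfAbsent(rightKey.apply(r), k -> new ArrayList<>()).add(r);
    }
    // Probe phase: look up each left row, then apply the residual predicate.
    List<Map.Entry<L, R>> out = new ArrayList<>();
    for (L l : left) {
      for (R r : index.getOrDefault(leftKey.apply(l), List.of())) {
        if (remaining.test(l, r)) {
          out.add(Map.entry(l, r));
        }
      }
    }
    return out;
  }
}
```

With rows modelled as int[]{key, value}, a call such as HashJoinSketch.join(left, right, l -> l[0], r -> r[0], (l, r) -> l[1] > r[1]) corresponds to a join with one equi condition and one theta condition. When there are no equi keys at all (condition=[true], as in the plans below), every row lands in the same bucket and the work degenerates to a nested loop.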
[~zabetak], [~rubenql]
I found a problem with the inner join. Please check out the current codebase and
see the test case JdbcAdapterTest.testScalarSubQuery, which runs this non-equi
join SQL:
{code:java}
CalciteAssert.model(JdbcTest.SCOTT_MODEL)
    .query("SELECT COUNT(ename) AS cEname FROM \"SCOTT\".\"EMP\" "
        + "WHERE DEPTNO > (SELECT deptno FROM \"SCOTT\".\"DEPT\" "
        + "WHERE dname = 'ACCOUNTING')")
    .enable(CalciteAssert.DB == CalciteAssert.DatabaseInstance.HSQLDB)
    .returns("CENAME=11\n");
{code}
Before, the generated plan was:
{code:java}
EnumerableAggregate(group=[{}], CENAME=[COUNT($1)])
  EnumerableCalc(expr#0..2=[{inputs}], expr#3=[>($t2, $t0)], proj#0..2=[{exprs}], $condition=[$t3])
    EnumerableJoin(condition=[true], joinType=[inner], remainCondition=[>($2, $0)])
      EnumerableAggregate(group=[{}], agg#0=[SINGLE_VALUE($0)])
        JdbcToEnumerableConverter
          JdbcFilter(condition=[=($1, 'ACCOUNTING')])
            JdbcTableScan(table=[[SCOTT, DEPT]])
      JdbcToEnumerableConverter
        JdbcProject(ENAME=[$1], DEPTNO=[$7])
          JdbcTableScan(table=[[SCOTT, EMP]])
{code}
After replacing the filter with a remainCondition, the planner picks a semi-join
based plan as the best plan, but it is a bad plan:
{code:java}
EnumerableAggregate(group=[{}], CENAME=[COUNT($0)])
  EnumerableSemiJoin(condition=[true], joinType=[inner])
    JdbcToEnumerableConverter
      JdbcProject(ENAME=[$1], DEPTNO=[$7])
        JdbcTableScan(table=[[SCOTT, EMP]])
    JdbcToEnumerableConverter
      JdbcFilter(condition=[=($1, 'ACCOUNTING')])
        JdbcTableScan(table=[[SCOTT, DEPT]])
{code}
The condition of this logical SemiJoin is true, and the DEPTNO > deptno
comparison does not appear anywhere in the plan.
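To make the concern concrete, the sketch below contrasts the two join shapes on plain Java lists. It is only an illustration under stated assumptions: the DEPTNO values are taken from the standard SCOTT sample data (14 employees, ACCOUNTING = 10), which I assume matches the HSQLDB test database, and the class and method names are hypothetical. It does not run the Calcite plans themselves.

```java
import java.util.List;

/** Hypothetical illustration; does not execute any Calcite plan. */
final class SemiJoinTrueSketch {
  private SemiJoinTrueSketch() {}

  // EMP.DEPTNO for each employee (assumption: standard SCOTT sample data).
  static final List<Integer> EMP_DEPTNO =
      List.of(20, 30, 30, 20, 30, 30, 10, 20, 10, 30, 20, 30, 20, 10);

  // Result of the scalar sub-query on DEPT (assumption: ACCOUNTING is 10).
  static final List<Integer> ACCOUNTING_DEPTNO = List.of(10);

  /** Inner join on true, keeping only pairs where EMP.DEPTNO > DEPT.DEPTNO. */
  static long innerJoinWithRemainCondition() {
    return EMP_DEPTNO.stream()
        .flatMap(e -> ACCOUNTING_DEPTNO.stream().filter(d -> e > d))
        .count();
  }

  /** Semi-join on true: a left row survives if the right side is non-empty. */
  static long semiJoinOnTrue() {
    return EMP_DEPTNO.stream()
        .filter(e -> ACCOUNTING_DEPTNO.stream().anyMatch(d -> true))
        .count();
  }

  public static void main(String[] args) {
    System.out.println(innerJoinWithRemainCondition()); // prints 11
    System.out.println(semiJoinOnTrue());               // prints 14
  }
}
```

Under these assumptions the first shape gives 11, which is what the test expects (CENAME=11), while a semi-join whose only condition is true keeps every EMP row and gives 14. That would be consistent with the comparison having been lost when the semi-join was created, although the sketch alone does not show where in the planner that happens.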
Could you help me identify this problem?
was (Author: hhlai1990):
[~rubenql], an inner join with a remainCondition is no longer converted to an
inner join plus a filter; the Enumerable(Hash)Join can handle it in a generic
way.
However, some tests now fail, so dropping the filter may have introduced a bug.
I'll look into the failures.
> Allow theta joins that have equi conditions to be executed using a hash join
> algorithm
> --------------------------------------------------------------------------------------
>
> Key: CALCITE-2973
> URL: https://issues.apache.org/jira/browse/CALCITE-2973
> Project: Calcite
> Issue Type: New Feature
> Components: core
> Affects Versions: 1.19.0
> Reporter: Lai Zhou
> Priority: Minor
> Labels: pull-request-available
> Fix For: 1.20.0
>
> Time Spent: 3h 50m
> Remaining Estimate: 0h
>
> Currently the EnumerableMergeJoinRule only supports inner equi-joins.
> If users run a theta-join query over a large dataset (such as 10000*10000),
> the nested-loop join takes dozens of times longer than a sort-merge
> join.
> So if we can apply the merge-join or hash-join rule to a theta join, it will
> improve performance greatly.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)