[
https://issues.apache.org/jira/browse/CALCITE-559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14280861#comment-14280861
]
Jinfeng Ni edited comment on CALCITE-559 at 1/16/15 9:36 PM:
-------------------------------------------------------------
Maybe I miss something, but I just do not understand why after de-correlation,
the table inside the correlated subquery will be joined with both the two
tables in the outer query block. Essentially, the query looks like
{code}
E join D
where E.id = D.id and E.col2 < ( Select avg(E2.col2) from E2 where
E1.col1 = E2.col1);
{code}
The plan after de-correlation will have two appearances for E join D. The
second is the one pushed into subQ and joined with E2. In other words, if we
have n tables join in the outer query block, are we duplicating the n tables
join in the de-correlated plan? Performance-wise, is it optimal to duplicate
those n table joins in the subquery? Is it good enough to just push the
correlated table (E) into the subq, do the join?
{code}
LogicalJoin(condition=[true], joinType=[inner])
EnumerableTableScan(table=[[foodmart2, employee]])
EnumerableTableScan(table=[[foodmart2, department]])
LogicalAggregate(group=[{0}], EXPR$0=[AVG($1)])
LogicalProject($f0=[$1], salary=[$0])
LogicalProject(salary=[$11], $f0=[$17])
LogicalFilter(condition=[=($6, $17)])
LogicalJoin(condition=[true], joinType=[inner])
EnumerableTableScan(table=[[foodmart2, employee]])
LogicalAggregate(group=[{0}])
LogicalProject($f0=[$6])
LogicalJoin(condition=[true], joinType=[inner])
EnumerableTableScan(table=[[foodmart2, employee]])
EnumerableTableScan(table=[[foodmart2, department]])
{code}
Is it possible for Calcite's decorrelator to rewrite the query into (first
join E with E2, then join against D) ?
{code}
Select ..
From (
S E.id, ...
From E, (select E2.col1, avg(E2.col2)
From E2 Group by E2.col1) X (Col1, avgval)
where E.col1 = X.col1 and E.col2 < X.avgval
) Y,
D
Where Y.id = D.id;
{code}
was (Author: jni):
Maybe I miss something, but I just do not understand why after de-correlation,
the table inside the correlated subquery will be joined with both the two
tables in the outer query block. Essentially, the query looks like
{code}
E join D
where E.id = D.id and E.col2 < ( Select avg(E.col2) from E2 where E1.col1
= E2.col1);
{code}
The plan after de-correlation will have two appearances for E join D. The
second is the one pushed into subQ and joined with E2. In other words, if we
have n tables join in the outer query block, are we duplicating the n tables
join in the de-correlated plan? Performance-wise, is it optimal to duplicate
those n table joins in the subquery? Is it good enough to just push the
correlated table (E) into the subq, do the join?
{code}
LogicalJoin(condition=[true], joinType=[inner])
EnumerableTableScan(table=[[foodmart2, employee]])
EnumerableTableScan(table=[[foodmart2, department]])
LogicalAggregate(group=[{0}], EXPR$0=[AVG($1)])
LogicalProject($f0=[$1], salary=[$0])
LogicalProject(salary=[$11], $f0=[$17])
LogicalFilter(condition=[=($6, $17)])
LogicalJoin(condition=[true], joinType=[inner])
EnumerableTableScan(table=[[foodmart2, employee]])
LogicalAggregate(group=[{0}])
LogicalProject($f0=[$6])
LogicalJoin(condition=[true], joinType=[inner])
EnumerableTableScan(table=[[foodmart2, employee]])
EnumerableTableScan(table=[[foodmart2, department]])
{code}
Is it possible for Calcite's decorrelator to rewrite the query into (first
join E with E2, then join against D) ?
{code}
Select ..
From (
S E.id, ...
From E, (select E2.col1, avg(E2.col2)
From E2 Group by E2.col1) X (Col1, avgval)
where E.col1 = X.col1 and E.col2 < X.avgval
) Y,
D
Where Y.id = D.id;
{code}
> Correlated subquery will hit exception in Calcite
> -------------------------------------------------
>
> Key: CALCITE-559
> URL: https://issues.apache.org/jira/browse/CALCITE-559
> Project: Calcite
> Issue Type: Bug
> Reporter: Jinfeng Ni
> Assignee: Julian Hyde
>
> Calcite will throw exception, for a query with a correlated subquery. I
> tried both sqlline and as a Junit test case inJDBCTest. Both of them hit
> exception:
> 1. sqlline
> {code}
> sqlline> !connect
> jdbc:calcite:model=core/src/test/resources/hsqldb-foodmart-lattice-model.json
> admin admin
> 0: jdbc:calcite:model=core/src/test/resources> select e."employee_id",
> d."department_id" from "foodmart"."employee" e, "foodmart"."department" d
> where e."department_id" = d."department_id" and e."salary" > (select
> avg(e2."salary") from "foodmart"."employee" e2 where e2."store_id" =
> e."store_id");
> Error: exception while executing query: while executing SQL [SELECT "$f0"
> FROM (SELECT "employee"."store_id" AS "$f0"
> FROM "foodmart"."department"
> INNER JOIN "foodmart"."employee") AS "t"
> GROUP BY "$f0"] (state=,code=0)
> {code}
> 2. As a Junit testcase:
> {code}
> @Test public void testJoinCorreScalarSubQ()
> throws ClassNotFoundException, SQLException {
> CalciteAssert.that()
> .with(CalciteAssert.Config.FOODMART_CLONE)
> .query("select e.\"employee_id\", d.\"department_id\" "
> + " from \"employee\" e, \"department\" d "
> + " where e.\"department_id\" = d.\"department_id\" and "
> + " e.\"salary\" > (select avg(e2.\"salary\") "
> + " from \"employee\" e2 "
> + " where e2.\"store_id\" = e.\"store_id\")"
> ).returnsCount(0);
> }
> {code}
> Caused by: java.lang.RuntimeException: With materializationsEnabled=false,
> limit=0
> at
> org.apache.calcite.test.CalciteAssert.assertQuery(CalciteAssert.java:461)
> at
> org.apache.calcite.test.CalciteAssert$AssertQuery.returns(CalciteAssert.java:1159)
> ... 29 more
> Caused by: java.lang.AssertionError: Internal error: Error while applying
> rule EnumerableJoinRule, args
> [rel#160:LogicalJoin.NONE.[](left=rel#75:Subset#0.ENUMERABLE.[],right=rel#93:Subset#9.NONE.[],condition=AND(=($6,
> $17), >($11, $18)),joinType=inner)]
> at org.apache.calcite.util.Util.newInternal(Util.java:739)
> at
> org.apache.calcite.plan.volcano.VolcanoRuleCall.onMatch(VolcanoRuleCall.java:251)
> at ...
> ... 30 more
> Caused by: java.lang.AssertionError: Internal error: Error occurred while
> applying rule EnumerableJoinRule
> at org.apache.calcite.util.Util.newInternal(Util.java:739)
> at ... 44 more
> Caused by: java.lang.AssertionError: type mismatch:
> ref:
> DECIMAL(10, 4)
> input:
> DECIMAL(10, 4) NOT NULL
> at org.apache.calcite.plan.RelOptUtil.eq(RelOptUtil.java:1608)
> at org.apache.calcite.rex.RexChecker.visitInputRef(RexChecker.java:120)
> at .....
> {code}
> The sqlline run seems to hit exception during enumerable execution, while the
> Junit run seems to hit exception in planning phase. Probably, that's because
> sqlline by default does not turn on assertion check.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)