[
https://issues.apache.org/jira/browse/IGNITE-16493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17494024#comment-17494024
]
Aleksey Plekhanov commented on IGNITE-16493:
--------------------------------------------
Looks like there are some problems in decorrelation procedure in Calcite, for
example, query:
{noformat}
CREATE TABLE integers(i INTEGER) WITH TEMPLATE=REPLICATED;
INSERT INTO integers VALUES (1), (2), (3), (NULL);
SELECT i, EXISTS(SELECT i FROM integers WHERE i IS NULL OR i>i1.i*10) FROM
integers i1;{noformat}
Produces plan like:
{noformat}
IgniteProject(I=[$2], EXPR$1=[IS NOT NULL($5)]): rowcount = 4.15, cumulative
cost = IgniteCost [rowCount=37.15, cpu=73.15, memory=28.0, io=0.0,
network=0.0], id = 3318
IgniteNestedLoopJoin(condition=[=($3, $4)], joinType=[left],
variablesSet=[[]]): rowcount = 4.15, cumulative cost = IgniteCost
[rowCount=33.0, cpu=69.0, memory=28.0, io=0.0, network=0.0], id = 3317
IgniteTableScan(table=[[PUBLIC, INTEGERS]], projects=[[$t0, $t1, $t2,
*($t2, 10)]], requiredColumns=[{0, 1, 2}]): rowcount = 4.0, cumulative cost =
IgniteCost [rowCount=4.0, cpu=4.0, memory=0.0, io=0.0, network=0.0], id = 1048
IgniteProject($f3=[$0], $f1=[true]): rowcount = 1.0, cumulative cost =
IgniteCost [rowCount=25.0, cpu=49.0, memory=20.0, io=0.0, network=0.0], id =
3316
IgniteSingleHashAggregate(group=[{0}]): rowcount = 1.0, cumulative cost =
IgniteCost [rowCount=24.0, cpu=48.0, memory=20.0, io=0.0, network=0.0], id =
3315
IgniteProject($f3=[$3]): rowcount = 2.0, cumulative cost = IgniteCost
[rowCount=22.0, cpu=46.0, memory=16.0, io=0.0, network=0.0], id = 3314
IgniteNestedLoopJoin(condition=[OR(IS NULL($2), >($2, $3))],
joinType=[inner], variablesSet=[[]]): rowcount = 2.0, cumulative cost =
IgniteCost [rowCount=20.0, cpu=44.0, memory=16.0, io=0.0, network=0.0], id =
3313
IgniteTableScan(table=[[PUBLIC, INTEGERS]]): rowcount = 4.0,
cumulative cost = IgniteCost [rowCount=4.0, cpu=4.0, memory=0.0, io=0.0,
network=0.0], id = 269
IgniteSingleHashAggregate(group=[{0}]): rowcount = 2.0, cumulative
cost = IgniteCost [rowCount=8.0, cpu=8.0, memory=8.0, io=0.0, network=0.0], id
= 3312
IgniteTableScan(table=[[PUBLIC, INTEGERS]], projects=[[*($t0,
10)]], requiredColumns=[{2}]): rowcount = 4.0, cumulative cost = IgniteCost
[rowCount=4.0, cpu=4.0, memory=0.0, io=0.0, network=0.0], id = 282{noformat}
In this plan {{IgniteNestedLoopJoin(condition=[=($3, $4)], joinType=[left])}}
filter out {{null}} values derived from the right hand and return wrong result
(\{{(null, false)}} instead of {{{}(null, true){}}})
> Calcite engine. Decorrelation after subquery rewrite
> ----------------------------------------------------
>
> Key: IGNITE-16493
> URL: https://issues.apache.org/jira/browse/IGNITE-16493
> Project: Ignite
> Issue Type: Improvement
> Reporter: Aleksey Plekhanov
> Priority: Major
> Labels: calcite, calcite2-required, calcite3-required
>
> Currently decorrelation is performed in {{SqlToRelConverter}}, but after this
> {{PlannerPhase.HEP_DECORRELATE}} planning phase is executed (which actually
> rewrites subqueries into correlates, but doesn't perform decorrelation). In
> some cases, other types of join can cost less than correlated nested loop
> join, so we can try to decorrelate the query plan again after
> {{PlannerPhase.HEP_DECORRELATE}} phase (call
> {{RelDecorrelator.decorrelateQuery}}).
> {{PlannerPhase.HEP_DECORRELATE}} should be also renamed to something like
> "HEP_REWRITE_SUBQUERY".
--
This message was sent by Atlassian Jira
(v8.20.1#820001)