[
https://issues.apache.org/jira/browse/CALCITE-4542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17398769#comment-17398769
]
Konstantin Orlov commented on CALCITE-4542:
-------------------------------------------
Hi, [~rkondakov]!
I certainly will create the PR, but let's dig to the truth first.
I've done more research on the topic and have found that the root cause,
probably, the
org.apache.calcite.plan.volcano.TopDownRuleDriver#getOptimizeInputTask method.
In case the rel's traits don't satisfy the group's traits, we convert the rel
first, and then we schedule DeriveTask. The problem here that we cache the
converted rel in passThroughCache (not a problem actually), but then we replace
the original rel with converted one, so the task is scheduled for the converted
rel. And when the task is executed, it simply does nothing, because the
converted rel is in passThroughCache.
> Suboptimal plan is chosen when TopDownRuleDriver is enabled
> ------------------------------------------------------------
>
> Key: CALCITE-4542
> URL: https://issues.apache.org/jira/browse/CALCITE-4542
> Project: Calcite
> Issue Type: Bug
> Affects Versions: 1.26.0
> Reporter: Konstantin Orlov
> Priority: Major
> Attachments: dump.txt
>
>
> When TopDownRuleDriver is enabled, suboptimal plan is chosen for query with
> join.
> We have our own convention and implementation of all necessary relations. A
> distributed join is considered less expensive by our cost system than a
> single-distributed join, and the merge join is considered less expensive than
> nested loop if index over join condition is present. Nevertheless the merge
> join with a single distribution is chosen by optimizer.
> The query is:
> {code:java}
> select e1."empid", e1."deptno" from "emps" e1 join "emps" e2 on e1."empid" =
> e2."empid"
> {code}
> Actual plan is:
> {code:java}
> MyProject(subset=[rel#16:RelSubset#2.MY.single.[]], empid=[$0], deptno=[$1]):
> rowcount = 1500.0, cumulative cost = {1500.0 rows, 3000.0 cpu, 0.0 io}, id =
> 21
> MyMergeJoin(subset=[rel#20:RelSubset#1.MY.single.[]], condition=[=($0,
> $5)], joinType=[inner]): rowcount = 1500.0, cumulative cost = {150.0 rows,
> 0.0 cpu, 0.0 io}, id = 50
> MyExchange(subset=[rel#25:RelSubset#0.MY.single.[]],
> distribution=[single]): rowcount = 100.0, cumulative cost =
> {9210.340371976183 rows, 100.0 cpu, 0.0 io}, id = 30
> MyTableScan(subset=[rel#29:RelSubset#0.MY.any.[0]], table=[[hr,
> emps]]): rowcount = 100.0, cumulative cost = {100.0 rows, 101.0 cpu, 0.0 io},
> id = 27
> MyExchange(subset=[rel#25:RelSubset#0.MY.single.[]],
> distribution=[single]): rowcount = 100.0, cumulative cost =
> {9210.340371976183 rows, 100.0 cpu, 0.0 io}, id = 30
> MyTableScan(subset=[rel#29:RelSubset#0.MY.any.[0]], table=[[hr,
> emps]]): rowcount = 100.0, cumulative cost = {100.0 rows, 101.0 cpu, 0.0 io},
> id = 27
> {code}
> Expected plan is:
> {code:java}
> MyProject(subset=[rel#16:RelSubset#2.MY.single.[]], empid=[$0], deptno=[$1]):
> rowcount = 1500.0, cumulative cost = {1500.0 rows, 3000.0 cpu, 0.0 io}, id =
> 21
> MyExchange(subset=[rel#20:RelSubset#1.MY.single.[]],
> distribution=[single]): rowcount = 1500.0, cumulative cost =
> {18420.680743952365 rows, 100.0 cpu, 0.0 io}, id = 24
> MyMergeJoin(subset=[rel#23:RelSubset#1.MY.any.[]], condition=[=($0, $5)],
> joinType=[inner]): rowcount = 1500.0, cumulative cost = {0.15 rows, 0.0 cpu,
> 0.0 io}, id = 50
> MyTableScan(subset=[rel#26:RelSubset#0.MY.hash[0].[0]], table=[[hr,
> emps]]): rowcount = 100.0, cumulative cost = {100.0 rows, 101.0 cpu, 0.0 io},
> id = 25
> MyTableScan(subset=[rel#26:RelSubset#0.MY.hash[0].[0]], table=[[hr,
> emps]]): rowcount = 100.0, cumulative cost = {100.0 rows, 101.0 cpu, 0.0 io},
> id = 25
> {code}
> Planner [^dump.txt] doesn't contain the join with proper distribution.
> Reproducer could be found
> [here|https://github.com/korlov42/calcite/tree/derive-not-being-called-repoducer].
> Please run {{org.apache.calcite.tools.PlannerTest#test}}
--
This message was sent by Atlassian Jira
(v8.3.4#803005)