[ 
https://issues.apache.org/jira/browse/CALCITE-3376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16939342#comment-16939342
 ] 

Ruben Q L edited comment on CALCITE-3376 at 9/27/19 11:36 AM:
--------------------------------------------------------------

After some debugging, _I think I know_ what is going on. Taken the example from 
the description, during the optimization phase we have this situation:
 !Diagram.png|width=579,height=434! 
 Note that the Union is both left input of the Correlate and input of the 
Filter.
 At some point, when the relevant rules have been applied, the planner starts 
getting costs that are better than the initial ones (infinite), see 
{{RelSubset#propagateCostImprovements}}. As we can see in the attached file 
logsTRACE.txt, firstly it happens with the scan (rel#23):
{code:java}
Subset cost changed: subset [rel#23:Subset#0.ENUMERABLE.[0]] cost was {inf} now 
{14.0 rows, 15.0 cpu, 0.0 io}
{code}
Then, this cost change is propagated to its parent, i.e. the Union (rel#19):
{code:java}
Subset cost changed: subset [rel#19:Subset#1.ENUMERABLE.[]] cost was {inf} now 
{56.0 rows, 58.0 cpu, 0.0 io}
{code}
Then, it continues the propagation to the Union's parents, which are two: the 
Correlate and the Filter (both still with infinite cost). First it goes to the 
Correlate (rel#13), but when re-computing its cost, there is no change, it is 
still infinite because even though its left child (Union) has a valid cost, its 
right child (Filter) has infinite cost (yet). Next, the next Union's parent is 
processed: the Filter (rel#20), and its cost is improved:
{code:java}
Subset cost changed: subset [rel#20:Subset#3.ENUMERABLE.[]] cost was {inf} now 
{60.2 rows, 86.0 cpu, 0.0 io}
{code}
Finally, the propagation continues to the Filter's parent: the Correlate, so 
again its cost is (re)evaluated. Unlike the previous calculation, at this point 
the Correlate will have a valid cost, since both children (Union and Filter) 
have valid costs. However, when computing its cost, it is still infinite, that 
leads to the CannotPlanException. Why it is still infinite? The debugger shows 
that this second computation does not actually reach 
{{Correlate#computeSelfCost}}, in fact, it would seem that the first cost 
computation (infinite) was "cached" by the RelMetadataQuery, so it (wrongly) 
returns its cached value (infinite) when computing the Correlate cost the 
second time. Unfortunately, this seems a bad decision, because an actual 
computation of {{Correlate#computeSelfCost}} would have returned a valid cost 
value this time.


was (Author: rubenql):
After some debugging, _I think I know_ what is going on. Taken the example from 
the description, during the optimization phase we have this situation:
 !Diagram.png! 
Note that the Union is both left input of the Correlate and input of the Filter.
At some point, when the relevant rules have been applied, the planner starts 
getting costs that are better than the initial ones (infinite), see 
{{RelSubset#propagateCostImprovements}}. As we can see in the attached file 
logsTRACE.txt, firstly it happens with the scan (rel#23):
{code}
Subset cost changed: subset [rel#23:Subset#0.ENUMERABLE.[0]] cost was {inf} now 
{14.0 rows, 15.0 cpu, 0.0 io}
{code}
Then, this cost change is propagated to its parent, i.e. the Union (rel#19):
{code}
Subset cost changed: subset [rel#19:Subset#1.ENUMERABLE.[]] cost was {inf} now 
{56.0 rows, 58.0 cpu, 0.0 io}
{code}
Then, it continues the propagation to the Union's parents, which are two: the 
Correlate and the Filter (both still with infinite cost). First it goes to the 
Correlate (rel#13), but when re-computing its cost, there is no change, it is 
still infinite because even though its left child (Union) has a valid cost, its 
right child (Filter) has infinite cost (yet). Next, the next Union's parent is 
processed: the Filter (rel#20), and its cost is improved:
{code}
Subset cost changed: subset [rel#20:Subset#3.ENUMERABLE.[]] cost was {inf} now 
{60.2 rows, 86.0 cpu, 0.0 io}
{code}
Finally, the propagation continues to the Filter's parent: the Correlate, so 
again its cost is (re)evaluated. Unlike the previous calculation, at this point 
the Correlate will have a valid cost, since both children (Union and Filter) 
have valid costs. However, when computing its cost, it is still infinite, that 
leads to the CannotPlanException. Why it is still infinite? The debugger shows 
that this second computation does not actually reach 
{{Correlate#computeSelfCost}}, in fact, it would seem that the first cost 
computation (infinite) was "cached" by the RelMetadataQuery, so it (wrongly) 
returns its cached value (infinite) when computing the Correlate cost the 
second time. Unfortunately, this seems a bad decision, because an actual 
computation of {{Correlate#computeSelfCost}} would have returned a valid cost 
value this time.

> VolcanoPlanner CannotPlanException: best rel is null even though there is an 
> option with non-infinite cost
> ----------------------------------------------------------------------------------------------------------
>
>                 Key: CALCITE-3376
>                 URL: https://issues.apache.org/jira/browse/CALCITE-3376
>             Project: Calcite
>          Issue Type: Bug
>    Affects Versions: 1.21.0
>            Reporter: Ruben Q L
>            Priority: Major
>         Attachments: Diagram.png, Graphviz.png, logsTRACE.txt, stackTrace.txt
>
>
> The problem can be reproduced by adding this test to PlannerTest.java:
> {code:java}
>   @Test public void testCannotPlanException() throws Exception {
>     RelBuilder builder = RelBuilder.create(RelBuilderTest.config().build());
>     RuleSet ruleSet =
>         RuleSets.ofList(
>             //EnumerableRules.ENUMERABLE_JOIN_RULE, // with this rule it 
> works!
>             JoinToCorrelateRule.INSTANCE,
>             EnumerableRules.ENUMERABLE_CORRELATE_RULE,
>             EnumerableRules.ENUMERABLE_PROJECT_RULE,
>             EnumerableRules.ENUMERABLE_FILTER_RULE,
>             EnumerableRules.ENUMERABLE_SORT_RULE,
>             EnumerableRules.ENUMERABLE_UNION_RULE,
>             EnumerableRules.ENUMERABLE_TABLE_SCAN_RULE);
>     builder
>         .scan("EMP")
>         .scan("EMP")
>         .union(true)
>         .scan("EMP")
>         .scan("EMP")
>         .union(true)
>         .join(
>             JoinRelType.INNER,
>             builder.equals(
>                 builder.field(2, 0, "DEPTNO"),
>                 builder.field(2, 1, "EMPNO")));
>     RelNode relNode = builder.build();
>     RelOptPlanner planner = relNode.getCluster().getPlanner();
>     Program program = Programs.of(ruleSet);
>     RelTraitSet toTraits = relNode.getTraitSet()
>         .replace(EnumerableConvention.INSTANCE);
>     RelNode output = program.run(planner, relNode, toTraits,
>         ImmutableList.of(), ImmutableList.of());
>     String outputStr = toString(output);
>   }
> {code}
> Running this test causes the following exception (full stack trace attached):
> {code:java}
> org.apache.calcite.plan.RelOptPlanner$CannotPlanException: There are not 
> enough rules to produce a node with desired properties: 
> convention=ENUMERABLE, sort=[]. All the inputs have relevant nodes, however 
> the cost is still infinite.
> Root: rel#13:Subset#2.ENUMERABLE.[]
> {code}
> The last part of the message (_All the inputs have relevant nodes, however 
> the cost is still infinite_) seems relevant, because we can see that 
> {{rel#13}}'s best is {{null}}, and it should be {{rel#21}} (which has a 
> non-infinite cost):
> {code:java}
> rel#13:Subset#2.ENUMERABLE.[], best=null, importance=1.0
>     rel#14:AbstractConverter.ENUMERABLE.[](input=RelSubset#12, 
> convention=ENUMERABLE, sort=[]), rowcount=117.6, cumulative cost={inf}
>     rel#21:EnumerableCorrelate.ENUMERABLE.[](left=RelSubset#19, 
> right=RelSubset#20, correlation=$cor0, joinType=inner, requiredColumns={7}), 
> rowcount=1.0, cumulative cost={1770.6000000000001 rows, 2466.0 cpu, 0.0 io}
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Reply via email to