[ 
https://issues.apache.org/jira/browse/CALCITE-5401?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ruben Q L updated CALCITE-5401:
-------------------------------
    Description: 
TLDR; {{CoreRules.AGGREGATE_REMOVE}} is fired by a {{HepPlanner}} but while 
removing the Aggregate, instead of returning the Aggregate's input, it returns 
a VolcanoPlanner's RelSubset with the input, which leads to unforeseeable 
consequences.

 

Details: This seems a strange issue that happens because several factors occur.

I first reproduced it on my application with the following query (on TPCH):
{code:sql}
SELECT c.c_custkey
FROM customer c
WHERE c.c_name IN ('271', '272', '273', '274', '275', '276', '342', '343', 
'344', '345','346', '347', '348', '349', '350', '351',  '352', '353', '354',  
'355', '356', '357', '358', '359', '360')
AND EXISTS (SELECT 1 FROM orders o WHERE o.o_custkey = c.c_custkey)
{code}
But the issue can be reproduced also in Calcite by adding this test into 
{{{}HepPlannerTest{}}}:
{code:java}
  @Test void testAggregateRemove() {
    final RelBuilder builder = RelBuilderTest.createBuilder(c -> 
c.withAggregateUnique(true));
    final RelNode root =
        builder
            .values(new String[]{"i"}, 1, 2, 3) // important to have values 
sorted
            .distinct()
            .build();
    final HepProgram program = new HepProgramBuilder()
        .addRuleInstance(CoreRules.AGGREGATE_REMOVE)
        .build();
    final HepPlanner planner = new HepPlanner(program);
    planner.setRoot(root);
    final RelNode result = planner.findBestExp();
    assertThat(result, is(instanceOf(LogicalValues.class))); // fails because 
result is a RelSubset
  }
{code}
The important elements are: firstly our {{RelOptCluster}} has a 
{{VolcanoPlanner}} as planner (so any {{relNode.getCluster().getPlanner()}} 
call that we execute will return a {{VolcanoPlanner}} instance). Nevertheless 
we also apply some rules via a {{{}HepPlanner{}}}. I think this is a quite 
common strategy in Calcite clients to obtain a better performance: first apply 
a subset of rules that are always beneficial via a {{{}HepPlanner{}}}, and then 
apply the "main" set of rules via the cost-based {{{}VolcanoPlanner{}}}.

Secondly, we have {{{}AggregateRemoveRule{}}}, which we use in the 
{{HepPlanner}} phase.
This rule contains the following code:
{code:java}
  @Override public void onMatch(RelOptRuleCall call) {
    final Aggregate aggregate = call.rel(0);
    final RelNode input = aggregate.getInput();
    ...
    final RelNode newInput = convert(input, 
aggregate.getTraitSet().simplify()); // <-- *** [1]
    relBuilder.push(newInput);
    ...
    call.getPlanner().prune(aggregate);  // <-- *** [2]
    call.transformTo(relBuilder.build());
  }
{code}
Notice the line [2] which uses {{call.getPlanner()}} to call the {{prune}} 
method. By using {{call.getPlanner()}} we get the correct planner of the rule 
that is being fired, in this case a {{{}HepPlanner{}}}, so we end up calling 
{{{}HepPlanner#prune{}}}, which is fine.
However, the line [1] calls the {{RelOptRule#convert}} static method:
{code:java}
  public static RelNode convert(RelNode rel, RelTraitSet toTraits) {
    RelOptPlanner planner = rel.getCluster().getPlanner(); // <-- *** [3]
    RelTraitSet outTraits = rel.getTraitSet();
    for (int i = 0; i < toTraits.size(); i++) {
      RelTrait toTrait = toTraits.getTrait(i);
      if (toTrait != null)
        outTraits = outTraits.replace(i, toTrait);
    }
    if (rel.getTraitSet().matches(outTraits))
      return rel;
    return planner.changeTraits(rel, outTraits); // <-- *** [4]
  }
{code}
Notice how in this case, the planner is obtained from the relNode's cluster 
[3], in our case that would be the {{{}VolcanoPlanner{}}}, which is potentially 
problematic. Further down, if the relNode matches the {{{}outTraits{}}}, no 
action is done and the same relNode is returned, no problem here. But, if it 
does not match them, then {{RelOptPlanner#changeTraits}} will be called, i.e. 
{{VolcanoPlanner#changeTraits}} [4], and this is where the problem will 
originate: in our scenario {{VolcanoPlanner#changeTraits}} will return a 
Volcano's {{{}RelSubset{}}}, which is completely unhandable by the 
{{HepPlanner}} that triggered the rule, and that leads to the incorrect plan 
returned by the {{{}HepPlanner{}}}.

In this case, what happens with our original query ({{{}LogicalValues{}}} with 
sorted values), we get to {{RelOptRule#convert}} with the RelNode being a 
{{LogicalValues}} with {{Convention.NONE}} + {{{}Collation[0]{}}}, and the 
{{toTraits}} are the ones from the {{LogicalAggregate}} that we are removing: 
{{Convention.NONE}} + {{Collation[]}} . Since the traits from the 
{{LogicalValues}} do not match the LogicalAgggregate traits ({{{}Collation[0] 
!= Collation[]{}}}), the {{RelOptPlanner#changeTraits}} is called and the 
problem occurs. I am not sure why here {{RelTraitSet#matches}} is used (which 
computes an exact match, hence returning false), rather than 
{{{}RelTraitSet#satisfies{}}}, which would have returned true, because a sorted 
{{LogicalValues}} ({{{}Collation[0]{}}}) satisfies the unsorted 
{{{}Collation[]{}}}, but I assume there is a reason for that.

As a workaround, if the {{LogicalValues}} elements are NOT in order, then the 
problem is avoided: we deal with a {{LogicalValues}} with {{Collation[]}} , so 
inside {{RelOptRule#convert}} the {{LogicalValues}} traits ({{{}Convention.NONE 
+ Collation[]{}}}) match the {{LogicalAggregates}} ones ({{{}Convention.NONE + 
Collation[]{}}}), so the method returns without calling 
{{{}RelOptPlanner#changeTraits{}}}, so the problem does not happen. This can be 
confirmed by modifying the proposed test:
{code:java}
  @Test void testAggregateRemoveOk() {
    final RelBuilder builder = RelBuilderTest.createBuilder(c -> 
c.withAggregateUnique(true));
    final RelNode root =
        builder
            .values(new String[]{"i"}, 1, 42, 3) // not sorted
            .distinct()
            .build();
    final HepProgram program = new HepProgramBuilder()
        .addRuleInstance(CoreRules.AGGREGATE_REMOVE)
        .build();
    final HepPlanner planner = new HepPlanner(program);
    planner.setRoot(root);
    final RelNode result = planner.findBestExp();
    assertThat(result, is(instanceOf(LogicalValues.class))); // ok
  }
{code}

  was:
TLDR; {{CoreRules.AGGREGATE_REMOVE}} is fired by a {{HepPlanner}} but while 
removing the Aggregate, instead of returning the Aggregate's input, it returns 
a VolcanoPlanner's RelSubset with the input, which leads to unforeseeable 
consequences.

 

Details: This seems a strange issue that happens because several factors occur.

I first reproduced it on my application with the following query (on TPCH):
{code:sql}
SELECT c.c_custkey
FROM customer c
WHERE c_name IN ('271', '272', '273', '274', '275', '276', '342', '343', '344', 
'345','346', '347', '348', '349', '350', '351',  '352', '353', '354',  '355', 
'356', '357', '358', '359', '360')
AND EXISTS (SELECT 1 FROM orders o WHERE o.o_custkey = c.c_custkey)
{code}
But the issue can be reproduced also in Calcite by adding this test into 
{{{}HepPlannerTest{}}}:
{code:java}
  @Test void testAggregateRemove() {
    final RelBuilder builder = RelBuilderTest.createBuilder(c -> 
c.withAggregateUnique(true));
    final RelNode root =
        builder
            .values(new String[]{"i"}, 1, 2, 3) // important to have values 
sorted
            .distinct()
            .build();
    final HepProgram program = new HepProgramBuilder()
        .addRuleInstance(CoreRules.AGGREGATE_REMOVE)
        .build();
    final HepPlanner planner = new HepPlanner(program);
    planner.setRoot(root);
    final RelNode result = planner.findBestExp();
    assertThat(result, is(instanceOf(LogicalValues.class))); // fails because 
result is a RelSubset
  }
{code}
The important elements are: firstly our {{RelOptCluster}} has a 
{{VolcanoPlanner}} as planner (so any {{relNode.getCluster().getPlanner()}} 
call that we execute will return a {{VolcanoPlanner}} instance). Nevertheless 
we also apply some rules via a {{{}HepPlanner{}}}. I think this is a quite 
common strategy in Calcite clients to obtain a better performance: first apply 
a subset of rules that are always beneficial via a {{{}HepPlanner{}}}, and then 
apply the "main" set of rules via the cost-based {{{}VolcanoPlanner{}}}.

Secondly, we have {{{}AggregateRemoveRule{}}}, which we use in the 
{{HepPlanner}} phase.
This rule contains the following code:
{code:java}
  @Override public void onMatch(RelOptRuleCall call) {
    final Aggregate aggregate = call.rel(0);
    final RelNode input = aggregate.getInput();
    ...
    final RelNode newInput = convert(input, 
aggregate.getTraitSet().simplify()); // <-- *** [1]
    relBuilder.push(newInput);
    ...
    call.getPlanner().prune(aggregate);  // <-- *** [2]
    call.transformTo(relBuilder.build());
  }
{code}
Notice the line [2] which uses {{call.getPlanner()}} to call the {{prune}} 
method. By using {{call.getPlanner()}} we get the correct planner of the rule 
that is being fired, in this case a {{{}HepPlanner{}}}, so we end up calling 
{{{}HepPlanner#prune{}}}, which is fine.
However, the line [1] calls the {{RelOptRule#convert}} static method:
{code:java}
  public static RelNode convert(RelNode rel, RelTraitSet toTraits) {
    RelOptPlanner planner = rel.getCluster().getPlanner(); // <-- *** [3]
    RelTraitSet outTraits = rel.getTraitSet();
    for (int i = 0; i < toTraits.size(); i++) {
      RelTrait toTrait = toTraits.getTrait(i);
      if (toTrait != null)
        outTraits = outTraits.replace(i, toTrait);
    }
    if (rel.getTraitSet().matches(outTraits))
      return rel;
    return planner.changeTraits(rel, outTraits); // <-- *** [4]
  }
{code}
Notice how in this case, the planner is obtained from the relNode's cluster 
[3], in our case that would be the {{{}VolcanoPlanner{}}}, which is potentially 
problematic. Further down, if the relNode matches the {{{}outTraits{}}}, no 
action is done and the same relNode is returned, no problem here. But, if it 
does not match them, then {{RelOptPlanner#changeTraits}} will be called, i.e. 
{{VolcanoPlanner#changeTraits}} [4], and this is where the problem will 
originate: in our scenario {{VolcanoPlanner#changeTraits}} will return a 
Volcano's {{{}RelSubset{}}}, which is completely unhandable by the 
{{HepPlanner}} that triggered the rule, and that leads to the incorrect plan 
returned by the {{{}HepPlanner{}}}.

In this case, what happens with our original query ({{{}LogicalValues{}}} with 
sorted values), we get to {{RelOptRule#convert}} with the RelNode being a 
{{LogicalValues}} with {{Convention.NONE}} + {{{}Collation[0]{}}}, and the 
{{toTraits}} are the ones from the {{LogicalAggregate}} that we are removing: 
{{Convention.NONE}} + {{Collation[]}} . Since the traits from the 
{{LogicalValues}} do not match the LogicalAgggregate traits ({{{}Collation[0] 
!= Collation[]{}}}), the {{RelOptPlanner#changeTraits}} is called and the 
problem occurs. I am not sure why here {{RelTraitSet#matches}} is used (which 
computes an exact match, hence returning false), rather than 
{{{}RelTraitSet#satisfies{}}}, which would have returned true, because a sorted 
{{LogicalValues}} ({{{}Collation[0]{}}}) satisfies the unsorted 
{{{}Collation[]{}}}, but I assume there is a reason for that.

As a workaround, if the {{LogicalValues}} elements are NOT in order, then the 
problem is avoided: we deal with a {{LogicalValues}} with {{Collation[]}} , so 
inside {{RelOptRule#convert}} the {{LogicalValues}} traits ({{{}Convention.NONE 
+ Collation[]{}}}) match the {{LogicalAggregates}} ones ({{{}Convention.NONE + 
Collation[]{}}}), so the method returns without calling 
{{{}RelOptPlanner#changeTraits{}}}, so the problem does not happen. This can be 
confirmed by modifying the proposed test:
{code:java}
  @Test void testAggregateRemoveOk() {
    final RelBuilder builder = RelBuilderTest.createBuilder(c -> 
c.withAggregateUnique(true));
    final RelNode root =
        builder
            .values(new String[]{"i"}, 1, 42, 3) // not sorted
            .distinct()
            .build();
    final HepProgram program = new HepProgramBuilder()
        .addRuleInstance(CoreRules.AGGREGATE_REMOVE)
        .build();
    final HepPlanner planner = new HepPlanner(program);
    planner.setRoot(root);
    final RelNode result = planner.findBestExp();
    assertThat(result, is(instanceOf(LogicalValues.class))); // ok
  }
{code}


> Rule fired by HepPlanner can return Volcano's RelSubset
> -------------------------------------------------------
>
>                 Key: CALCITE-5401
>                 URL: https://issues.apache.org/jira/browse/CALCITE-5401
>             Project: Calcite
>          Issue Type: Bug
>          Components: core
>            Reporter: Ruben Q L
>            Priority: Major
>
> TLDR; {{CoreRules.AGGREGATE_REMOVE}} is fired by a {{HepPlanner}} but while 
> removing the Aggregate, instead of returning the Aggregate's input, it 
> returns a VolcanoPlanner's RelSubset with the input, which leads to 
> unforeseeable consequences.
>  
> Details: This seems a strange issue that happens because several factors 
> occur.
> I first reproduced it on my application with the following query (on TPCH):
> {code:sql}
> SELECT c.c_custkey
> FROM customer c
> WHERE c.c_name IN ('271', '272', '273', '274', '275', '276', '342', '343', 
> '344', '345','346', '347', '348', '349', '350', '351',  '352', '353', '354',  
> '355', '356', '357', '358', '359', '360')
> AND EXISTS (SELECT 1 FROM orders o WHERE o.o_custkey = c.c_custkey)
> {code}
> But the issue can be reproduced also in Calcite by adding this test into 
> {{{}HepPlannerTest{}}}:
> {code:java}
>   @Test void testAggregateRemove() {
>     final RelBuilder builder = RelBuilderTest.createBuilder(c -> 
> c.withAggregateUnique(true));
>     final RelNode root =
>         builder
>             .values(new String[]{"i"}, 1, 2, 3) // important to have values 
> sorted
>             .distinct()
>             .build();
>     final HepProgram program = new HepProgramBuilder()
>         .addRuleInstance(CoreRules.AGGREGATE_REMOVE)
>         .build();
>     final HepPlanner planner = new HepPlanner(program);
>     planner.setRoot(root);
>     final RelNode result = planner.findBestExp();
>     assertThat(result, is(instanceOf(LogicalValues.class))); // fails because 
> result is a RelSubset
>   }
> {code}
> The important elements are: firstly our {{RelOptCluster}} has a 
> {{VolcanoPlanner}} as planner (so any {{relNode.getCluster().getPlanner()}} 
> call that we execute will return a {{VolcanoPlanner}} instance). Nevertheless 
> we also apply some rules via a {{{}HepPlanner{}}}. I think this is a quite 
> common strategy in Calcite clients to obtain a better performance: first 
> apply a subset of rules that are always beneficial via a {{{}HepPlanner{}}}, 
> and then apply the "main" set of rules via the cost-based 
> {{{}VolcanoPlanner{}}}.
> Secondly, we have {{{}AggregateRemoveRule{}}}, which we use in the 
> {{HepPlanner}} phase.
> This rule contains the following code:
> {code:java}
>   @Override public void onMatch(RelOptRuleCall call) {
>     final Aggregate aggregate = call.rel(0);
>     final RelNode input = aggregate.getInput();
>     ...
>     final RelNode newInput = convert(input, 
> aggregate.getTraitSet().simplify()); // <-- *** [1]
>     relBuilder.push(newInput);
>     ...
>     call.getPlanner().prune(aggregate);  // <-- *** [2]
>     call.transformTo(relBuilder.build());
>   }
> {code}
> Notice the line [2] which uses {{call.getPlanner()}} to call the {{prune}} 
> method. By using {{call.getPlanner()}} we get the correct planner of the rule 
> that is being fired, in this case a {{{}HepPlanner{}}}, so we end up calling 
> {{{}HepPlanner#prune{}}}, which is fine.
> However, the line [1] calls the {{RelOptRule#convert}} static method:
> {code:java}
>   public static RelNode convert(RelNode rel, RelTraitSet toTraits) {
>     RelOptPlanner planner = rel.getCluster().getPlanner(); // <-- *** [3]
>     RelTraitSet outTraits = rel.getTraitSet();
>     for (int i = 0; i < toTraits.size(); i++) {
>       RelTrait toTrait = toTraits.getTrait(i);
>       if (toTrait != null)
>         outTraits = outTraits.replace(i, toTrait);
>     }
>     if (rel.getTraitSet().matches(outTraits))
>       return rel;
>     return planner.changeTraits(rel, outTraits); // <-- *** [4]
>   }
> {code}
> Notice how in this case, the planner is obtained from the relNode's cluster 
> [3], in our case that would be the {{{}VolcanoPlanner{}}}, which is 
> potentially problematic. Further down, if the relNode matches the 
> {{{}outTraits{}}}, no action is done and the same relNode is returned, no 
> problem here. But, if it does not match them, then 
> {{RelOptPlanner#changeTraits}} will be called, i.e. 
> {{VolcanoPlanner#changeTraits}} [4], and this is where the problem will 
> originate: in our scenario {{VolcanoPlanner#changeTraits}} will return a 
> Volcano's {{{}RelSubset{}}}, which is completely unhandable by the 
> {{HepPlanner}} that triggered the rule, and that leads to the incorrect plan 
> returned by the {{{}HepPlanner{}}}.
> In this case, what happens with our original query ({{{}LogicalValues{}}} 
> with sorted values), we get to {{RelOptRule#convert}} with the RelNode being 
> a {{LogicalValues}} with {{Convention.NONE}} + {{{}Collation[0]{}}}, and the 
> {{toTraits}} are the ones from the {{LogicalAggregate}} that we are removing: 
> {{Convention.NONE}} + {{Collation[]}} . Since the traits from the 
> {{LogicalValues}} do not match the LogicalAgggregate traits ({{{}Collation[0] 
> != Collation[]{}}}), the {{RelOptPlanner#changeTraits}} is called and the 
> problem occurs. I am not sure why here {{RelTraitSet#matches}} is used (which 
> computes an exact match, hence returning false), rather than 
> {{{}RelTraitSet#satisfies{}}}, which would have returned true, because a 
> sorted {{LogicalValues}} ({{{}Collation[0]{}}}) satisfies the unsorted 
> {{{}Collation[]{}}}, but I assume there is a reason for that.
> As a workaround, if the {{LogicalValues}} elements are NOT in order, then the 
> problem is avoided: we deal with a {{LogicalValues}} with {{Collation[]}} , 
> so inside {{RelOptRule#convert}} the {{LogicalValues}} traits 
> ({{{}Convention.NONE + Collation[]{}}}) match the {{LogicalAggregates}} ones 
> ({{{}Convention.NONE + Collation[]{}}}), so the method returns without 
> calling {{{}RelOptPlanner#changeTraits{}}}, so the problem does not happen. 
> This can be confirmed by modifying the proposed test:
> {code:java}
>   @Test void testAggregateRemoveOk() {
>     final RelBuilder builder = RelBuilderTest.createBuilder(c -> 
> c.withAggregateUnique(true));
>     final RelNode root =
>         builder
>             .values(new String[]{"i"}, 1, 42, 3) // not sorted
>             .distinct()
>             .build();
>     final HepProgram program = new HepProgramBuilder()
>         .addRuleInstance(CoreRules.AGGREGATE_REMOVE)
>         .build();
>     final HepPlanner planner = new HepPlanner(program);
>     planner.setRoot(root);
>     final RelNode result = planner.findBestExp();
>     assertThat(result, is(instanceOf(LogicalValues.class))); // ok
>   }
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to