Ruben Q L created CALCITE-5401:
----------------------------------
Summary: Rule fired by HepPlanner can return Volcano's RelSubset
Key: CALCITE-5401
URL: https://issues.apache.org/jira/browse/CALCITE-5401
Project: Calcite
Issue Type: Bug
Components: core
Reporter: Ruben Q L
TLDR; {{CoreRules.AGGREGATE_REMOVE}} is fired by a {{HepPlanner}} but while
removing the Aggregate, instead of returning the Aggregate's input, it returns
a VolcanoPlanner's RelSubset with the input, which leads to unforeseeable
consequences.
Details: This seems a strange issue that happens because several factors occur.
I first reproduced it on my application with the following query (on TPCH):
{code:sql}
SELECT c.c_custkey
FROM customer c
WHERE c_name IN ('271', '272', '273', '274', '275', '276', '342', '343', '344',
'345','346', '347', '348', '349', '350', '351', '352', '353', '354', '355',
'356', '357', '358', '359', '360')
AND EXISTS (SELECT 1 FROM orders o WHERE o.o_custkey = c.c_custkey)
{code}
But the issue can be reproduced also in Calcite by adding this test into
{{{}HepPlannerTest{}}}:
{code:java}
@Test void testAggregateRemove() {
final RelBuilder builder = RelBuilderTest.createBuilder(c ->
c.withAggregateUnique(true));
final RelNode root =
builder
.values(new String[]{"i"}, 1, 2, 3) // important to have values
sorted
.distinct()
.build();
final HepProgram program = new HepProgramBuilder()
.addRuleInstance(CoreRules.AGGREGATE_REMOVE)
.build();
final HepPlanner planner = new HepPlanner(program);
planner.setRoot(root);
final RelNode result = planner.findBestExp();
assertThat(result, is(instanceOf(LogicalValues.class))); // fails because
result is a RelSubset
}
{code}
The important elements are: firstly our {{RelOptCluster}} has a
{{VolcanoPlanner}} as planner (so any {{relNode.getCluster().getPlanner()}}
call that we execute will return a {{VolcanoPlanner}} instance). Nevertheless
we also apply some rules via a {{{}HepPlanner{}}}. I think this is a quite
common strategy in Calcite clients to obtain a better performance: first apply
a subset of rules that are always beneficial via a {{{}HepPlanner{}}}, and then
apply the "main" set of rules via the cost-based {{{}VolcanoPlanner{}}}.
Secondly, we have {{{}AggregateRemoveRule{}}}, which we use in the
{{HepPlanner}} phase.
This rule contains the following code:
{code:java}
@Override public void onMatch(RelOptRuleCall call) {
final Aggregate aggregate = call.rel(0);
final RelNode input = aggregate.getInput();
...
final RelNode newInput = convert(input,
aggregate.getTraitSet().simplify()); // <-- *** [1]
relBuilder.push(newInput);
...
call.getPlanner().prune(aggregate); // <-- *** [2]
call.transformTo(relBuilder.build());
}
{code}
Notice the line [2] which uses {{call.getPlanner()}} to call the {{prune}}
method. By using {{call.getPlanner()}} we get the correct planner of the rule
that is being fired, in this case a {{{}HepPlanner{}}}, so we end up calling
{{{}HepPlanner#prune{}}}, which is fine.
However, the line [1] calls the {{RelOptRule#convert}} static method:
{code:java}
public static RelNode convert(RelNode rel, RelTraitSet toTraits) {
RelOptPlanner planner = rel.getCluster().getPlanner(); // <-- *** [3]
RelTraitSet outTraits = rel.getTraitSet();
for (int i = 0; i < toTraits.size(); i++) {
RelTrait toTrait = toTraits.getTrait(i);
if (toTrait != null)
outTraits = outTraits.replace(i, toTrait);
}
if (rel.getTraitSet().matches(outTraits))
return rel;
return planner.changeTraits(rel, outTraits); // <-- *** [4]
}
{code}
Notice how in this case, the planner is obtained from the relNode's cluster
[3], in our case that would be the {{{}VolcanoPlanner{}}}, which is potentially
problematic. Further down, if the relNode matches the {{{}outTraits{}}}, no
action is done and the same relNode is returned, no problem here. But, if it
does not match them, then {{RelOptPlanner#changeTraits}} will be called, i.e.
{{VolcanoPlanner#changeTraits}} [4], and this is where the problem will
originate: in our scenario {{VolcanoPlanner#changeTraits}} will return a
Volcano's {{{}RelSubset{}}}, which is completely unhandable by the
{{HepPlanner}} that triggered the rule, and that leads to the incorrect plan
returned by the {{{}HepPlanner{}}}.
In this case, what happens with our original query ({{{}LogicalValues{}}} with
sorted values), we get to {{RelOptRule#convert}} with the RelNode being a
{{LogicalValues}} with {{Convention.NONE}} + {{{}Collation[0]{}}}, and the
{{toTraits}} are the ones from the {{LogicalAggregate}} that we are removing:
{{Convention.NONE}} + {{Collation[]}} . Since the traits from the
{{LogicalValues}} do not match the LogicalAgggregate traits ({{{}Collation[0]
!= Collation[]{}}}), the {{RelOptPlanner#changeTraits}} is called and the
problem occurs. I am not sure why here {{RelTraitSet#matches}} is used (which
computes an exact match, hence returning false), rather than
{{{}RelTraitSet#satisfies{}}}, which would have returned true, because a sorted
{{LogicalValues}} ({{{}Collation[0]{}}}) satisfies the unsorted
{{{}Collation[]{}}}, but I assume there is a reason for that.
As a workaround, if the {{LogicalValues}} elements are NOT in order, then the
problem is avoided: we deal with a {{LogicalValues}} with {{Collation[]}} , so
inside {{RelOptRule#convert}} the {{LogicalValues}} traits ({{{}Convention.NONE
+ Collation[]{}}}) match the {{LogicalAggregates}} ones ({{{}Convention.NONE +
Collation[]{}}}), so the method returns without calling
{{{}RelOptPlanner#changeTraits{}}}, so the problem does not happen. This can be
confirmed by modifying the proposed test:
{code:java}
@Test void testAggregateRemoveOk() {
final RelBuilder builder = RelBuilderTest.createBuilder(c ->
c.withAggregateUnique(true));
final RelNode root =
builder
.values(new String[]{"i"}, 1, 42, 3) // not sorted
.distinct()
.build();
final HepProgram program = new HepProgramBuilder()
.addRuleInstance(CoreRules.AGGREGATE_REMOVE)
.build();
final HepPlanner planner = new HepPlanner(program);
planner.setRoot(root);
final RelNode result = planner.findBestExp();
assertThat(result, is(instanceOf(LogicalValues.class))); // ok
}
{code}
--
This message was sent by Atlassian Jira
(v8.20.10#820010)