[
https://issues.apache.org/jira/browse/CALCITE-2223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16989097#comment-16989097
]
Xiening Dai commented on CALCITE-2223:
--------------------------------------
[~volodymyr] [~vladimirsitnikov] Thanks for your information.
I took a look at [~vladimirsitnikov]'s proposal
([https://github.com/apache/orc/pull/455|https://github.com/apache/orc/pull/455)])
and can see one flaw in the cycle detection logic. You currently only check
the subset that the RelNode belongs to (aka through VolcanoPlanner#getSubset).
However, a RelNode can become best RelNode for a subset as long as it's traits
satisfy subset's trait requirement. Let say we have a new RelNode X created and
ended up in RelSubset A (A's trait set equals to X's traits). Then during the
cost propogation, X could become RelSubset B's best as long as X satisfies B -
that happens a lot in reality when B has more relaxed traits. At the time, it
could form a cycle which goes through subset B, and this couldn't be detected
if you just search along the path of subset A while X was added.
If you'd like to solve this problem, you would have to start the search by all
subsets that RelNode X satisfies, not just the one it belongs to. And then when
a new RelSubset created, you would have to do this again. I am not sure if this
is feasible in terms of the runtime performance. Even if it was, there are so
many cases that would produce exactly the pattern I gave in the example, and I
don't think you can just ban them together.
> ProjectMergeRule is infinitely matched when is applied after
> ProjectReduceExpressionsRule
> -----------------------------------------------------------------------------------------
>
> Key: CALCITE-2223
> URL: https://issues.apache.org/jira/browse/CALCITE-2223
> Project: Calcite
> Issue Type: Bug
> Reporter: Vova Vysotskyi
> Priority: Critical
> Labels: pull-request-available
> Attachments:
> TestLimitWithExchanges_testPushLimitPastUnionExchange.png, heap_overview.png,
> provenance_contents.png
>
>
> For queries like this:
> {code:sql}
> select t1.f from (select cast(f as int) f, f from (select cast(f as int) f
> from (values('1')) t(f))) as t1
> {code}
> OOM is thrown when {{ProjectMergeRule}} is applied before applying
> {{ProjectReduceExpressionsRule}} in VolcanoPlanner.
> A simple test to reproduce this issue (in {{RelOptRulesTest}}):
> {code:java}
> @Test public void testOomProjectMergeRule() {
> RelBuilder relBuilder =
> RelBuilder.create(RelBuilderTest.config().build());
> RelNode relNode = relBuilder
> .values(new String[]{"f"}, "1")
> .project(
> relBuilder.alias(
> relBuilder.cast(relBuilder.field(0), SqlTypeName.INTEGER),
> "f"))
> .project(
> relBuilder.alias(
> relBuilder.cast(relBuilder.field(0), SqlTypeName.INTEGER),
> "f0"),
> relBuilder.alias(relBuilder.field(0), "f"))
> .project(
> relBuilder.alias(relBuilder.field(0), "f"))
> .build();
> RelOptPlanner planner = relNode.getCluster().getPlanner();
> RuleSet ruleSet =
> RuleSets.ofList(
> ReduceExpressionsRule.PROJECT_INSTANCE,
> new ProjectMergeRuleWithLongerName(),
> EnumerableRules.ENUMERABLE_PROJECT_RULE,
> EnumerableRules.ENUMERABLE_VALUES_RULE);
> Program program = Programs.of(ruleSet);
> RelTraitSet toTraits =
> relNode.getCluster().traitSet()
> .replace(0, EnumerableConvention.INSTANCE);
> RelNode output = program.run(planner, relNode, toTraits,
> ImmutableList.<RelOptMaterialization>of(),
> ImmutableList.<RelOptLattice>of());
> // check for output
> }
> /**
> * ProjectMergeRule inheritor which has
> * class name greater than ProjectReduceExpressionsRule class name
> (String.compareTo()).
> *
> * It is needed for RuleQueue.popMatch() method
> * to apply this rule before ProjectReduceExpressionsRule.
> */
> private static class ProjectMergeRuleWithLongerName extends
> ProjectMergeRule {
> public ProjectMergeRuleWithLongerName() {
> super(true, RelFactories.LOGICAL_BUILDER);
> }
> }
> {code}
--
This message was sent by Atlassian Jira
(v8.3.4#803005)