[
https://issues.apache.org/jira/browse/CALCITE-6899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17975481#comment-17975481
]
juntaozhang commented on CALCITE-6899:
--------------------------------------
h2. normal case1:
sql:
{code:java}
select R_REGIONKEY from SALES.CUSTOMER where R_REGIONKEY > all (select
R_REGIONKEY from SALES.CUSTOMER){code}
SubQueryRemoveRule[before => after]:
{code:java}
LogicalProject#20[subset=[rel#21:RelSubset#2.NONE.any], input=[rel#19]]
  LogicalFilter#18(subset=[rel#19:RelSubset#1.NONE.any])
    LogicalTableScan#12(subset=[rel#17:RelSubset#0.NONE.any], table=[[CATALOG, SALES, CUSTOMER]])
{code}
==>
{code:java}
LogicalProject#20[subset=[rel#21:RelSubset#2.NONE.any], input=[rel#19]]
  LogicalProject#38(subset=[rel#19:RelSubset#1.NONE.any], R_REGIONKEY=[$0])
    LogicalFilter#36(subset=[rel#37:RelSubset#6.NONE.single], condition=[...])
      LogicalJoin#34(subset=[rel#35:RelSubset#5.NONE.any], condition=[true], joinType=[inner])
        LogicalTableScan#12(subset=[rel#17:RelSubset#0.NONE.any], table=[[CATALOG, SALES, CUSTOMER]])
        LogicalAggregate#32(subset=[rel#33:RelSubset#4.NONE.any], group=[{}], m=[MAX($0)], c=[COUNT()], d=[COUNT($0)])
          LogicalProject#30(subset=[rel#31:RelSubset#3.NONE.any], R_REGIONKEY=[$0])
            LogicalTableScan#12(subset=[rel#17:RelSubset#0.NONE.any], table=[[CATALOG, SALES, CUSTOMER]])
{code}
h2. error case2:
sql:
{code:java}
select ename from emp where sal > all (select comm from emp){code}
SubQueryRemoveRule[before => after]:
{code:java}
LogicalProject#16(subset=[rel#17:RelSubset#2.NONE.any], ENAME=[$1], input=[rel#15])
  LogicalFilter#14(subset=[rel#15:RelSubset#1.NONE.broadcast], condition=[...])
    LogicalTableScan(subset=[rel#13:RelSubset#0.NONE.any], table=[[CATALOG, SALES, EMP]])
{code}
==>
{code:java}
LogicalProject#16(subset=[rel#17:RelSubset#2.NONE.any], input=[rel#15])
  LogicalProject#37(subset=[rel#38:RelSubset#1.NONE.any], ...)
    LogicalFilter#35(subset=[rel#36:RelSubset#7.NONE.single], condition=[...)])
      LogicalJoin#33(subset=[rel#34:RelSubset#6.NONE.any], condition=[true], joinType=[inner])
        LogicalTableScan#8(subset=[rel#13:RelSubset#0.NONE.any], table=[[CATALOG, SALES, EMP]])
        LogicalProject#31(subset=[rel#32:RelSubset#5.NONE.any], m=[$0], c=[$1], d=[$1])
          LogicalAggregate#29(subset=[rel#30:RelSubset#4.NONE.any], group=[{}], m=[MAX($0)], c=[COUNT()])
            LogicalProject#27(subset=[rel#28:RelSubset#3.NONE.any], COMM=[$6])
              LogicalTableScan#8(subset=[rel#13:RelSubset#0.NONE.any], table=[[CATALOG, SALES, EMP]])
{code}
h2. RCA
In the normal case (case1), after _SubQueryRemoveRule_ rewrites the subquery,
{_}LogicalProject#20{_}'s input ({_}rel#19{_}) does not change, because the
_RelSubset_ has the same distribution trait ({_}any{_}) before and after the
rule fires. The parent node ({_}LogicalProject#20{_}) therefore still points
to the correct input, and no further adjustment is needed.
However, in the error case (case2), the rewritten subtree ({_}rel#38{_}) has
distribution trait {_}any{_}, whereas the original input ({_}rel#15{_}) is
{_}broadcast{_}. The parent node ({_}LogicalProject#16{_}) still points to the
old input ({_}rel#15{_}), as the plan above shows, so the planner cannot find
a valid conversion path to the required trait (e.g.,
{_}ENUMERABLE.broadcast{_}).
h2. Fixed plan after SubQueryRemoveRule
{code:java}
LogicalProject#16(subset=[rel#17:RelSubset#2.NONE.any], input=[rel#38])
  LogicalProject#37(subset=[rel#38:RelSubset#1.NONE.any], ...)
    LogicalFilter#35(subset=[rel#36:RelSubset#7.NONE.single], condition=[...)])
      LogicalJoin#33(subset=[rel#34:RelSubset#6.NONE.any], condition=[true], joinType=[inner])
        LogicalTableScan#8(subset=[rel#13:RelSubset#0.NONE.any], table=[[CATALOG, SALES, EMP]])
        LogicalProject#31(subset=[rel#32:RelSubset#5.NONE.any], m=[$0], c=[$1], d=[$1])
          LogicalAggregate#29(subset=[rel#30:RelSubset#4.NONE.any], group=[{}], m=[MAX($0)], c=[COUNT()])
            LogicalProject#27(subset=[rel#28:RelSubset#3.NONE.any], COMM=[$6])
              LogicalTableScan#8(subset=[rel#13:RelSubset#0.NONE.any], table=[[CATALOG, SALES, EMP]])
{code}
After applying {_}SubQueryRemoveRule{_}, if the rewritten subtree's
distribution trait changes, the parent node's input must be updated to
reference the new subtree (e.g., {_}rel#38{_}). This ensures the logical plan
remains valid and consistent with the new traits.
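To make the stale-input problem concrete, here is a toy model of case2. It is
only a sketch: {_}ToySubset{_}, {_}ToyParent{_} and {_}StaleInputDemo{_} are
hypothetical names invented for illustration and are not Calcite classes. The
subset ids and traits are taken from the plans above; everything else is an
assumption.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy stand-in for a planner subset. Hypothetical, not Calcite's RelSubset. */
final class ToySubset {
  final String id;            // e.g. "rel#15"
  final String distribution;  // e.g. "broadcast" or "any"
  final List<String> members = new ArrayList<>();

  ToySubset(String id, String distribution) {
    this.id = id;
    this.distribution = distribution;
  }
}

/** Toy stand-in for a parent node that reads from one input subset. */
final class ToyParent {
  final String name;
  String inputId;

  ToyParent(String name, String inputId) {
    this.name = name;
    this.inputId = inputId;
  }
}

final class StaleInputDemo {
  /** Builds the two subsets seen in case2 after the rule has fired. */
  static Map<String, ToySubset> case2() {
    Map<String, ToySubset> subsets = new HashMap<>();
    ToySubset original = new ToySubset("rel#15", "broadcast");
    original.members.add("LogicalFilter#14");
    ToySubset rewritten = new ToySubset("rel#38", "any");
    rewritten.members.add("LogicalProject#37");
    subsets.put(original.id, original);
    subsets.put(rewritten.id, rewritten);
    return subsets;
  }

  /** True if the subset the parent reads from contains the given node. */
  static boolean seesRewrite(
      ToyParent parent, Map<String, ToySubset> subsets, String node) {
    ToySubset input = subsets.get(parent.inputId);
    return input != null && input.members.contains(node);
  }

  public static void main(String[] args) {
    Map<String, ToySubset> subsets = case2();
    ToyParent project16 = new ToyParent("LogicalProject#16", "rel#15");

    // Stale: the parent still reads rel#15 (broadcast), which only holds
    // the original filter, so the rewritten subtree is unreachable.
    System.out.println(
        seesRewrite(project16, subsets, "LogicalProject#37")); // false

    // Fixed: repoint the parent at the subset holding the rewrite.
    project16.inputId = "rel#38";
    System.out.println(
        seesRewrite(project16, subsets, "LogicalProject#37")); // true
  }
}
```

In this toy the fix is a single assignment to {_}inputId{_}; in the real
planner the equivalent step is replacing the parent's input, as described
above.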
h2. Solution
I propose adding a new config option, {_}FILTER_WITH_PARENT{_}, to
{_}SubQueryRemoveRule.Config{_}. This option gives the rule access to the
filter's parent node, so that it can replace the parent's input when
necessary. The reason for introducing a new config instead of modifying the
existing _FILTER_ config is that in some cases the filter has no parent node.
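The difference between the two configs can be sketched with a toy matcher.
This is not the proposed patch and not Calcite's operand API:
{_}ToyNode{_}, {_}ToyConfig{_} and {_}ToyMatcher{_} are hypothetical names,
and only the config names _FILTER_ and _FILTER_WITH_PARENT_ come from the
proposal. It shows why the parent-aware pattern cannot simply replace the
existing one: a filter at the root of the plan has no parent to match.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy plan node. Hypothetical, not Calcite's RelNode. */
final class ToyNode {
  final String kind;
  final List<ToyNode> inputs = new ArrayList<>();

  ToyNode(String kind, ToyNode... inputs) {
    this.kind = kind;
    for (ToyNode input : inputs) {
      this.inputs.add(input);
    }
  }
}

/** The two match patterns discussed above. */
enum ToyConfig { FILTER, FILTER_WITH_PARENT }

final class ToyMatcher {
  /** Returns a description of every match of the pattern in the tree. */
  static List<String> matches(ToyNode root, ToyConfig config) {
    List<String> out = new ArrayList<>();
    collect(null, root, config, out);
    return out;
  }

  private static void collect(
      ToyNode parent, ToyNode node, ToyConfig config, List<String> out) {
    if (node.kind.equals("Filter")) {
      if (config == ToyConfig.FILTER) {
        // Matches any filter, but the handler never sees the parent.
        out.add("Filter");
      } else if (parent != null) {
        // Matches only when a parent exists, and exposes it to the handler.
        out.add(parent.kind + ">Filter");
      }
    }
    for (ToyNode input : node.inputs) {
      collect(node, input, config, out);
    }
  }

  public static void main(String[] args) {
    ToyNode withParent =
        new ToyNode("Project", new ToyNode("Filter", new ToyNode("Scan")));
    ToyNode rootFilter = new ToyNode("Filter", new ToyNode("Scan"));

    System.out.println(matches(withParent, ToyConfig.FILTER_WITH_PARENT));
    System.out.println(matches(rootFilter, ToyConfig.FILTER_WITH_PARENT));
    System.out.println(matches(rootFilter, ToyConfig.FILTER));
  }
}
```

In the toy, the root filter is matched only by _FILTER_, which is why both
patterns are kept side by side.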
If I have misunderstood anything, please do not hesitate to point it out.
Thanks.
> Mismatch of Trait information results in a missing conversion exception
> -----------------------------------------------------------------------
>
> Key: CALCITE-6899
> URL: https://issues.apache.org/jira/browse/CALCITE-6899
> Project: Calcite
> Issue Type: Bug
> Reporter: xiong duan
> Priority: Major
>
> The unit test in RelOptRulesTest:
> {code:java}
> @Test void testEnumerableFilterRule() {
>   final String sql =
>       "select ename from emp where sal > all (select comm from emp)";
>   sql(sql)
>       .withVolcanoPlanner(false, p -> {
>         p.addRelTraitDef(RelDistributionTraitDef.INSTANCE);
>         p.addRule(CoreRules.FILTER_SUB_QUERY_TO_CORRELATE);
>         p.addRule(EnumerableRules.ENUMERABLE_FILTER_RULE);
>         p.addRule(EnumerableRules.ENUMERABLE_PROJECT_RULE);
>         p.addRule(EnumerableRules.ENUMERABLE_TABLE_SCAN_RULE);
>         p.addRule(EnumerableRules.ENUMERABLE_JOIN_RULE);
>         p.addRule(EnumerableRules.ENUMERABLE_AGGREGATE_RULE);
>       }).check();
> } {code}
> It throws an exception:
> {code:java}
> There are not enough rules to produce a node with desired properties: convention=ENUMERABLE, dist=any.
> Missing conversion is LogicalFilter[convention: NONE -> ENUMERABLE]
> There is 1 empty subset: rel#39:RelSubset#1.ENUMERABLE.broadcast, the relevant part of the original plan is as follows
> 14:LogicalFilter(condition=[NOT(<= SOME($5, {
> LogicalProject(COMM=[$6])
>   LogicalTableScan(table=[[CATALOG, SALES, EMP]])
> }))])
>   8:LogicalTableScan(subset=[rel#13:RelSubset#0.NONE.any], table=[[CATALOG, SALES, EMP]]) {code}
--
This message was sent by Atlassian Jira
(v8.20.10#820010)