[ 
https://issues.apache.org/jira/browse/CALCITE-6899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17975481#comment-17975481
 ] 

juntaozhang commented on CALCITE-6899:
--------------------------------------

h2. normal case1:

sql:
{code:java}
select R_REGIONKEY from SALES.CUSTOMER where R_REGIONKEY > all (select R_REGIONKEY from SALES.CUSTOMER){code}
SubQueryRemoveRule[before => after]:

 
{code:java}
LogicalProject#20(subset=[rel#21:RelSubset#2.NONE.any], input=[rel#19])
  LogicalFilter#18(subset=[rel#19:RelSubset#1.NONE.any])
    LogicalTableScan#12(subset=[rel#17:RelSubset#0.NONE.any], table=[[CATALOG, SALES, CUSTOMER]])
{code}
 

==>

 
{code:java}
LogicalProject#20(subset=[rel#21:RelSubset#2.NONE.any], input=[rel#19])
  LogicalProject#38(subset=[rel#19:RelSubset#1.NONE.any], R_REGIONKEY=[$0])
    LogicalFilter#36(subset=[rel#37:RelSubset#6.NONE.single], condition=[...])
      LogicalJoin#34(subset=[rel#35:RelSubset#5.NONE.any], condition=[true], joinType=[inner])
        LogicalTableScan#12(subset=[rel#17:RelSubset#0.NONE.any], table=[[CATALOG, SALES, CUSTOMER]])
        LogicalAggregate#32(subset=[rel#33:RelSubset#4.NONE.any], group=[{}], m=[MAX($0)], c=[COUNT()], d=[COUNT($0)])
          LogicalProject#30(subset=[rel#31:RelSubset#3.NONE.any], R_REGIONKEY=[$0])
            LogicalTableScan#12(subset=[rel#17:RelSubset#0.NONE.any], table=[[CATALOG, SALES, CUSTOMER]])
{code}
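
The aggregate in the rewritten plan produces three values: {{m}} (the maximum), {{c}} (the row count) and {{d}} (the count of non-null values). As a rough illustration of why these three values are enough to evaluate {{x > ALL (sub-query)}} under SQL three-valued logic, here is a minimal Java sketch. It follows the standard SQL semantics of {{ALL}}; it is not a transcription of the elided {{condition=[...]}} above, and the class and method names are made up for this example.

```java
/** Illustration only: evaluates "x > ALL (sub-query)" from the aggregates m, c, d. */
final class GreaterThanAllSketch {

  /**
   * @param x left-hand value, may be null
   * @param m MAX over the sub-query column, null if there is no non-null value
   * @param c COUNT(*) of the sub-query
   * @param d COUNT(column) of the sub-query, i.e. number of non-null values
   * @return Boolean.TRUE, Boolean.FALSE, or null for SQL UNKNOWN
   */
  static Boolean greaterThanAll(Integer x, Integer m, long c, long d) {
    if (c == 0) {
      return Boolean.TRUE;   // empty sub-query: ALL is vacuously true
    }
    if (x == null) {
      return null;           // comparing NULL with at least one row is UNKNOWN
    }
    if (d > 0 && x <= m) {
      return Boolean.FALSE;  // some non-null value is >= x
    }
    if (c > d) {
      return null;           // x beats every non-null value, but a NULL row remains
    }
    return Boolean.TRUE;     // x is greater than every value and there are no NULLs
  }

  public static void main(String[] args) {
    System.out.println(greaterThanAll(9, 7, 3, 3));
    System.out.println(greaterThanAll(5, 7, 3, 3));
    System.out.println(greaterThanAll(9, 7, 3, 2));
  }
}
```

The point of the sketch is only that {{MAX}}, {{COUNT()}} and {{COUNT($0)}} together carry enough information, which is why the rewrite joins the outer scan with a single-row aggregate.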
 
h2. error case2:

sql:
{code:java}
select ename from emp where sal > all (select comm from emp){code}
SubQueryRemoveRule[before => after]:

 
{code:java}
LogicalProject#16(subset=[rel#17:RelSubset#2.NONE.any], ENAME=[$1], input=[rel#15])
  LogicalFilter#14(subset=[rel#15:RelSubset#1.NONE.broadcast], condition=[...])
    LogicalTableScan(subset=[rel#13:RelSubset#0.NONE.any], table=[[CATALOG, SALES, EMP]])
{code}
 

==>

 
{code:java}
LogicalProject#16(subset=[rel#17:RelSubset#2.NONE.any], input=[rel#15])
  LogicalProject#37(subset=[rel#38:RelSubset#1.NONE.any], ...)
    LogicalFilter#35(subset=[rel#36:RelSubset#7.NONE.single], condition=[...)])
      LogicalJoin#33(subset=[rel#34:RelSubset#6.NONE.any], condition=[true], joinType=[inner])
        LogicalTableScan#8(subset=[rel#13:RelSubset#0.NONE.any], table=[[CATALOG, SALES, EMP]])
        LogicalProject#31(subset=[rel#32:RelSubset#5.NONE.any], m=[$0], c=[$1], d=[$1])
          LogicalAggregate#29(subset=[rel#30:RelSubset#4.NONE.any], group=[{}], m=[MAX($0)], c=[COUNT()])
            LogicalProject#27(subset=[rel#28:RelSubset#3.NONE.any], COMM=[$6])
              LogicalTableScan#8(subset=[rel#13:RelSubset#0.NONE.any], table=[[CATALOG, SALES, EMP]])
{code}
 
h2. RCA

In the normal case (case1), after _SubQueryRemoveRule_ rewrites the sub-query, 
{_}LogicalProject#20{_}'s input ({_}rel#19{_}) does not change, because the 
_RelSubset_ has the same distribution trait ({_}any{_}) before and after the 
rule fires. The parent node ({_}LogicalProject#20{_}) therefore still points 
to the correct subset, and no further adjustment is needed.

However, in the error case (case2), the rewritten subtree lands in 
{_}rel#38{_}, whose distribution trait ({_}any{_}) differs from that of the 
original input ({_}rel#15{_}, which is {_}broadcast{_}). As long as the parent 
node ({_}LogicalProject#16{_}) still points to the old input ({_}rel#15{_}), 
the planner cannot find a valid conversion path to the required trait (e.g., 
{_}ENUMERABLE.broadcast{_}).
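
To make the mechanism concrete, here is a small self-contained Java toy model of the situation. It is a sketch under simplifying assumptions, not Calcite's planner: subsets are plain string keys, the only "rule" copies a logical rel into the ENUMERABLE convention with the same distribution, and a filter that still contains a sub-query is assumed to have no such conversion. All class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model only: none of these types or methods are Calcite classes. */
final class StaleInputSketch {

  /** Registered rels per subset key, e.g. "set#1.ENUMERABLE.broadcast". */
  private final Map<String, List<String>> members = new HashMap<>();

  /** Builds the key that identifies a subset: set id plus convention plus distribution. */
  static String subset(int setId, String convention, String distribution) {
    return "set#" + setId + "." + convention + "." + distribution;
  }

  /** Registers a rel in the subset determined by its own traits. */
  void register(String rel, int setId, String convention, String distribution) {
    members.computeIfAbsent(subset(setId, convention, distribution),
        k -> new ArrayList<>()).add(rel);
  }

  /**
   * Registers a logical rel and, unless it still contains a sub-query (assumption),
   * a physical copy with the same distribution. No distribution converter exists.
   */
  void addLogical(String rel, int setId, String distribution, boolean hasSubQuery) {
    register(rel, setId, "NONE", distribution);
    if (!hasSubQuery) {
      register("Enumerable(" + rel + ")", setId, "ENUMERABLE", distribution);
    }
  }

  /** The parent can be planned only if the physical subset it asks for is non-empty. */
  boolean canPlanParent(int inputSetId, String inputDistribution) {
    return members.containsKey(subset(inputSetId, "ENUMERABLE", inputDistribution));
  }

  /** State of set #1 in case2 after the sub-query rewrite. */
  static StaleInputSketch case2() {
    StaleInputSketch planner = new StaleInputSketch();
    planner.addLogical("LogicalFilter#14", 1, "broadcast", true); // original filter
    planner.addLogical("LogicalProject#37", 1, "any", false);     // rewritten subtree
    return planner;
  }

  public static void main(String[] args) {
    StaleInputSketch planner = case2();
    // Parent still reads rel#15 (broadcast): the physical broadcast subset is empty.
    System.out.println("input rel#15: " + planner.canPlanParent(1, "broadcast")); // false
    // Parent re-pointed to rel#38 (any): a physical rel exists.
    System.out.println("input rel#38: " + planner.canPlanParent(1, "any"));       // true
  }
}
```

In this toy, the only thing that changes between failure and success is which subset the parent reads from, which mirrors the empty {_}RelSubset#1.ENUMERABLE.broadcast{_} reported in the exception.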
h2. Fixed after SubQueryRemoveRule

 
{code:java}
LogicalProject#16(subset=[rel#17:RelSubset#2.NONE.any], input=[rel#38])
  LogicalProject#37(subset=[rel#38:RelSubset#1.NONE.any], ...)
    LogicalFilter#35(subset=[rel#36:RelSubset#7.NONE.single], condition=[...)])
      LogicalJoin#33(subset=[rel#34:RelSubset#6.NONE.any], condition=[true], joinType=[inner])
        LogicalTableScan#8(subset=[rel#13:RelSubset#0.NONE.any], table=[[CATALOG, SALES, EMP]])
        LogicalProject#31(subset=[rel#32:RelSubset#5.NONE.any], m=[$0], c=[$1], d=[$1])
          LogicalAggregate#29(subset=[rel#30:RelSubset#4.NONE.any], group=[{}], m=[MAX($0)], c=[COUNT()])
            LogicalProject#27(subset=[rel#28:RelSubset#3.NONE.any], COMM=[$6])
              LogicalTableScan#8(subset=[rel#13:RelSubset#0.NONE.any], table=[[CATALOG, SALES, EMP]])
{code}
 

After applying {_}SubQueryRemoveRule{_}, if the rewritten subtree's 
distribution trait changes, the parent node's input must be updated to 
reference the new subtree (e.g., {_}rel#38{_}). This ensures the logical plan 
remains valid and consistent with the new traits.
h2. Solution

I propose adding a new config option, {_}FILTER_WITH_PARENT{_}, to the 
{_}SubQueryRemoveRule.Config{_}. This option enables access to the filter's 
parent node, allowing the rule to replace the parent's input when necessary.

The reason for introducing a new config instead of modifying the existing 
_FILTER_ config is that some cases do not have a parent node.
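
The difference between the two match styles can be sketched with a toy tree in plain Java. This is only an illustration of the idea: {{Node}}, {{matchFilter}} and {{matchFilterWithParent}} are hypothetical names, and nothing here uses or mirrors the actual _SubQueryRemoveRule_ operand API.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Toy tree rewrite; Node and both match styles are hypothetical, not Calcite API. */
final class FilterWithParentSketch {

  static final class Node {
    final String name;
    final boolean subQueryFilter; // true if this node is a filter containing a sub-query
    final List<Node> inputs;

    Node(String name, boolean subQueryFilter, List<Node> inputs) {
      this.name = name;
      this.subQueryFilter = subQueryFilter;
      this.inputs = inputs;
    }
  }

  /** Stand-in for the sub-query rewrite: the filter is replaced by a new subtree. */
  static Node rewrite(Node filter) {
    return new Node("Rewritten(" + filter.name + ")", false, filter.inputs);
  }

  /** FILTER-style match: sees only the filter, so it can only hand back a replacement. */
  static Node matchFilter(Node filter) {
    return filter.subQueryFilter ? rewrite(filter) : filter;
  }

  /** FILTER_WITH_PARENT-style match: sees the parent, so it can re-point the parent's input. */
  static Node matchFilterWithParent(Node parent) {
    List<Node> newInputs = new ArrayList<>();
    for (Node input : parent.inputs) {
      newInputs.add(matchFilter(input));
    }
    return new Node(parent.name, parent.subQueryFilter, newInputs);
  }

  public static void main(String[] args) {
    Node scan = new Node("LogicalTableScan#8", false, Collections.<Node>emptyList());
    Node filter = new Node("LogicalFilter#14", true, Collections.singletonList(scan));
    Node project = new Node("LogicalProject#16", false, Collections.singletonList(filter));

    // With a parent: the parent copy references the rewritten subtree directly.
    System.out.println(matchFilterWithParent(project).inputs.get(0).name);
    // Without a parent (filter is the root): only the filter-level match applies.
    System.out.println(matchFilter(filter).name);
  }
}
```

The second call shows why a separate config is useful in this sketch: when the filter is the root there is no parent to match, so the filter-only form still has to exist alongside the parent-aware one.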

 

If I have misunderstood anything, please point it out. Thanks.

> Mismatch of Trait information results in a missing conversion exception
> -----------------------------------------------------------------------
>
>                 Key: CALCITE-6899
>                 URL: https://issues.apache.org/jira/browse/CALCITE-6899
>             Project: Calcite
>          Issue Type: Bug
>            Reporter: xiong duan
>            Priority: Major
>
> The unit test in RelOptRulesTest:
> {code:java}
> @Test void testEnumerableFilterRule() {
>   final String sql = "select ename from emp where sal > all (select comm from emp)";
>   sql(sql)
>       .withVolcanoPlanner(false, p -> {
>         p.addRelTraitDef(RelDistributionTraitDef.INSTANCE);
>         p.addRule(CoreRules.FILTER_SUB_QUERY_TO_CORRELATE);
>         p.addRule(EnumerableRules.ENUMERABLE_FILTER_RULE);
>         p.addRule(EnumerableRules.ENUMERABLE_PROJECT_RULE);
>         p.addRule(EnumerableRules.ENUMERABLE_TABLE_SCAN_RULE);
>         p.addRule(EnumerableRules.ENUMERABLE_JOIN_RULE);
>         p.addRule(EnumerableRules.ENUMERABLE_AGGREGATE_RULE);
>       }).check();
> } {code}
> It throws an exception:
> {code:java}
> There are not enough rules to produce a node with desired properties: convention=ENUMERABLE, dist=any.
> Missing conversion is LogicalFilter[convention: NONE -> ENUMERABLE]
> There is 1 empty subset: rel#39:RelSubset#1.ENUMERABLE.broadcast, the relevant part of the original plan is as follows
> 14:LogicalFilter(condition=[NOT(<= SOME($5, {
> LogicalProject(COMM=[$6])
>   LogicalTableScan(table=[[CATALOG, SALES, EMP]])
> }))])
>   8:LogicalTableScan(subset=[rel#13:RelSubset#0.NONE.any], table=[[CATALOG, SALES, EMP]]) {code}



