Ruben Q L created CALCITE-6674:
----------------------------------

             Summary: Make RelDecorrelator rules configurable
                 Key: CALCITE-6674
                 URL: https://issues.apache.org/jira/browse/CALCITE-6674
             Project: Calcite
          Issue Type: Improvement
          Components: core
            Reporter: Ruben Q L
            Assignee: Ruben Q L
             Fix For: 1.39.0


RelDecorrelator decorrelates a query in two main steps:
 - First, a few Correlates cases are removed via rules (in 
{{removeCorrelationViaRule}} method).
 - Then, the main decorrelation logic is applied ({{{}decorrelateRel{}}} 
methods called by reflection).

Currently, the rules applied on the first step are hardcoded, and cannot be 
configured.
We are facing a situation where a Correlate is converted via one of these 
hardcoded rules (RemoveCorrelationForScalarAggregateRule), when in fact the 
main decorrelation logic (if the rule were not applied) would offer an arguably 
more beneficial plan:
{noformat}
-- Original (sub)plan
LogicalCorrelate(correlation=[$cor0], joinType=[left], requiredColumns=[{0, 1}])
  LogicalJoin(condition=[=($0, $5)], joinType=[inner])
    LogicalTableScan(table=[[session, partsupp]])
    LogicalProject(p_partkey=[$0])
      LogicalFilter(condition=[LIKE($1, 'forest%':VARCHAR)])
        LogicalTableScan(table=[[session, part]])
  LogicalProject(EXPR$0=[*(0.50, $0)])
    LogicalAggregate(group=[{}], agg#0=[SUM($0)])
      LogicalProject(l_quantity=[$4])
        LogicalFilter(condition=[AND(=($1, $cor0.ps_partkey), =($2, 
$cor0.ps_suppkey), SEARCH($10, Sarg[[1994-01-01..1995-01-01)]))])
          LogicalTableScan(table=[[session, lineitem]])

-- Decorrelation via  RemoveCorrelationForScalarAggregateRule
LogicalProject(ps_partkey=[$0], ps_suppkey=[$1], ps_availqty=[$2], 
ps_supplycost=[$3], ps_comment=[$4], p_partkey=[$5], $f6=[*(0.50, $6)])
  LogicalAggregate(group=[{0, 1, 2, 3, 4, 5}], agg#0=[SUM($6)])
    LogicalProject(ps_partkey=[$0], ps_suppkey=[$1], ps_availqty=[$2], 
ps_supplycost=[$3], ps_comment=[$4], p_partkey=[$5], l_quantity=[$10])
      LogicalJoin(condition=[AND(=($7, $0), =($8, $1), SEARCH($16, 
Sarg[[1994-01-01..1995-01-01)]))], joinType=[left])
        LogicalJoin(condition=[=($0, $5)], joinType=[inner])
          LogicalTableScan(table=[[session, partsupp]])
          LogicalProject(p_partkey=[$0])
            LogicalFilter(condition=[LIKE($1, 'forest%':VARCHAR)])
              LogicalTableScan(table=[[session, part]])
        LogicalProject(l_orderkey=[$0], l_partkey=[$1], l_suppkey=[$2], 
l_linenumber=[$3], l_quantity=[$4], l_extendedprice=[$5], l_discount=[$6], 
l_tax=[$7], l_returnflag=[$8], l_linestatus=[$9], l_shipdate=[$10], 
l_commitdate=[$11], l_receiptdate=[$12], l_shipinstruct=[$13], 
l_shipmode=[$14], l_comment=[$15], nullIndicator=[true])
          LogicalTableScan(table=[[session, lineitem]])

-- -- Decorrelation via main logic (without 
RemoveCorrelationForScalarAggregateRule)
LogicalJoin(condition=[AND(=($0, $6), =($1, $7))], joinType=[left])
  LogicalJoin(condition=[=($0, $5)], joinType=[inner])
    LogicalTableScan(table=[[session, partsupp]])
    LogicalProject(p_partkey=[$0])
      LogicalFilter(condition=[LIKE($1, 'forest%':VARCHAR)])
        LogicalTableScan(table=[[session, part]])
  LogicalAggregate(group=[{0, 1}], agg#0=[SUM($2)])
    LogicalProject(l_partkey=[$1], l_suppkey=[$2], l_quantity=[$4])
      LogicalFilter(condition=[AND(SEARCH($10, Sarg[[1994-01-01..1995-01-01)]), 
IS NOT NULL($1), IS NOT NULL($2))])
        LogicalTableScan(table=[[session, lineitem]])
{noformat}
The idea of this ticket is to make configurable the rules used by 
{{{}RelDecorrelator#removeCorrelationViaRule{}}}. By default, everything will 
behave as before (the same "default" rules will be applied), so we keep 
backwards compatibility; but we shall offer new methods to allow 
RelDecorrelator's caller to tune the rules to be used in this step.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to