Hi people,

I am doing some benchmarking with Calcite for the sql-api in Apache Wayang that 
requires typically multiconditional joins to be split into "binary" joins ala:
LogicalJoin(condition=[AND(=($0, $27), =($10, $28), =($34, $2))], 
joinType=[inner]): rowcount = 118.65234375, cumulative cost = 1038.96484375
                LogicalJoin(condition=[=($0, $11)], joinType=[inner]): rowcount 
= 351.5625, cumulative cost = 820.3125
                  LogicalJoin(condition=[=($0, $3)], joinType=[inner]): 
rowcount = 93.75, cumulative cost = 343.75
                    LogicalFilter(condition=[SEARCH($1, Sarg['cs':CHAR(11), 
'gaming':CHAR(11), 'mathematica']:CHAR(11))]): rowcount = 25.0, cumulative cost 
= 125.0
                      LogicalTableScan(table=[[postgres, site]]): rowcount = 
100.0, cumulative cost = 100.0
                    LogicalFilter(condition=[SEARCH($6, Sarg[[10..100000]])]): 
rowcount = 25.0, cumulative cost = 125.0
                      LogicalTableScan(table=[[postgres, so_user]]): rowcount = 
100.0, cumulative cost = 100.0
                  LogicalFilter(condition=[SEARCH($6, Sarg[[0..100]])]): 
rowcount = 25.0, cumulative cost = 125.0
                    LogicalTableScan(table=[[postgres, question]]): rowcount = 
100.0, cumulative cost = 100.0
                LogicalTableScan(table=[[postgres, answer]]): rowcount = 100.0, 
cumulative cost = 100.0


BinaryJoin(condition=[=($60, $2)], joinType=[inner])
  BinaryJoin(condition=[=($10, $41)], joinType=[inner])
    BinaryJoin(condition=[=($0, $27)], joinType=[inner])
      LogicalJoin(condition=[=($0, $11)], joinType=[inner])
        LogicalJoin(condition=[=($0, $3)], joinType=[inner])
          LogicalFilter(condition=[SEARCH($1, Sarg['cs':CHAR(11), 
'gaming':CHAR(11), 'mathematica']:CHAR(11))])
            LogicalTableScan(table=[[postgres, site]])
          LogicalFilter(condition=[SEARCH($6, Sarg[[10..100000]])])
            LogicalTableScan(table=[[postgres, so_user]])
        LogicalFilter(condition=[SEARCH($6, Sarg[[0..100]])])
          LogicalTableScan(table=[[postgres, question]])
      LogicalTableScan(table=[[postgres, answer]])
    LogicalTableScan(table=[[postgres, answer]])
  LogicalTableScan(table=[[postgres, answer]])

Is this something that is already supported in Calcite? I have looked at 
current Calcite rules; JoinToMultiJoinRule, LoptOptimizeJoinRule, but they 
don't quite fit my use case.
Furthermore, if it is not supported, how would one go about implementing such a 
split? I have looked at a rules-based implementation using the hep-planner. But 
I am having issues with how to translate the RexInputRef's indexes to the 
"right" place, as I need the indexes to always point to the new joining table 
rows.

Reply via email to