Hi Mads,

Currently, the sql-api converts the logical plans produced by Calcite into Wayang (relational) plans as-is. To achieve what you want, you will have to extend the WayangJoinVisitor.java class, in particular:
https://github.com/apache/incubator-wayang/blob/24d8eec742f21fd5adfbd089e93f96271c6f5a63/wayang-api/wayang-api-sql/src/main/java/org/apache/wayang/api/sql/calcite/converter/WayangJoinVisitor.java#L49
It only supports equi-joins for now, but it could be extended to support the other predicates you require.

Hope this helps.

Best,
Kaustubh

On Fri, Mar 14, 2025 at 2:11 PM Mads Sejer Pedersen <s...@itu.dk.invalid> wrote:

> Hi people,
>
> I am doing some benchmarking with Calcite for the sql-api in Apache Wayang
> that typically requires multi-conditional joins to be split into "binary"
> joins, à la:
>
> LogicalJoin(condition=[AND(=($0, $27), =($10, $28), =($34, $2))], joinType=[inner]): rowcount = 118.65234375, cumulative cost = 1038.96484375
>   LogicalJoin(condition=[=($0, $11)], joinType=[inner]): rowcount = 351.5625, cumulative cost = 820.3125
>     LogicalJoin(condition=[=($0, $3)], joinType=[inner]): rowcount = 93.75, cumulative cost = 343.75
>       LogicalFilter(condition=[SEARCH($1, Sarg['cs':CHAR(11), 'gaming':CHAR(11), 'mathematica']:CHAR(11))]): rowcount = 25.0, cumulative cost = 125.0
>         LogicalTableScan(table=[[postgres, site]]): rowcount = 100.0, cumulative cost = 100.0
>       LogicalFilter(condition=[SEARCH($6, Sarg[[10..100000]])]): rowcount = 25.0, cumulative cost = 125.0
>         LogicalTableScan(table=[[postgres, so_user]]): rowcount = 100.0, cumulative cost = 100.0
>     LogicalFilter(condition=[SEARCH($6, Sarg[[0..100]])]): rowcount = 25.0, cumulative cost = 125.0
>       LogicalTableScan(table=[[postgres, question]]): rowcount = 100.0, cumulative cost = 100.0
>   LogicalTableScan(table=[[postgres, answer]]): rowcount = 100.0, cumulative cost = 100.0
>
>
> BinaryJoin(condition=[=($60, $2)], joinType=[inner])
>   BinaryJoin(condition=[=($10, $41)], joinType=[inner])
>     BinaryJoin(condition=[=($0, $27)], joinType=[inner])
>       LogicalJoin(condition=[=($0, $11)], joinType=[inner])
>         LogicalJoin(condition=[=($0, $3)], joinType=[inner])
>           LogicalFilter(condition=[SEARCH($1, Sarg['cs':CHAR(11), 'gaming':CHAR(11), 'mathematica']:CHAR(11))])
>             LogicalTableScan(table=[[postgres, site]])
>           LogicalFilter(condition=[SEARCH($6, Sarg[[10..100000]])])
>             LogicalTableScan(table=[[postgres, so_user]])
>         LogicalFilter(condition=[SEARCH($6, Sarg[[0..100]])])
>           LogicalTableScan(table=[[postgres, question]])
>       LogicalTableScan(table=[[postgres, answer]])
>     LogicalTableScan(table=[[postgres, answer]])
>   LogicalTableScan(table=[[postgres, answer]])
>
> Does anyone know of a Calcite rule that already does something like this,
> or have a general idea of how such a thing could be implemented? I tried
> using the HepPlanner with a rule-based approach, but there are some
> issues with the difference between how Wayang handles join inputs, i.e.
> left and right, and how Calcite handles them: Calcite uses more of a
> cross type based on the rows of both the left and right inputs.
>
> Thanks
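
The index shifts in the second plan of the quoted message ($28 becoming $41, $34 becoming $60) can be reproduced with a little arithmetic, sketched below in plain Java without any Calcite or Wayang dependency. The class and method names are hypothetical and exist only for this illustration. The sketch assumes the left input of the original join has 27 fields (the answer table starts at $27) and that the answer table has 13 fields; the latter is inferred from the indices in the quoted plan (41 - 28 = 13), not from the actual schema. In Calcite itself, RelOptUtil.conjunctions(RexNode) can break an AND condition into its conjuncts; here each conjunct is reduced to a pair of field indices.

```java
// Hypothetical helper for illustration only; not part of Calcite or Wayang.
final class JoinSplitSketch {

    // A field index that points into the right input of the original join
    // (index >= leftFieldCount) is shifted by k copies of the right input,
    // because the k-th binary join sees k extra copies of that input on its left.
    static int remap(int index, int leftFieldCount, int rightFieldCount, int k) {
        return index >= leftFieldCount ? index + k * rightFieldCount : index;
    }

    // Each element of equiPairs is one conjunct =($a, $b) of the AND condition,
    // given as {a, b}. The k-th result is the condition of the k-th binary join,
    // counted from the innermost join outwards.
    static int[][] splitAndRemap(int[][] equiPairs, int leftFieldCount, int rightFieldCount) {
        int[][] result = new int[equiPairs.length][2];
        for (int k = 0; k < equiPairs.length; k++) {
            result[k][0] = remap(equiPairs[k][0], leftFieldCount, rightFieldCount, k);
            result[k][1] = remap(equiPairs[k][1], leftFieldCount, rightFieldCount, k);
        }
        return result;
    }

    public static void main(String[] args) {
        // AND(=($0, $27), =($10, $28), =($34, $2)) from the quoted plan.
        int[][] pairs = {{0, 27}, {10, 28}, {34, 2}};
        // 27 and 13 are the assumed field counts described above.
        for (int[] p : splitAndRemap(pairs, 27, 13)) {
            System.out.println("=($" + p[0] + ", $" + p[1] + ")");
        }
        // prints:
        // =($0, $27)
        // =($10, $41)
        // =($60, $2)
    }
}
```

In a real converter the same shift would have to be applied to the input references of each conjunct before it is handed to the join visitor, since the visitor addresses the left and right inputs separately while Calcite indexes into the concatenated row.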