Why wouldn't costing alone solve this? It seems like the Join operator could understand these types of things enough to have a smarter costing function.
On Fri, Jul 24, 2015 at 11:47 AM, Aman Sinha <[email protected]> wrote: > In many situations we want to have some kind of join 'affinity' for a pair > of tables T1, T2 where these 2 tables must be joined to each other before > being joined to any other table. > > Two such situations are: > 1. SELECT * FROM t1, t2 WHERE t1.a1 IN (1, 2, 3.....thousands) AND t1.b1 > = t2.b2 > Here, the IN list can be represented by a VALUES operator and exposed > as a table > to be joined to t1. It is important to do this join first before > joining t1 to t2. > - This might be addressed by doing a SemiJoin instead of inner > join. Would that > be the right approach ? > > 2. SELECT * FROM t1, t1_dimension, t2 WHERE .... > Here, I want to join t1 and t1_dimension before joining t1 with t2. > The reason is > t1_dimension has a column that is the primary key of t1, so if there > is a filter on a > dimension column, I can get the corresponding primary keys and do the > join with t1. > One can think of t1_dimension as basically an index table for t1. > - For this case, I don't want to rely on costing since the join > planning may not > necessarily pick the join order I want. Is there a good way to > achieve this in > Calcite ? > > thanks, > Aman >
