In many situations we want to have some kind of join 'affinity' for a pair
of tables T1, T2 where these 2 tables must be joined to each other before
being joined to any other table.
Two such situations are:
1. SELECT * FROM t1, t2 WHERE t1.a1 IN (1, 2, 3.....thousands) AND t1.b1
= t2.b2
Here, the IN list can be represented by a VALUES operator and exposed
as a table
to be joined to t1. It is important to do this join first before
joining t1 to t2.
- This might be addressed by doing a SemiJoin instead of inner
join. Would that
be the right approach ?
2. SELECT * FROM t1, t1_dimension, t2 WHERE ....
Here, I want to join t1 and t1_dimension before joining t1 with t2.
The reason is
t1_dimension has a column that is the primary key of t1, so if there
is a filter on a
dimension column, I can get the corresponding primary keys and do the
join with t1.
One can think of t1_dimension as basically an index table for t1.
- For this case, I don't want to rely on costing since the join
planning may not
necessarily pick the join order I want. Is there a good way to
achieve this in
Calcite ?
thanks,
Aman