I am adding rules to make semi-joins more practical [1] [2] [3].

Which side of a semi-join is conventionally the “build” side? Right now it’s the right. For example, the query

  select * from dept
  where deptno in (select deptno from emp where gender = 'F')

becomes

  SemiJoin(Scan(dept), Filter(Scan(emp), gender = 'F'), 0.deptno = 1.deptno)

Filter(Scan(emp), gender = 'F') is the “build” side, and you can imagine implementing it by populating a hash-table with distinct keys before starting to read from the “dept” table.

If we’re generating left-deep trees, we put the smaller input on the left-hand side. So, should the “build” side of a semi-join go on the left also?

Julian

[1] https://issues.apache.org/jira/browse/OPTIQ-367
[2] https://issues.apache.org/jira/browse/OPTIQ-368
[3] https://issues.apache.org/jira/browse/OPTIQ-369
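
For reference, the build-then-probe implementation described above can be sketched in a few lines of Python. This is only an illustration of the idea, not Optiq code: the function name hash_semi_join, its parameters, and the sample rows are all made up for the example, and it assumes the build side fits in memory.

```python
def hash_semi_join(probe, build, probe_key, build_key):
    """Yield each probe row whose key appears in the build input.

    Hypothetical helper for illustration only; not an Optiq API.
    """
    # Build phase: consume the build input fully, keeping distinct keys.
    # NULL (None) keys are skipped because "x IN (...)" never matches on NULL.
    keys = {build_key(row) for row in build if build_key(row) is not None}
    # Probe phase: stream the other input; each row is emitted at most once,
    # no matter how many build rows share its key.
    for row in probe:
        if probe_key(row) in keys:
            yield row


# Made-up sample data: dept(deptno, name), emp(empno, name, deptno, gender).
dept = [(10, "Sales"), (20, "Marketing"), (30, "Accounts")]
emp = [
    (100, "Fred", 10, "M"),
    (110, "Eve", 20, "F"),
    (120, "Wilma", 20, "F"),
    (130, "Ann", 30, "F"),
]

# Filter(Scan(emp), gender = 'F') is the build side; Scan(dept) is probed.
female_emps = (e for e in emp if e[3] == "F")
result = list(
    hash_semi_join(
        dept,
        female_emps,
        probe_key=lambda d: d[0],
        build_key=lambda e: e[2],
    )
)
```

In this sketch the build side is the second argument, matching the current convention where the build input is on the right. Department 20 has two matching employees but appears once in the result, which is why only distinct keys need to be kept in the hash table.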
