The literature describes many join algorithms, and the vast majority of them cannot handle theta conditions (either for performance reasons, or because the algorithm was simply not designed for them).
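
To make the distinction concrete, here is a minimal Java sketch of a hash join that hashes on the equi-join keys and checks any remaining theta condition as a residual predicate on each candidate pair. It is only an illustration under my own assumptions, not Calcite's implementation: `HashJoinSketch`, `join`, and the `residual` parameter are hypothetical names, and the sketch ignores SQL NULL semantics and outer joins.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiPredicate;
import java.util.function.Function;

// Hypothetical illustration; none of these names come from Calcite.
final class HashJoinSketch {

  /**
   * Inner hash join on equal keys. The residual predicate carries the
   * non-equi (theta) part of the condition; passing (l, r) -> true
   * gives a plain equi hash join.
   */
  static <L, R, K> List<Map.Entry<L, R>> join(
      List<L> left,
      List<R> right,
      Function<L, K> leftKey,
      Function<R, K> rightKey,
      BiPredicate<L, R> residual) {
    // Build phase: hash the right input on its equi-join key.
    Map<K, List<R>> table = new HashMap<>();
    for (R r : right) {
      table.computeIfAbsent(rightKey.apply(r), k -> new ArrayList<>()).add(r);
    }
    // Probe phase: look up each left row by key, then apply the
    // residual predicate to every candidate pair in the bucket.
    List<Map.Entry<L, R>> out = new ArrayList<>();
    for (L l : left) {
      List<R> bucket =
          table.getOrDefault(leftKey.apply(l), Collections.<R>emptyList());
      for (R r : bucket) {
        if (residual.test(l, r)) {
          out.add(new AbstractMap.SimpleImmutableEntry<>(l, r));
        }
      }
    }
    return out;
  }
}
```

Note that the build and probe phases still need at least one equality key; with a pure theta condition every row would land in a single bucket and the algorithm would degrade to a nested loop, which is one reason an algorithm may reasonably restrict itself to equi-joins.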

For the existing EnumerableXXXJoin variants, it may be possible to extend the algorithm to handle theta joins without structural changes (as I think is the case in [3]). In those cases, I agree with you that it does not make sense to have many variants (e.g., EnumerableThetaHashJoin seems redundant). In the future we may decide to use other hash-based implementations with certain applicability restrictions (such as applying only to equi-joins), but let's not worry too much about that now.

Regarding the rules, I think it is useful to have different variants. Consider, for example, a system that has no join processing algorithms that can handle theta joins; pushing non-equi conditions into a join seems undesirable in this case.

For more information regarding join processing, there are a few interesting (and old) surveys [4, 5] that outline some of the most typical algorithms used in relational databases.

Best,
Stamatis

[4] Query evaluation techniques for large databases (https://web.stanford.edu/class/cs346/2014/graefe.pdf)
[5] Join processing in relational databases (https://www.csd.uoc.gr/~hy460/pdf/p63-mishra.pdf)

On Tue, Apr 23, 2019 at 8:05 AM Yuzhao Chen <[email protected]> wrote:

> Thx, Julian
>
> Why not just support non-equi join conditions for every physical algorithm?
> It does not make much sense to have both HashJoin and HashThetaJoin,
> because a HashThetaJoin with an empty non-equi join condition is the same
> as a HashJoin.
>
> And we can remove the limitations in rules like FilterJoinRule.
>
> Best,
> Danny Chan
>
> On April 23, 2019 at 3:21 AM +0800, [email protected] wrote:
> >
> > If there are limitations, over time we would like to remove those
> > limitations, but we will probably do it by adding new algorithms, and
> > therefore new EnumerableXxx classes.
