I was reading a paper on null-aware anti-joins (section 6 in [1]) and noticed that they use <> ALL to represent NOT IN semantics. We currently execute ALL SubLinks using a nested-loop SubPlan. This made me realize that it could be beneficial to convert ALL SubLinks to ANY SubLinks.
According to De Morgan's laws: foo op ALL (sub-SELECT) => NOT (foo negator_op ANY (sub-SELECT)) NOT (foo op ALL (sub-SELECT)) => foo negator_op ANY (sub-SELECT) This conversion makes it possible for the executor to evaluate the sublink using a hashed SubPlan, which replaces a nested loop with a more efficient hash probe. Even better, because our planner already knows how to pull up ANY SubLinks (and pending the other patch, NOT ANY SubLinks), these transformed clauses have the chance to be optimized directly into semijoins or anti-joins. In the worst-case scenario where the transformed ANY sublink cannot be pulled up and cannot be hashed, execution falls back to a nested-loop SubPlan. Performance in this scenario is effectively identical to the legacy ALL SubPlan, so there should be no regressions. Because the operator is negated, the ANY subplan will short-circuit on the exact same inner tuple that the ALL subplan would have short-circuited on. The only added overhead is a single boolean NOT inversion per outer tuple, which is, IMO, computationally negligible compared to the cost of the nested-loop execution itself. Attached is a draft patch illustrating the idea. Any thoughts or interest in this transformation? [1] http://www.vldb.org/pvldb/vol2/vldb09-423.pdf - Richard
v1-0001-Convert-ALL-SubLinks-to-ANY-SubLinks.patch
Description: Binary data
