On Fri, Apr 7, 2023 at 3:28 PM Richard Guo <guofengli...@gmail.com> wrote:
> On Tue, Aug 2, 2022 at 3:13 PM Richard Guo <guofengli...@gmail.com> wrote: > >> On Sun, Jul 31, 2022 at 12:07 AM Tom Lane <t...@sss.pgh.pa.us> wrote: >> >>> [ wanders away wondering if JOIN_RIGHT_SEMI should become a thing ... ] >> >> Maybe this is something we can do. Currently for the query below: >> >> # explain select * from foo where a in (select c from bar); >> QUERY PLAN >> ------------------------------------------------------------------------- >> Hash Semi Join (cost=154156.00..173691.29 rows=10 width=8) >> Hash Cond: (foo.a = bar.c) >> -> Seq Scan on foo (cost=0.00..1.10 rows=10 width=8) >> -> Hash (cost=72124.00..72124.00 rows=5000000 width=4) >> -> Seq Scan on bar (cost=0.00..72124.00 rows=5000000 width=4) >> (5 rows) >> >> I believe we can get a cheaper plan if we are able to swap the outer and >> inner for SEMI JOIN and use the smaller 'foo' as inner rel. >> > It may not be easy for MergeJoin and NestLoop though, as we do not have > a way to know if an inner tuple has been already matched or not. But > the benefit of swapping inputs for MergeJoin and NestLoop seems to be > small, so I think it's OK to ignore them. > Hmm. Actually we can do it for MergeJoin by avoiding restoring inner scan to the marked tuple in EXEC_MJ_TESTOUTER, in the case when new outer tuple == marked tuple. But I'm not sure how much benefit we can get from Merge Right Semi Join. For HashJoin, though, there are cases that can surely benefit from Hash Right Semi Join. So I go ahead and have a try on it as attached. Thanks Richard
v1-0001-Support-Right-Semi-Join.patch
Description: Binary data