Re: Using each rel as both outer and inner for JOIN_ANTI

Richard Guo Mon, 10 Apr 2023 02:21:33 -0700

On Fri, Apr 7, 2023 at 3:28 PM Richard Guo <[email protected]> wrote:


> On Tue, Aug 2, 2022 at 3:13 PM Richard Guo <[email protected]> wrote:
>
>> On Sun, Jul 31, 2022 at 12:07 AM Tom Lane <[email protected]> wrote:
>>
>>> [ wanders away wondering if JOIN_RIGHT_SEMI should become a thing ... ]
>>
>> Maybe this is something we can do. Currently for the query below:
>>
>> # explain select * from foo where a in (select c from bar);
>>                                QUERY PLAN
>> -------------------------------------------------------------------------
>>  Hash Semi Join  (cost=154156.00..173691.29 rows=10 width=8)
>>    Hash Cond: (foo.a = bar.c)
>>    ->  Seq Scan on foo  (cost=0.00..1.10 rows=10 width=8)
>>    ->  Hash  (cost=72124.00..72124.00 rows=5000000 width=4)
>>          ->  Seq Scan on bar  (cost=0.00..72124.00 rows=5000000 width=4)
>> (5 rows)
>>
>> I believe we can get a cheaper plan if we are able to swap the outer and
>> inner for SEMI JOIN and use the smaller 'foo' as inner rel.
>>
> It may not be easy for MergeJoin and NestLoop though, as we do not have
> a way to know if an inner tuple has been already matched or not.  But
> the benefit of swapping inputs for MergeJoin and NestLoop seems to be
> small, so I think it's OK to ignore them.
>

Hmm.  Actually we can do it for MergeJoin by avoiding restoring inner
scan to the marked tuple in EXEC_MJ_TESTOUTER, in the case when new
outer tuple == marked tuple.  But I'm not sure how much benefit we can
get from Merge Right Semi Join.

For HashJoin, though, there are cases that can surely benefit from Hash
Right Semi Join.  So I go ahead and have a try on it as attached.

Thanks
Richard

v1-0001-Support-Right-Semi-Join.patch
Description: Binary data

Re: Using each rel as both outer and inner for JOIN_ANTI

Reply via email to