Re: Support "Right Semi Join" plan shapes

2024-04-28 Thread wenhui qiu
Hi Richard
 Thank you so much for your tireless work on this,I see the new version
of the patch improves some of the comments .I think it can commit in July


Thanks

On Thu, 25 Apr 2024 at 11:28, Richard Guo  wrote:

> Here is another rebase with a commit message to help review.  I also
> tweaked some comments.
>
> Thanks
> Richard
>


Re: Support "Right Semi Join" plan shapes

2024-04-24 Thread Richard Guo
Here is another rebase with a commit message to help review.  I also
tweaked some comments.

Thanks
Richard


v5-0001-Support-Right-Semi-Join-plan-shapes.patch
Description: Binary data


Re: Support "Right Semi Join" plan shapes

2024-03-09 Thread Alena Rybakina


On 06.03.2024 05:23, wenhui qiu wrote:



  Hi Alena Rybakina
  For merge join
  + /*
  + * For now we do not support RIGHT_SEMI join in mergejoin.
  + */
  + if (jointype == JOIN_RIGHT_SEMI)
  + {
  + *mergejoin_allowed = false;
  + return NIL;
  + }
  +
  Tanks



Yes, I see it, thank you. Sorry for the noise.

--
Regards,
Alena Rybakina
Postgres Professional:http://www.postgrespro.com
The Russian Postgres Company


Re: Support "Right Semi Join" plan shapes

2024-03-05 Thread wenhui qiu
Hi Alena Rybakina
For merge join
+ /*
+ * For now we do not support RIGHT_SEMI join in mergejoin.
+ */
+ if (jointype == JOIN_RIGHT_SEMI)
+ {
+ *mergejoin_allowed = false;
+ return NIL;
+ }
+
Tanks

On Wed, 6 Mar 2024 at 04:10, Alena Rybakina 
wrote:

> To be honest, I didn't see it in the code, could you tell me where they
> are, please?
> On 05.03.2024 05:44, Richard Guo wrote:
>
>
> On Tue, Jan 30, 2024 at 2:51 PM Alena Rybakina 
> wrote:
>
>> I have reviewed your patch and I think it is better to add an Assert for
>> JOIN_RIGHT_SEMI to the ExecMergeJoin and ExecNestLoop functions to
>> prevent the use of RIGHT_SEMI for these types of connections (NestedLoop
>> and MergeJoin).
>
>
> Hmm, I don't see why this is necessary.  The planner should already
> guarantee that we won't have nestloops/mergejoins with right-semi joins.
>
> Thanks
> Richard
>
> --
> Regards,
> Alena Rybakina
> Postgres Professional: http://www.postgrespro.com
> The Russian Postgres Company
>
>


Re: Support "Right Semi Join" plan shapes

2024-03-05 Thread Alena Rybakina
To be honest, I didn't see it in the code, could you tell me where they 
are, please?


On 05.03.2024 05:44, Richard Guo wrote:


On Tue, Jan 30, 2024 at 2:51 PM Alena Rybakina 
 wrote:


I have reviewed your patch and I think it is better to add an
Assert for
JOIN_RIGHT_SEMI to the ExecMergeJoin and ExecNestLoop functions to
prevent the use of RIGHT_SEMI for these types of connections
(NestedLoop
and MergeJoin).


Hmm, I don't see why this is necessary.  The planner should already
guarantee that we won't have nestloops/mergejoins with right-semi joins.

Thanks
Richard


--
Regards,
Alena Rybakina
Postgres Professional:http://www.postgrespro.com
The Russian Postgres Company


Re: Support "Right Semi Join" plan shapes

2024-03-04 Thread wenhui qiu
Hi Richard
 Agree +1 ,I think can push now.

Richard

On Tue, 5 Mar 2024 at 10:44, Richard Guo  wrote:

>
> On Tue, Jan 30, 2024 at 2:51 PM Alena Rybakina 
> wrote:
>
>> I have reviewed your patch and I think it is better to add an Assert for
>> JOIN_RIGHT_SEMI to the ExecMergeJoin and ExecNestLoop functions to
>> prevent the use of RIGHT_SEMI for these types of connections (NestedLoop
>> and MergeJoin).
>
>
> Hmm, I don't see why this is necessary.  The planner should already
> guarantee that we won't have nestloops/mergejoins with right-semi joins.
>
> Thanks
> Richard
>


Re: Support "Right Semi Join" plan shapes

2024-03-04 Thread Richard Guo
On Tue, Jan 30, 2024 at 2:51 PM Alena Rybakina 
wrote:

> I have reviewed your patch and I think it is better to add an Assert for
> JOIN_RIGHT_SEMI to the ExecMergeJoin and ExecNestLoop functions to
> prevent the use of RIGHT_SEMI for these types of connections (NestedLoop
> and MergeJoin).


Hmm, I don't see why this is necessary.  The planner should already
guarantee that we won't have nestloops/mergejoins with right-semi joins.

Thanks
Richard


Re: Support "Right Semi Join" plan shapes

2024-03-04 Thread Richard Guo
On Mon, Mar 4, 2024 at 10:33 AM wenhui qiu  wrote:

> HI Richard
>  Now it is starting the last commitfest for v17, can you respond to
> Alena Rybakina points?
>

Thanks for reminding.  Will do that soon.

Thanks
Richard


Re: Support "Right Semi Join" plan shapes

2024-03-03 Thread wenhui qiu
HI Richard
 Now it is starting the last commitfest for v17, can you respond to
Alena Rybakina points?


Regards

On Thu, 8 Feb 2024 at 13:50, wenhui qiu  wrote:

> Hi Alena Rybakina
>  I saw this code snippet also disable mergejoin ,I think it same  effect
> + /*
> + * For now we do not support RIGHT_SEMI join in mergejoin.
> + */
> + if (jointype == JOIN_RIGHT_SEMI)
> + {
> + *mergejoin_allowed = false;
> + return NIL;
> + }
> +
>
> Regards
>
> Alena Rybakina  于2024年1月30日周二 14:51写道:
>
>> Hi! Thank you for your work on this subject.
>>
>> I have reviewed your patch and I think it is better to add an Assert for
>> JOIN_RIGHT_SEMI to the ExecMergeJoin and ExecNestLoop functions to
>> prevent the use of RIGHT_SEMI for these types of connections (NestedLoop
>> and MergeJoin).
>> Mostly I'm suggesting this because of the set_join_pathlist_hook
>> function, which is in the add_paths_to_joinrel function, which allows
>> you to create a custom node. What do you think?
>>
>> --
>> Regards,
>> Alena Rybakina
>> Postgres Professional: http://www.postgrespro.com
>> The Russian Postgres Company
>>
>>


Re: Support "Right Semi Join" plan shapes

2024-02-07 Thread wenhui qiu
Hi Alena Rybakina
 I saw this code snippet also disable mergejoin ,I think it same  effect
+ /*
+ * For now we do not support RIGHT_SEMI join in mergejoin.
+ */
+ if (jointype == JOIN_RIGHT_SEMI)
+ {
+ *mergejoin_allowed = false;
+ return NIL;
+ }
+

Regards

Alena Rybakina  于2024年1月30日周二 14:51写道:

> Hi! Thank you for your work on this subject.
>
> I have reviewed your patch and I think it is better to add an Assert for
> JOIN_RIGHT_SEMI to the ExecMergeJoin and ExecNestLoop functions to
> prevent the use of RIGHT_SEMI for these types of connections (NestedLoop
> and MergeJoin).
> Mostly I'm suggesting this because of the set_join_pathlist_hook
> function, which is in the add_paths_to_joinrel function, which allows
> you to create a custom node. What do you think?
>
> --
> Regards,
> Alena Rybakina
> Postgres Professional: http://www.postgrespro.com
> The Russian Postgres Company
>
>


Re: Support "Right Semi Join" plan shapes

2024-01-29 Thread Alena Rybakina

Hi! Thank you for your work on this subject.

I have reviewed your patch and I think it is better to add an Assert for 
JOIN_RIGHT_SEMI to the ExecMergeJoin and ExecNestLoop functions to 
prevent the use of RIGHT_SEMI for these types of connections (NestedLoop 
and MergeJoin).
Mostly I'm suggesting this because of the set_join_pathlist_hook 
function, which is in the add_paths_to_joinrel function, which allows 
you to create a custom node. What do you think?


--
Regards,
Alena Rybakina
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company





Re: Support "Right Semi Join" plan shapes

2024-01-23 Thread wenhui qiu
Hi  vignesh C
   Many thanks, I have marked it to  "ready for committer"

Best wish

vignesh C  于2024年1月23日周二 10:56写道:

> On Mon, 22 Jan 2024 at 11:27, wenhui qiu  wrote:
> >
> > Hi  vignesh CI saw this path has been passed (
> https://cirrus-ci.com/build/6109321080078336),can we push it?
>
> If you have found no comments from your review and testing, let's mark
> it as "ready for committer".
>
> Regards,
> Vignesh
>


Re: Support "Right Semi Join" plan shapes

2024-01-22 Thread vignesh C
On Mon, 22 Jan 2024 at 11:27, wenhui qiu  wrote:
>
> Hi  vignesh CI saw this path has been passed 
> (https://cirrus-ci.com/build/6109321080078336),can we push it?

If you have found no comments from your review and testing, let's mark
it as "ready for committer".

Regards,
Vignesh




Re: Support "Right Semi Join" plan shapes

2024-01-21 Thread wenhui qiu
Hi  vignesh CI saw this path has been passed (
https://cirrus-ci.com/build/6109321080078336),can we push it?

Best wish

Richard Guo  于2024年1月9日周二 18:49写道:

>
> On Sun, Jan 7, 2024 at 3:03 PM vignesh C  wrote:
>
>> One of the tests in CFBot has failed at [1] with:
>> -   Relations: (public.ft1 t1) SEMI JOIN (public.ft2 t2)
>> -   Remote SQL: SELECT r1."C 1", r1.c2, r1.c3, r1.c4, r1.c5, r1.c6,
>> r1.c7, r1.c8 FROM "S 1"."T 1" r1 WHERE ((r1."C 1" < 20)) AND EXISTS
>> (SELECT NULL FROM "S 1"."T 1" r3 WHERE ((r3."C 1" > 10)) AND
>> ((date(r3.c5) = '1970-01-17'::date)) AND ((r3.c3 = r1.c3))) ORDER BY
>> r1."C 1" ASC NULLS LAST
>> -(4 rows)
>> +   Sort Key: t1.c1
>> +   ->  Foreign Scan
>> + Output: t1.c1, t1.c2, t1.c3, t1.c4, t1.c5, t1.c6, t1.c7, t1.c8
>> + Relations: (public.ft1 t1) SEMI JOIN (public.ft2 t2)
>> + Remote SQL: SELECT r1."C 1", r1.c2, r1.c3, r1.c4, r1.c5,
>> r1.c6, r1.c7, r1.c8 FROM "S 1"."T 1" r1 WHERE ((r1."C 1" < 20)) AND
>> EXISTS (SELECT NULL FROM "S 1"."T 1" r3 WHERE ((r3."C 1" > 10)) AND
>> ((date(r3.c5) = '1970-01-17'::date)) AND ((r3.c3 = r1.c3)))
>> +(7 rows)
>
>
> Thanks.  I looked into it and have figured out why the plan differs.
> With this patch the SEMI JOIN that is pushed down to the remote server
> is now implemented using JOIN_RIGHT_SEMI, whereas previously it was
> implemented using JOIN_SEMI.  Consequently, this leads to changes in the
> costs of the paths: path with the sort pushed down to remote server, and
> path with the sort added atop the foreign join.  And at last the latter
> one wins by a slim margin.
>
> I think we can simply update the expected file to fix this plan diff, as
> attached.
>
> Thanks
> Richard
>


Re: Support "Right Semi Join" plan shapes

2024-01-09 Thread Richard Guo
On Sun, Jan 7, 2024 at 3:03 PM vignesh C  wrote:

> One of the tests in CFBot has failed at [1] with:
> -   Relations: (public.ft1 t1) SEMI JOIN (public.ft2 t2)
> -   Remote SQL: SELECT r1."C 1", r1.c2, r1.c3, r1.c4, r1.c5, r1.c6,
> r1.c7, r1.c8 FROM "S 1"."T 1" r1 WHERE ((r1."C 1" < 20)) AND EXISTS
> (SELECT NULL FROM "S 1"."T 1" r3 WHERE ((r3."C 1" > 10)) AND
> ((date(r3.c5) = '1970-01-17'::date)) AND ((r3.c3 = r1.c3))) ORDER BY
> r1."C 1" ASC NULLS LAST
> -(4 rows)
> +   Sort Key: t1.c1
> +   ->  Foreign Scan
> + Output: t1.c1, t1.c2, t1.c3, t1.c4, t1.c5, t1.c6, t1.c7, t1.c8
> + Relations: (public.ft1 t1) SEMI JOIN (public.ft2 t2)
> + Remote SQL: SELECT r1."C 1", r1.c2, r1.c3, r1.c4, r1.c5,
> r1.c6, r1.c7, r1.c8 FROM "S 1"."T 1" r1 WHERE ((r1."C 1" < 20)) AND
> EXISTS (SELECT NULL FROM "S 1"."T 1" r3 WHERE ((r3."C 1" > 10)) AND
> ((date(r3.c5) = '1970-01-17'::date)) AND ((r3.c3 = r1.c3)))
> +(7 rows)


Thanks.  I looked into it and have figured out why the plan differs.
With this patch the SEMI JOIN that is pushed down to the remote server
is now implemented using JOIN_RIGHT_SEMI, whereas previously it was
implemented using JOIN_SEMI.  Consequently, this leads to changes in the
costs of the paths: path with the sort pushed down to remote server, and
path with the sort added atop the foreign join.  And at last the latter
one wins by a slim margin.

I think we can simply update the expected file to fix this plan diff, as
attached.

Thanks
Richard


v4-0001-Support-Right-Semi-Join-plan-shapes.patch
Description: Binary data


Re: Support "Right Semi Join" plan shapes

2024-01-06 Thread vignesh C
On Wed, 1 Nov 2023 at 11:25, Richard Guo  wrote:
>
>
> On Thu, Aug 10, 2023 at 3:24 PM Richard Guo  wrote:
>>
>> The cfbot reminds that this patch does not apply any more, so rebase it
>> to v2.
>
>
> Attached is another rebase over the latest master.  Any feedback is
> appreciated.

One of the tests in CFBot has failed at [1] with:
-   Relations: (public.ft1 t1) SEMI JOIN (public.ft2 t2)
-   Remote SQL: SELECT r1."C 1", r1.c2, r1.c3, r1.c4, r1.c5, r1.c6,
r1.c7, r1.c8 FROM "S 1"."T 1" r1 WHERE ((r1."C 1" < 20)) AND EXISTS
(SELECT NULL FROM "S 1"."T 1" r3 WHERE ((r3."C 1" > 10)) AND
((date(r3.c5) = '1970-01-17'::date)) AND ((r3.c3 = r1.c3))) ORDER BY
r1."C 1" ASC NULLS LAST
-(4 rows)
+   Sort Key: t1.c1
+   ->  Foreign Scan
+ Output: t1.c1, t1.c2, t1.c3, t1.c4, t1.c5, t1.c6, t1.c7, t1.c8
+ Relations: (public.ft1 t1) SEMI JOIN (public.ft2 t2)
+ Remote SQL: SELECT r1."C 1", r1.c2, r1.c3, r1.c4, r1.c5,
r1.c6, r1.c7, r1.c8 FROM "S 1"."T 1" r1 WHERE ((r1."C 1" < 20)) AND
EXISTS (SELECT NULL FROM "S 1"."T 1" r3 WHERE ((r3."C 1" > 10)) AND
((date(r3.c5) = '1970-01-17'::date)) AND ((r3.c3 = r1.c3)))
+(7 rows)

More details are available at [2].

[1] - https://cirrus-ci.com/task/4868751326183424
[2] - 
https://api.cirrus-ci.com/v1/artifact/task/4868751326183424/testrun/build/testrun/postgres_fdw/regress/regression.diffs

Regards,
Vignesh




Re: Support "Right Semi Join" plan shapes

2023-12-27 Thread wenhui qiu
Hi Richard Guo
 I did a simple test ,Subqueries of type (in) can be supported, There
is a test sql that doesn't support it, and I think that's because it can't
pull up  the subqueries.
```
test=# explain (costs off) SELECT  t1.* FROM prt1_adv t1 WHERE EXISTS
(SELECT 1 FROM prt2_adv t2 WHERE t1.a = t2.b) AND t1.b = 0 ORDER BY t1.a;
  QUERY PLAN
--
 Sort
   Sort Key: t1.a
   ->  Hash Right Semi Join
 Hash Cond: (t2.b = t1.a)
 ->  Append
   ->  Seq Scan on prt2_adv_p1 t2_1
   ->  Seq Scan on prt2_adv_p2 t2_2
   ->  Seq Scan on prt2_adv_p3 t2_3
 ->  Hash
   ->  Append
 ->  Seq Scan on prt1_adv_p1 t1_1
   Filter: (b = 0)
 ->  Seq Scan on prt1_adv_p2 t1_2
   Filter: (b = 0)
 ->  Seq Scan on prt1_adv_p3 t1_3
   Filter: (b = 0)
(16 rows)

test=# explain (costs off) SELECT t1.* FROM prt1_adv t1 WHERE t1.a IN
(SELECT t2.b FROM prt2_adv t2) AND t1.b = 0 ORDER BY t1.a;
  QUERY PLAN
--
 Sort
   Sort Key: t1.a
   ->  Hash Right Semi Join
 Hash Cond: (t2.b = t1.a)
 ->  Append
   ->  Seq Scan on prt2_adv_p1 t2_1
   ->  Seq Scan on prt2_adv_p2 t2_2
   ->  Seq Scan on prt2_adv_p3 t2_3
 ->  Hash
   ->  Append
 ->  Seq Scan on prt1_adv_p1 t1_1
   Filter: (b = 0)
 ->  Seq Scan on prt1_adv_p2 t1_2
   Filter: (b = 0)
 ->  Seq Scan on prt1_adv_p3 t1_3
   Filter: (b = 0)
(16 rows)

test=#

test=# explain (costs off) SELECT t1.* FROM plt1_adv t1 WHERE EXISTS
(SELECT 1 FROM plt2_adv t2 WHERE t1.a = t2.a AND t1.c = t2.c) AND t1.b < 10
ORDER BY t1.a;
  QUERY PLAN
--
 Sort
   Sort Key: t1.a
   ->  Hash Right Semi Join
 Hash Cond: ((t2.a = t1.a) AND (t2.c = t1.c))
 ->  Append
   ->  Seq Scan on plt2_adv_p1 t2_1
   ->  Seq Scan on plt2_adv_p2 t2_2
   ->  Seq Scan on plt2_adv_p3 t2_3
 ->  Hash
   ->  Append
 ->  Seq Scan on plt1_adv_p1 t1_1
   Filter: (b < 10)
 ->  Seq Scan on plt1_adv_p2 t1_2
   Filter: (b < 10)
 ->  Seq Scan on plt1_adv_p3 t1_3
   Filter: (b < 10)
(16 rows)

test=#
test=# explain (costs off) SELECT t1.* FROM plt1_adv t1 WHERE (t1.a, t1.c)
IN (SELECT t2.a, t2.c FROM plt2_adv t2) AND t1.b < 10 ORDER BY t1.a;
  QUERY PLAN
--
 Sort
   Sort Key: t1.a
   ->  Hash Right Semi Join
 Hash Cond: ((t2.a = t1.a) AND (t2.c = t1.c))
 ->  Append
   ->  Seq Scan on plt2_adv_p1 t2_1
   ->  Seq Scan on plt2_adv_p2 t2_2
   ->  Seq Scan on plt2_adv_p3 t2_3
 ->  Hash
   ->  Append
 ->  Seq Scan on plt1_adv_p1 t1_1
   Filter: (b < 10)
 ->  Seq Scan on plt1_adv_p2 t1_2
   Filter: (b < 10)
 ->  Seq Scan on plt1_adv_p3 t1_3
   Filter: (b < 10)
(16 rows)


```

```
test=# explain (costs off) select * from int4_tbl i4, tenk1 a
 where exists(select * from tenk1 b
  where a.twothousand = b.twothousand and a.fivethous <>
b.fivethous)
   and i4.f1 = a.tenthous;
   QUERY PLAN
-
 Hash Right Semi Join
   Hash Cond: (b.twothousand = a.twothousand)
   Join Filter: (a.fivethous <> b.fivethous)
   ->  Seq Scan on tenk1 b
   ->  Hash
 ->  Hash Join
   Hash Cond: (a.tenthous = i4.f1)
   ->  Seq Scan on tenk1 a
   ->  Hash
 ->  Seq Scan on int4_tbl i4
(10 rows)

test=# explain (costs off ) SELECT *
FROM int4_tbl i4, tenk1 a
WHERE (a.twothousand, a.fivethous) IN (
SELECT b.twothousand, b.fivethous
FROM tenk1 b
WHERE a.twothousand = b.twothousand  and  a.fivethous <> b.fivethous
)
AND i4.f1 = a.tenthous;
   QUERY PLAN

 Nested Loop
   Join Filter: (i4.f1 = a.tenthous)
   ->  Seq Scan on tenk1 a
 Filter: (SubPlan 1)
 SubPlan 1
   ->  Seq Scan on tenk1 b
 Filter: ((a.fivethous <> fivethous) AND (a.twothousand =
twothousand))
   ->  Materialize
 ->  Seq Scan on int4_tbl i4
(9 rows)
test=# set enable_nestloop =off;
SET
test=# explain (costs off ) SELECT *

Support "Right Semi Join" plan shapes

2023-12-14 Thread wenhui qiu
Hi Richard Guo   I see that the test samples are all (exists)
subqueries ,I think semi join should also support ( in) and ( any)
subqueries. would you do more test on  ( in) and ( any) subqueries?


Best whish


Re: Support "Right Semi Join" plan shapes

2023-10-31 Thread Richard Guo
On Thu, Aug 10, 2023 at 3:24 PM Richard Guo  wrote:

> The cfbot reminds that this patch does not apply any more, so rebase it
> to v2.
>

Attached is another rebase over the latest master.  Any feedback is
appreciated.

Thanks
Richard


v3-0001-Support-Right-Semi-Join-plan-shapes.patch
Description: Binary data


Re: Support "Right Semi Join" plan shapes

2023-08-10 Thread Richard Guo
On Tue, Apr 18, 2023 at 5:07 PM Richard Guo  wrote:

> In thread [1] which discussed 'Right Anti Join', Tom once mentioned
> 'Right Semi Join'.  After a preliminary investigation I think it is
> beneficial and can be implemented with very short change.  With 'Right
> Semi Join', what we want to do is to just have the first match for each
> inner tuple.  For HashJoin, after scanning the hash bucket for matches
> to current outer, we just need to check whether the inner tuple has been
> set match and skip it if so.  For MergeJoin, we can do it by avoiding
> restoring inner scan to the marked tuple in EXEC_MJ_TESTOUTER, in the
> case when new outer tuple == marked tuple.
>
> As that thread is already too long, fork a new thread and attach a patch
> used for discussion.  The patch implements 'Right Semi Join' for
> HashJoin.
>

The cfbot reminds that this patch does not apply any more, so rebase it
to v2.

Thanks
Richard


v2-0001-Support-Right-Semi-Join-plan-shapes.patch
Description: Binary data


Support "Right Semi Join" plan shapes

2023-04-18 Thread Richard Guo
In thread [1] which discussed 'Right Anti Join', Tom once mentioned
'Right Semi Join'.  After a preliminary investigation I think it is
beneficial and can be implemented with very short change.  With 'Right
Semi Join', what we want to do is to just have the first match for each
inner tuple.  For HashJoin, after scanning the hash bucket for matches
to current outer, we just need to check whether the inner tuple has been
set match and skip it if so.  For MergeJoin, we can do it by avoiding
restoring inner scan to the marked tuple in EXEC_MJ_TESTOUTER, in the
case when new outer tuple == marked tuple.

As that thread is already too long, fork a new thread and attach a patch
used for discussion.  The patch implements 'Right Semi Join' for
HashJoin.

[1]
https://www.postgresql.org/message-id/CAMbWs4_eChX1bN%3Dvj0Uzg_7iz9Uivan%2BWjjor-X87L-V27A%2Brw%40mail.gmail.com

Thanks
Richard


v1-0001-Support-Right-Semi-Join-plan-shapes.patch
Description: Binary data