Re: Support "Right Semi Join" plan shapes
Hi Richard Thank you so much for your tireless work on this,I see the new version of the patch improves some of the comments .I think it can commit in July Thanks On Thu, 25 Apr 2024 at 11:28, Richard Guo wrote: > Here is another rebase with a commit message to help review. I also > tweaked some comments. > > Thanks > Richard >
Re: Support "Right Semi Join" plan shapes
Here is another rebase with a commit message to help review. I also tweaked some comments. Thanks Richard v5-0001-Support-Right-Semi-Join-plan-shapes.patch Description: Binary data
Re: Support "Right Semi Join" plan shapes
On 06.03.2024 05:23, wenhui qiu wrote: Hi Alena Rybakina For merge join + /* + * For now we do not support RIGHT_SEMI join in mergejoin. + */ + if (jointype == JOIN_RIGHT_SEMI) + { + *mergejoin_allowed = false; + return NIL; + } + Tanks Yes, I see it, thank you. Sorry for the noise. -- Regards, Alena Rybakina Postgres Professional:http://www.postgrespro.com The Russian Postgres Company
Re: Support "Right Semi Join" plan shapes
Hi Alena Rybakina For merge join + /* + * For now we do not support RIGHT_SEMI join in mergejoin. + */ + if (jointype == JOIN_RIGHT_SEMI) + { + *mergejoin_allowed = false; + return NIL; + } + Tanks On Wed, 6 Mar 2024 at 04:10, Alena Rybakina wrote: > To be honest, I didn't see it in the code, could you tell me where they > are, please? > On 05.03.2024 05:44, Richard Guo wrote: > > > On Tue, Jan 30, 2024 at 2:51 PM Alena Rybakina > wrote: > >> I have reviewed your patch and I think it is better to add an Assert for >> JOIN_RIGHT_SEMI to the ExecMergeJoin and ExecNestLoop functions to >> prevent the use of RIGHT_SEMI for these types of connections (NestedLoop >> and MergeJoin). > > > Hmm, I don't see why this is necessary. The planner should already > guarantee that we won't have nestloops/mergejoins with right-semi joins. > > Thanks > Richard > > -- > Regards, > Alena Rybakina > Postgres Professional: http://www.postgrespro.com > The Russian Postgres Company > >
Re: Support "Right Semi Join" plan shapes
To be honest, I didn't see it in the code, could you tell me where they are, please? On 05.03.2024 05:44, Richard Guo wrote: On Tue, Jan 30, 2024 at 2:51 PM Alena Rybakina wrote: I have reviewed your patch and I think it is better to add an Assert for JOIN_RIGHT_SEMI to the ExecMergeJoin and ExecNestLoop functions to prevent the use of RIGHT_SEMI for these types of connections (NestedLoop and MergeJoin). Hmm, I don't see why this is necessary. The planner should already guarantee that we won't have nestloops/mergejoins with right-semi joins. Thanks Richard -- Regards, Alena Rybakina Postgres Professional:http://www.postgrespro.com The Russian Postgres Company
Re: Support "Right Semi Join" plan shapes
Hi Richard Agree +1 ,I think can push now. Richard On Tue, 5 Mar 2024 at 10:44, Richard Guo wrote: > > On Tue, Jan 30, 2024 at 2:51 PM Alena Rybakina > wrote: > >> I have reviewed your patch and I think it is better to add an Assert for >> JOIN_RIGHT_SEMI to the ExecMergeJoin and ExecNestLoop functions to >> prevent the use of RIGHT_SEMI for these types of connections (NestedLoop >> and MergeJoin). > > > Hmm, I don't see why this is necessary. The planner should already > guarantee that we won't have nestloops/mergejoins with right-semi joins. > > Thanks > Richard >
Re: Support "Right Semi Join" plan shapes
On Tue, Jan 30, 2024 at 2:51 PM Alena Rybakina wrote: > I have reviewed your patch and I think it is better to add an Assert for > JOIN_RIGHT_SEMI to the ExecMergeJoin and ExecNestLoop functions to > prevent the use of RIGHT_SEMI for these types of connections (NestedLoop > and MergeJoin). Hmm, I don't see why this is necessary. The planner should already guarantee that we won't have nestloops/mergejoins with right-semi joins. Thanks Richard
Re: Support "Right Semi Join" plan shapes
On Mon, Mar 4, 2024 at 10:33 AM wenhui qiu wrote: > HI Richard > Now it is starting the last commitfest for v17, can you respond to > Alena Rybakina points? > Thanks for reminding. Will do that soon. Thanks Richard
Re: Support "Right Semi Join" plan shapes
HI Richard Now it is starting the last commitfest for v17, can you respond to Alena Rybakina points? Regards On Thu, 8 Feb 2024 at 13:50, wenhui qiu wrote: > Hi Alena Rybakina > I saw this code snippet also disable mergejoin ,I think it same effect > + /* > + * For now we do not support RIGHT_SEMI join in mergejoin. > + */ > + if (jointype == JOIN_RIGHT_SEMI) > + { > + *mergejoin_allowed = false; > + return NIL; > + } > + > > Regards > > Alena Rybakina 于2024年1月30日周二 14:51写道: > >> Hi! Thank you for your work on this subject. >> >> I have reviewed your patch and I think it is better to add an Assert for >> JOIN_RIGHT_SEMI to the ExecMergeJoin and ExecNestLoop functions to >> prevent the use of RIGHT_SEMI for these types of connections (NestedLoop >> and MergeJoin). >> Mostly I'm suggesting this because of the set_join_pathlist_hook >> function, which is in the add_paths_to_joinrel function, which allows >> you to create a custom node. What do you think? >> >> -- >> Regards, >> Alena Rybakina >> Postgres Professional: http://www.postgrespro.com >> The Russian Postgres Company >> >>
Re: Support "Right Semi Join" plan shapes
Hi Alena Rybakina I saw this code snippet also disable mergejoin ,I think it same effect + /* + * For now we do not support RIGHT_SEMI join in mergejoin. + */ + if (jointype == JOIN_RIGHT_SEMI) + { + *mergejoin_allowed = false; + return NIL; + } + Regards Alena Rybakina 于2024年1月30日周二 14:51写道: > Hi! Thank you for your work on this subject. > > I have reviewed your patch and I think it is better to add an Assert for > JOIN_RIGHT_SEMI to the ExecMergeJoin and ExecNestLoop functions to > prevent the use of RIGHT_SEMI for these types of connections (NestedLoop > and MergeJoin). > Mostly I'm suggesting this because of the set_join_pathlist_hook > function, which is in the add_paths_to_joinrel function, which allows > you to create a custom node. What do you think? > > -- > Regards, > Alena Rybakina > Postgres Professional: http://www.postgrespro.com > The Russian Postgres Company > >
Re: Support "Right Semi Join" plan shapes
Hi! Thank you for your work on this subject. I have reviewed your patch and I think it is better to add an Assert for JOIN_RIGHT_SEMI to the ExecMergeJoin and ExecNestLoop functions to prevent the use of RIGHT_SEMI for these types of connections (NestedLoop and MergeJoin). Mostly I'm suggesting this because of the set_join_pathlist_hook function, which is in the add_paths_to_joinrel function, which allows you to create a custom node. What do you think? -- Regards, Alena Rybakina Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Re: Support "Right Semi Join" plan shapes
Hi vignesh C Many thanks, I have marked it to "ready for committer" Best wish vignesh C 于2024年1月23日周二 10:56写道: > On Mon, 22 Jan 2024 at 11:27, wenhui qiu wrote: > > > > Hi vignesh CI saw this path has been passed ( > https://cirrus-ci.com/build/6109321080078336),can we push it? > > If you have found no comments from your review and testing, let's mark > it as "ready for committer". > > Regards, > Vignesh >
Re: Support "Right Semi Join" plan shapes
On Mon, 22 Jan 2024 at 11:27, wenhui qiu wrote: > > Hi vignesh CI saw this path has been passed > (https://cirrus-ci.com/build/6109321080078336),can we push it? If you have found no comments from your review and testing, let's mark it as "ready for committer". Regards, Vignesh
Re: Support "Right Semi Join" plan shapes
Hi vignesh CI saw this path has been passed ( https://cirrus-ci.com/build/6109321080078336),can we push it? Best wish Richard Guo 于2024年1月9日周二 18:49写道: > > On Sun, Jan 7, 2024 at 3:03 PM vignesh C wrote: > >> One of the tests in CFBot has failed at [1] with: >> - Relations: (public.ft1 t1) SEMI JOIN (public.ft2 t2) >> - Remote SQL: SELECT r1."C 1", r1.c2, r1.c3, r1.c4, r1.c5, r1.c6, >> r1.c7, r1.c8 FROM "S 1"."T 1" r1 WHERE ((r1."C 1" < 20)) AND EXISTS >> (SELECT NULL FROM "S 1"."T 1" r3 WHERE ((r3."C 1" > 10)) AND >> ((date(r3.c5) = '1970-01-17'::date)) AND ((r3.c3 = r1.c3))) ORDER BY >> r1."C 1" ASC NULLS LAST >> -(4 rows) >> + Sort Key: t1.c1 >> + -> Foreign Scan >> + Output: t1.c1, t1.c2, t1.c3, t1.c4, t1.c5, t1.c6, t1.c7, t1.c8 >> + Relations: (public.ft1 t1) SEMI JOIN (public.ft2 t2) >> + Remote SQL: SELECT r1."C 1", r1.c2, r1.c3, r1.c4, r1.c5, >> r1.c6, r1.c7, r1.c8 FROM "S 1"."T 1" r1 WHERE ((r1."C 1" < 20)) AND >> EXISTS (SELECT NULL FROM "S 1"."T 1" r3 WHERE ((r3."C 1" > 10)) AND >> ((date(r3.c5) = '1970-01-17'::date)) AND ((r3.c3 = r1.c3))) >> +(7 rows) > > > Thanks. I looked into it and have figured out why the plan differs. > With this patch the SEMI JOIN that is pushed down to the remote server > is now implemented using JOIN_RIGHT_SEMI, whereas previously it was > implemented using JOIN_SEMI. Consequently, this leads to changes in the > costs of the paths: path with the sort pushed down to remote server, and > path with the sort added atop the foreign join. And at last the latter > one wins by a slim margin. > > I think we can simply update the expected file to fix this plan diff, as > attached. > > Thanks > Richard >
Re: Support "Right Semi Join" plan shapes
On Sun, Jan 7, 2024 at 3:03 PM vignesh C wrote: > One of the tests in CFBot has failed at [1] with: > - Relations: (public.ft1 t1) SEMI JOIN (public.ft2 t2) > - Remote SQL: SELECT r1."C 1", r1.c2, r1.c3, r1.c4, r1.c5, r1.c6, > r1.c7, r1.c8 FROM "S 1"."T 1" r1 WHERE ((r1."C 1" < 20)) AND EXISTS > (SELECT NULL FROM "S 1"."T 1" r3 WHERE ((r3."C 1" > 10)) AND > ((date(r3.c5) = '1970-01-17'::date)) AND ((r3.c3 = r1.c3))) ORDER BY > r1."C 1" ASC NULLS LAST > -(4 rows) > + Sort Key: t1.c1 > + -> Foreign Scan > + Output: t1.c1, t1.c2, t1.c3, t1.c4, t1.c5, t1.c6, t1.c7, t1.c8 > + Relations: (public.ft1 t1) SEMI JOIN (public.ft2 t2) > + Remote SQL: SELECT r1."C 1", r1.c2, r1.c3, r1.c4, r1.c5, > r1.c6, r1.c7, r1.c8 FROM "S 1"."T 1" r1 WHERE ((r1."C 1" < 20)) AND > EXISTS (SELECT NULL FROM "S 1"."T 1" r3 WHERE ((r3."C 1" > 10)) AND > ((date(r3.c5) = '1970-01-17'::date)) AND ((r3.c3 = r1.c3))) > +(7 rows) Thanks. I looked into it and have figured out why the plan differs. With this patch the SEMI JOIN that is pushed down to the remote server is now implemented using JOIN_RIGHT_SEMI, whereas previously it was implemented using JOIN_SEMI. Consequently, this leads to changes in the costs of the paths: path with the sort pushed down to remote server, and path with the sort added atop the foreign join. And at last the latter one wins by a slim margin. I think we can simply update the expected file to fix this plan diff, as attached. Thanks Richard v4-0001-Support-Right-Semi-Join-plan-shapes.patch Description: Binary data
Re: Support "Right Semi Join" plan shapes
On Wed, 1 Nov 2023 at 11:25, Richard Guo wrote: > > > On Thu, Aug 10, 2023 at 3:24 PM Richard Guo wrote: >> >> The cfbot reminds that this patch does not apply any more, so rebase it >> to v2. > > > Attached is another rebase over the latest master. Any feedback is > appreciated. One of the tests in CFBot has failed at [1] with: - Relations: (public.ft1 t1) SEMI JOIN (public.ft2 t2) - Remote SQL: SELECT r1."C 1", r1.c2, r1.c3, r1.c4, r1.c5, r1.c6, r1.c7, r1.c8 FROM "S 1"."T 1" r1 WHERE ((r1."C 1" < 20)) AND EXISTS (SELECT NULL FROM "S 1"."T 1" r3 WHERE ((r3."C 1" > 10)) AND ((date(r3.c5) = '1970-01-17'::date)) AND ((r3.c3 = r1.c3))) ORDER BY r1."C 1" ASC NULLS LAST -(4 rows) + Sort Key: t1.c1 + -> Foreign Scan + Output: t1.c1, t1.c2, t1.c3, t1.c4, t1.c5, t1.c6, t1.c7, t1.c8 + Relations: (public.ft1 t1) SEMI JOIN (public.ft2 t2) + Remote SQL: SELECT r1."C 1", r1.c2, r1.c3, r1.c4, r1.c5, r1.c6, r1.c7, r1.c8 FROM "S 1"."T 1" r1 WHERE ((r1."C 1" < 20)) AND EXISTS (SELECT NULL FROM "S 1"."T 1" r3 WHERE ((r3."C 1" > 10)) AND ((date(r3.c5) = '1970-01-17'::date)) AND ((r3.c3 = r1.c3))) +(7 rows) More details are available at [2]. [1] - https://cirrus-ci.com/task/4868751326183424 [2] - https://api.cirrus-ci.com/v1/artifact/task/4868751326183424/testrun/build/testrun/postgres_fdw/regress/regression.diffs Regards, Vignesh
Re: Support "Right Semi Join" plan shapes
Hi Richard Guo I did a simple test ,Subqueries of type (in) can be supported, There is a test sql that doesn't support it, and I think that's because it can't pull up the subqueries. ``` test=# explain (costs off) SELECT t1.* FROM prt1_adv t1 WHERE EXISTS (SELECT 1 FROM prt2_adv t2 WHERE t1.a = t2.b) AND t1.b = 0 ORDER BY t1.a; QUERY PLAN -- Sort Sort Key: t1.a -> Hash Right Semi Join Hash Cond: (t2.b = t1.a) -> Append -> Seq Scan on prt2_adv_p1 t2_1 -> Seq Scan on prt2_adv_p2 t2_2 -> Seq Scan on prt2_adv_p3 t2_3 -> Hash -> Append -> Seq Scan on prt1_adv_p1 t1_1 Filter: (b = 0) -> Seq Scan on prt1_adv_p2 t1_2 Filter: (b = 0) -> Seq Scan on prt1_adv_p3 t1_3 Filter: (b = 0) (16 rows) test=# explain (costs off) SELECT t1.* FROM prt1_adv t1 WHERE t1.a IN (SELECT t2.b FROM prt2_adv t2) AND t1.b = 0 ORDER BY t1.a; QUERY PLAN -- Sort Sort Key: t1.a -> Hash Right Semi Join Hash Cond: (t2.b = t1.a) -> Append -> Seq Scan on prt2_adv_p1 t2_1 -> Seq Scan on prt2_adv_p2 t2_2 -> Seq Scan on prt2_adv_p3 t2_3 -> Hash -> Append -> Seq Scan on prt1_adv_p1 t1_1 Filter: (b = 0) -> Seq Scan on prt1_adv_p2 t1_2 Filter: (b = 0) -> Seq Scan on prt1_adv_p3 t1_3 Filter: (b = 0) (16 rows) test=# test=# explain (costs off) SELECT t1.* FROM plt1_adv t1 WHERE EXISTS (SELECT 1 FROM plt2_adv t2 WHERE t1.a = t2.a AND t1.c = t2.c) AND t1.b < 10 ORDER BY t1.a; QUERY PLAN -- Sort Sort Key: t1.a -> Hash Right Semi Join Hash Cond: ((t2.a = t1.a) AND (t2.c = t1.c)) -> Append -> Seq Scan on plt2_adv_p1 t2_1 -> Seq Scan on plt2_adv_p2 t2_2 -> Seq Scan on plt2_adv_p3 t2_3 -> Hash -> Append -> Seq Scan on plt1_adv_p1 t1_1 Filter: (b < 10) -> Seq Scan on plt1_adv_p2 t1_2 Filter: (b < 10) -> Seq Scan on plt1_adv_p3 t1_3 Filter: (b < 10) (16 rows) test=# test=# explain (costs off) SELECT t1.* FROM plt1_adv t1 WHERE (t1.a, t1.c) IN (SELECT t2.a, t2.c FROM plt2_adv t2) AND t1.b < 10 ORDER BY t1.a; QUERY PLAN -- Sort Sort Key: t1.a -> Hash Right Semi Join Hash Cond: ((t2.a = t1.a) AND (t2.c = t1.c)) -> Append -> Seq Scan on plt2_adv_p1 t2_1 -> Seq Scan on plt2_adv_p2 t2_2 -> Seq Scan on plt2_adv_p3 t2_3 -> Hash -> Append -> Seq Scan on plt1_adv_p1 t1_1 Filter: (b < 10) -> Seq Scan on plt1_adv_p2 t1_2 Filter: (b < 10) -> Seq Scan on plt1_adv_p3 t1_3 Filter: (b < 10) (16 rows) ``` ``` test=# explain (costs off) select * from int4_tbl i4, tenk1 a where exists(select * from tenk1 b where a.twothousand = b.twothousand and a.fivethous <> b.fivethous) and i4.f1 = a.tenthous; QUERY PLAN - Hash Right Semi Join Hash Cond: (b.twothousand = a.twothousand) Join Filter: (a.fivethous <> b.fivethous) -> Seq Scan on tenk1 b -> Hash -> Hash Join Hash Cond: (a.tenthous = i4.f1) -> Seq Scan on tenk1 a -> Hash -> Seq Scan on int4_tbl i4 (10 rows) test=# explain (costs off ) SELECT * FROM int4_tbl i4, tenk1 a WHERE (a.twothousand, a.fivethous) IN ( SELECT b.twothousand, b.fivethous FROM tenk1 b WHERE a.twothousand = b.twothousand and a.fivethous <> b.fivethous ) AND i4.f1 = a.tenthous; QUERY PLAN Nested Loop Join Filter: (i4.f1 = a.tenthous) -> Seq Scan on tenk1 a Filter: (SubPlan 1) SubPlan 1 -> Seq Scan on tenk1 b Filter: ((a.fivethous <> fivethous) AND (a.twothousand = twothousand)) -> Materialize -> Seq Scan on int4_tbl i4 (9 rows) test=# set enable_nestloop =off; SET test=# explain (costs off ) SELECT *
Support "Right Semi Join" plan shapes
Hi Richard Guo I see that the test samples are all (exists) subqueries ,I think semi join should also support ( in) and ( any) subqueries. would you do more test on ( in) and ( any) subqueries? Best whish
Re: Support "Right Semi Join" plan shapes
On Thu, Aug 10, 2023 at 3:24 PM Richard Guo wrote: > The cfbot reminds that this patch does not apply any more, so rebase it > to v2. > Attached is another rebase over the latest master. Any feedback is appreciated. Thanks Richard v3-0001-Support-Right-Semi-Join-plan-shapes.patch Description: Binary data
Re: Support "Right Semi Join" plan shapes
On Tue, Apr 18, 2023 at 5:07 PM Richard Guo wrote: > In thread [1] which discussed 'Right Anti Join', Tom once mentioned > 'Right Semi Join'. After a preliminary investigation I think it is > beneficial and can be implemented with very short change. With 'Right > Semi Join', what we want to do is to just have the first match for each > inner tuple. For HashJoin, after scanning the hash bucket for matches > to current outer, we just need to check whether the inner tuple has been > set match and skip it if so. For MergeJoin, we can do it by avoiding > restoring inner scan to the marked tuple in EXEC_MJ_TESTOUTER, in the > case when new outer tuple == marked tuple. > > As that thread is already too long, fork a new thread and attach a patch > used for discussion. The patch implements 'Right Semi Join' for > HashJoin. > The cfbot reminds that this patch does not apply any more, so rebase it to v2. Thanks Richard v2-0001-Support-Right-Semi-Join-plan-shapes.patch Description: Binary data
Support "Right Semi Join" plan shapes
In thread [1] which discussed 'Right Anti Join', Tom once mentioned 'Right Semi Join'. After a preliminary investigation I think it is beneficial and can be implemented with very short change. With 'Right Semi Join', what we want to do is to just have the first match for each inner tuple. For HashJoin, after scanning the hash bucket for matches to current outer, we just need to check whether the inner tuple has been set match and skip it if so. For MergeJoin, we can do it by avoiding restoring inner scan to the marked tuple in EXEC_MJ_TESTOUTER, in the case when new outer tuple == marked tuple. As that thread is already too long, fork a new thread and attach a patch used for discussion. The patch implements 'Right Semi Join' for HashJoin. [1] https://www.postgresql.org/message-id/CAMbWs4_eChX1bN%3Dvj0Uzg_7iz9Uivan%2BWjjor-X87L-V27A%2Brw%40mail.gmail.com Thanks Richard v1-0001-Support-Right-Semi-Join-plan-shapes.patch Description: Binary data