Re: Add semi-join pushdown to postgres_fdw

2024-02-12 Thread Pavel Luzanov

Hi, Alexander!

On 12.02.2024 05:27, Alexander Korotkov wrote:

But the worst thing is that replacing AND with OR causes breaking session and 
server restart:

I haven't managed to reproduce this yet.  Could you give more details:
machine, OS, compile options, backtrace?


We already had off-list conversation with Alexander Pyhalov.

Yesterday, after rebuilding the server, I can't reproduce the error.
I have good reason to believe that the problem was on my side.
On Friday, I tested another patch and built the server several times.
Most likely, I just made a mistake during the server build.

Sorry for the noise.

--
Pavel Luzanov
Postgres Professional:https://postgrespro.com


Re: Add semi-join pushdown to postgres_fdw

2024-02-11 Thread Alexander Korotkov
Hi, Pavel!


On Fri, Feb 9, 2024 at 10:08 PM Pavel Luzanov  wrote:
> But optimization not used for NOT EXISTS:

Right, anti-joins are not supported yet.

> Also, optimization not used after deleting first condition (a.aircraft_code = 
> '320'):

This is a costing issue.  Optimization worlds for me when set
"use_remote_estimate = true" for the server;

> But the worst thing is that replacing AND with OR causes breaking session and 
> server restart:

I haven't managed to reproduce this yet.  Could you give more details:
machine, OS, compile options, backtrace?

--
Regards,
Alexander Korotkov




Re: Add semi-join pushdown to postgres_fdw

2024-02-09 Thread Alexander Korotkov
On Fri, Feb 9, 2024 at 10:08 PM Pavel Luzanov  wrote:
> While playing with this feature I found the following.
>
> Two foreign tables:
> postgres@demo_postgres_fdw(17.0)=# \det aircrafts|seats
>   List of foreign tables
>  Schema |   Table   |   Server
> +---+-
>  public | aircrafts | demo_server
>  public | seats | demo_server
> (2 rows)
>
>
> This query uses optimization:
>
> postgres@demo_postgres_fdw(17.0)=# EXPLAIN (costs off, verbose) SELECT *
> FROM aircrafts a
> WHERE a.aircraft_code = '320' AND EXISTS (
>   SELECT * FROM seats s WHERE s.aircraft_code = a.aircraft_code
> );
>   
>QUERY PLAN 
>   >
> -->
>  Foreign Scan
>Output: a.aircraft_code, a.model, a.range
>Relations: (public.aircrafts a) SEMI JOIN (public.seats s)
>Remote SQL: SELECT r1.aircraft_code, r1.model, r1.range FROM 
> bookings.aircrafts r1 WHERE ((r1.aircraft_code = '320')) AND EXISTS (SELECT 
> NULL FROM bookings.seats r2 WHERE ((r2.aircraft_code =>
> (4 rows)
>
>
> But optimization not used for NOT EXISTS:
>
> postgres@demo_postgres_fdw(17.0)=# EXPLAIN (costs off, verbose) SELECT *
> FROM aircrafts a
> WHERE a.aircraft_code = '320' AND NOT EXISTS (
>   SELECT * FROM seats s WHERE s.aircraft_code = a.aircraft_code
> );
>QUERY PLAN
> 
>  Nested Loop Anti Join
>Output: a.aircraft_code, a.model, a.range
>->  Foreign Scan on public.aircrafts a
>  Output: a.aircraft_code, a.model, a.range
>  Remote SQL: SELECT aircraft_code, model, range FROM 
> bookings.aircrafts WHERE ((aircraft_code = '320'))
>->  Materialize
>  Output: s.aircraft_code
>  ->  Foreign Scan on public.seats s
>Output: s.aircraft_code
>Remote SQL: SELECT aircraft_code FROM bookings.seats WHERE 
> ((aircraft_code = '320'))
> (10 rows)
>
> Also, optimization not used after deleting first condition (a.aircraft_code = 
> '320'):
>
> postgres@demo_postgres_fdw(17.0)=# EXPLAIN (costs off, verbose) SELECT *
> FROM aircrafts a
> WHERE EXISTS (
>   SELECT * FROM seats s WHERE s.aircraft_code = a.aircraft_code
> );
>QUERY PLAN
> 
>  Hash Join
>Output: a.aircraft_code, a.model, a.range
>Inner Unique: true
>Hash Cond: (a.aircraft_code = s.aircraft_code)
>->  Foreign Scan on public.aircrafts a
>  Output: a.aircraft_code, a.model, a.range
>  Remote SQL: SELECT aircraft_code, model, range FROM 
> bookings.aircrafts
>->  Hash
>  Output: s.aircraft_code
>  ->  HashAggregate
>Output: s.aircraft_code
>Group Key: s.aircraft_code
>->  Foreign Scan on public.seats s
>  Output: s.aircraft_code
>  Remote SQL: SELECT aircraft_code FROM bookings.seats
> (15 rows)
>
>
> But the worst thing is that replacing AND with OR causes breaking session and 
> server restart:
>
> postgres@demo_postgres_fdw(17.0)=# EXPLAIN (costs off, verbose) SELECT *
> FROM aircrafts a
> WHERE a.aircraft_code = '320' OR EXISTS (
>   SELECT * FROM seats s WHERE s.aircraft_code = a.aircraft_code
> );
> server closed the connection unexpectedly
> This probably means the server terminated abnormally
> before or while processing the request.
> The connection to the server was lost. Attempting reset: Failed.
> The connection to the server was lost. Attempting reset: Failed.

Thank you, Pavel. I'm looking into this.

--
Regards,
Alexander Korotkov




Re: Add semi-join pushdown to postgres_fdw

2024-02-09 Thread Pavel Luzanov

Hello,

While playing with this feature I found the following.

Two foreign tables:
postgres@demo_postgres_fdw(17.0)=# \det aircrafts|seats
  List of foreign tables
 Schema |   Table   |   Server
+---+-
 public | aircrafts | demo_server
 public | seats | demo_server
(2 rows)


This query uses optimization:

postgres@demo_postgres_fdw(17.0)=# EXPLAIN (costs off, verbose) SELECT *
FROM aircrafts a
WHERE a.aircraft_code = '320' AND EXISTS (
  SELECT * FROM seats s WHERE s.aircraft_code = a.aircraft_code
);
   
  QUERY PLAN   
>
-->
 Foreign Scan
   Output: a.aircraft_code, a.model, a.range
   Relations: (public.aircrafts a) SEMI JOIN (public.seats s)
   Remote SQL: SELECT r1.aircraft_code, r1.model, r1.range FROM bookings.aircrafts 
r1 WHERE ((r1.aircraft_code = '320')) AND EXISTS (SELECT NULL FROM bookings.seats 
r2 WHERE ((r2.aircraft_code =>
(4 rows)


But optimization not used for NOT EXISTS:

postgres@demo_postgres_fdw(17.0)=# EXPLAIN (costs off, verbose) SELECT *
FROM aircrafts a
WHERE a.aircraft_code = '320' AND NOT EXISTS (
  SELECT * FROM seats s WHERE s.aircraft_code = a.aircraft_code
);
   QUERY PLAN

 Nested Loop Anti Join
   Output: a.aircraft_code, a.model, a.range
   ->  Foreign Scan on public.aircrafts a
 Output: a.aircraft_code, a.model, a.range
 Remote SQL: SELECT aircraft_code, model, range FROM bookings.aircrafts 
WHERE ((aircraft_code = '320'))
   ->  Materialize
 Output: s.aircraft_code
 ->  Foreign Scan on public.seats s
   Output: s.aircraft_code
   Remote SQL: SELECT aircraft_code FROM bookings.seats WHERE 
((aircraft_code = '320'))
(10 rows)

Also, optimization not used after deleting first condition (a.aircraft_code = 
'320'):

postgres@demo_postgres_fdw(17.0)=# EXPLAIN (costs off, verbose) SELECT *
FROM aircrafts a
WHERE EXISTS (
  SELECT * FROM seats s WHERE s.aircraft_code = a.aircraft_code
);
   QUERY PLAN

 Hash Join
   Output: a.aircraft_code, a.model, a.range
   Inner Unique: true
   Hash Cond: (a.aircraft_code = s.aircraft_code)
   ->  Foreign Scan on public.aircrafts a
 Output: a.aircraft_code, a.model, a.range
 Remote SQL: SELECT aircraft_code, model, range FROM bookings.aircrafts
   ->  Hash
 Output: s.aircraft_code
 ->  HashAggregate
   Output: s.aircraft_code
   Group Key: s.aircraft_code
   ->  Foreign Scan on public.seats s
 Output: s.aircraft_code
 Remote SQL: SELECT aircraft_code FROM bookings.seats
(15 rows)


But the worst thing is that replacing AND with OR causes breaking session and 
server restart:

postgres@demo_postgres_fdw(17.0)=# EXPLAIN (costs off, verbose) SELECT *
FROM aircrafts a
WHERE a.aircraft_code = '320' OR EXISTS (
  SELECT * FROM seats s WHERE s.aircraft_code = a.aircraft_code
);
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.
The connection to the server was lost. Attempting reset: Failed.

--
Pavel Luzanov
Postgres Professional:https://postgrespro.com


Re: Add semi-join pushdown to postgres_fdw

2023-12-05 Thread Alexander Pyhalov

Alexander Korotkov писал(а) 2023-12-03 23:52:

Hi, Alexander!

On Mon, Nov 27, 2023 at 5:11 PM Alexander Pyhalov
 wrote:

Alexander Korotkov писал(а) 2023-11-27 03:49:

> Thank you for the revision.
>
> I've revised the patch myself.  I've replaced StringInfo with
> additional conds into a list of strings as I proposed before.  I think
> the code became much clearer.  Also, it gets rid of some unnecessary
> allocations.
>
> I think the code itself is not in bad shape.  But patch lacks some
> high-level description of semi-joins processing as well as comments on
> each manipulation with additional conds.  Could you please add this?
>

Hi. The updated patch looks better. It seems I've failed to fix logic 
in

deparseFromExprForRel() when tried to convert StringInfos to Lists.

I've added some comments. The most complete description of how 
SEMI-JOIN

is processed, is located in deparseFromExprForRel(). Unfortunately,
there seems to be no single place, describing current JOIN deparsing
logic.


Looks good to me. I've made some grammar and formatting adjustments.
Also, I've written the commit message.

Now, I think this looks good.  I'm going to push this if no objections.

--
Regards,
Alexander Korotkov


Hi. No objections from my side.

Perhaps, some rephrasing is needed in comment in semijoin_target_ok():

"The planner can create semi-joins, which refer to inner rel
vars in its target list."

Perhaps, change "semi-joins, which refer" to "a semi-join, which refers 
...",

as later we speak about "its" target list.

--
Best regards,
Alexander Pyhalov,
Postgres Professional




Re: Add semi-join pushdown to postgres_fdw

2023-12-03 Thread Alexander Korotkov
Hi, Alexander!

On Mon, Nov 27, 2023 at 5:11 PM Alexander Pyhalov
 wrote:
> Alexander Korotkov писал(а) 2023-11-27 03:49:
>
> > Thank you for the revision.
> >
> > I've revised the patch myself.  I've replaced StringInfo with
> > additional conds into a list of strings as I proposed before.  I think
> > the code became much clearer.  Also, it gets rid of some unnecessary
> > allocations.
> >
> > I think the code itself is not in bad shape.  But patch lacks some
> > high-level description of semi-joins processing as well as comments on
> > each manipulation with additional conds.  Could you please add this?
> >
>
> Hi. The updated patch looks better. It seems I've failed to fix logic in
> deparseFromExprForRel() when tried to convert StringInfos to Lists.
>
> I've added some comments. The most complete description of how SEMI-JOIN
> is processed, is located in deparseFromExprForRel(). Unfortunately,
> there seems to be no single place, describing current JOIN deparsing
> logic.

Looks good to me. I've made some grammar and formatting adjustments.
Also, I've written the commit message.

Now, I think this looks good.  I'm going to push this if no objections.

--
Regards,
Alexander Korotkov


0001-Add-support-for-deparsing-semi-joins-to-contrib-p-v8.patch
Description: Binary data


Re: Add semi-join pushdown to postgres_fdw

2023-11-27 Thread Alexander Pyhalov

Alexander Korotkov писал(а) 2023-11-27 03:49:


Thank you for the revision.

I've revised the patch myself.  I've replaced StringInfo with
additional conds into a list of strings as I proposed before.  I think
the code became much clearer.  Also, it gets rid of some unnecessary
allocations.

I think the code itself is not in bad shape.  But patch lacks some
high-level description of semi-joins processing as well as comments on
each manipulation with additional conds.  Could you please add this?



Hi. The updated patch looks better. It seems I've failed to fix logic in 
deparseFromExprForRel() when tried to convert StringInfos to Lists.


I've added some comments. The most complete description of how SEMI-JOIN 
is processed, is located in deparseFromExprForRel(). Unfortunately,
there seems to be no single place, describing current JOIN deparsing 
logic.


--
Best regards,
Alexander Pyhalov,
Postgres ProfessionalFrom c17e05d09d5721d22785ed11bed053162d67d967 Mon Sep 17 00:00:00 2001
From: Alexander Pyhalov 
Date: Mon, 27 Nov 2023 14:35:29 +0300
Subject: [PATCH] postgres_fdw: add support for deparsing semi joins

---
 contrib/postgres_fdw/deparse.c| 234 ++---
 .../postgres_fdw/expected/postgres_fdw.out| 320 --
 contrib/postgres_fdw/postgres_fdw.c   |  94 -
 contrib/postgres_fdw/postgres_fdw.h   |   3 +
 contrib/postgres_fdw/sql/postgres_fdw.sql | 126 ++-
 5 files changed, 696 insertions(+), 81 deletions(-)

diff --git a/contrib/postgres_fdw/deparse.c b/contrib/postgres_fdw/deparse.c
index 09fd489a901..8670524578b 100644
--- a/contrib/postgres_fdw/deparse.c
+++ b/contrib/postgres_fdw/deparse.c
@@ -180,11 +180,15 @@ static void appendConditions(List *exprs, deparse_expr_cxt *context);
 static void deparseFromExprForRel(StringInfo buf, PlannerInfo *root,
   RelOptInfo *foreignrel, bool use_alias,
   Index ignore_rel, List **ignore_conds,
+  List **additional_conds,
   List **params_list);
+static void appendWhereClause(List *exprs, List *additional_conds,
+			  deparse_expr_cxt *context);
 static void deparseFromExpr(List *quals, deparse_expr_cxt *context);
 static void deparseRangeTblRef(StringInfo buf, PlannerInfo *root,
 			   RelOptInfo *foreignrel, bool make_subquery,
-			   Index ignore_rel, List **ignore_conds, List **params_list);
+			   Index ignore_rel, List **ignore_conds,
+			   List **additional_conds, List **params_list);
 static void deparseAggref(Aggref *node, deparse_expr_cxt *context);
 static void appendGroupByClause(List *tlist, deparse_expr_cxt *context);
 static void appendOrderBySuffix(Oid sortop, Oid sortcoltype, bool nulls_first,
@@ -1370,6 +1374,7 @@ deparseFromExpr(List *quals, deparse_expr_cxt *context)
 {
 	StringInfo	buf = context->buf;
 	RelOptInfo *scanrel = context->scanrel;
+	List	   *additional_conds = NIL;
 
 	/* For upper relations, scanrel must be either a joinrel or a baserel */
 	Assert(!IS_UPPER_REL(context->foreignrel) ||
@@ -1379,14 +1384,11 @@ deparseFromExpr(List *quals, deparse_expr_cxt *context)
 	appendStringInfoString(buf, " FROM ");
 	deparseFromExprForRel(buf, context->root, scanrel,
 		  (bms_membership(scanrel->relids) == BMS_MULTIPLE),
-		  (Index) 0, NULL, context->params_list);
-
-	/* Construct WHERE clause */
-	if (quals != NIL)
-	{
-		appendStringInfoString(buf, " WHERE ");
-		appendConditions(quals, context);
-	}
+		  (Index) 0, NULL, _conds,
+		  context->params_list);
+	appendWhereClause(quals, additional_conds, context);
+	if (additional_conds != NIL)
+		list_free_deep(additional_conds);
 }
 
 /*
@@ -1598,6 +1600,42 @@ appendConditions(List *exprs, deparse_expr_cxt *context)
 	reset_transmission_modes(nestlevel);
 }
 
+/*
+ * Append WHERE clause, containing conditions
+ * from exprs and additional_conds, to context->buf.
+ */
+static void
+appendWhereClause(List *exprs, List *additional_conds, deparse_expr_cxt *context)
+{
+	StringInfo	buf = context->buf;
+	bool		need_and = false;
+	ListCell   *lc;
+
+	if (exprs != NIL || additional_conds != NIL)
+		appendStringInfoString(buf, " WHERE ");
+
+	/*
+	 * If there are some filters, append them.
+	 */
+	if (exprs != NIL)
+	{
+		appendConditions(exprs, context);
+		need_and = true;
+	}
+
+	/*
+	 * If there are some EXISTS conditions, coming from SEMI-JOINS, append
+	 * them.
+	 */
+	foreach(lc, additional_conds)
+	{
+		if (need_and)
+			appendStringInfoString(buf, " AND ");
+		appendStringInfoString(buf, (char *) lfirst(lc));
+		need_and = true;
+	}
+}
+
 /* Output join name for given join type */
 const char *
 get_jointype_name(JoinType jointype)
@@ -1616,6 +1654,9 @@ get_jointype_name(JoinType jointype)
 		case JOIN_FULL:
 			return "FULL";
 
+		case JOIN_SEMI:
+			return "SEMI";
+
 		default:
 			/* Shouldn't come here, but protect from buggy code. */
 			elog(ERROR, "unsupported join type %d", jointype);
@@ -1712,11 +1753,14 @@ 

Re: Add semi-join pushdown to postgres_fdw

2023-11-26 Thread Alexander Korotkov
Hi, Alexander!

On Tue, Oct 31, 2023 at 1:07 PM Alexander Pyhalov
 wrote:
> There are several cases when we can't push down semi-join in current
> patch.
>
> 1) When target list has attributes from inner relation, which are
> equivalent to some attributes of outer
> relation, we fail to notice this.
>
> 2) When we examine A join B and decide that we can't push it down, this
> decision is final - we state it in fdw_private of joinrel,
> and so if we consider joining these relations in another order, we don't
> reconsider.
> This means that if later examine B join A, we don't try to push it down.
> As semi-join can be executed as JOIN_UNIQUE_INNER or JOIN_UNIQUE_OUTER,
> this can be a problem - we look at some of these paths and remember that
> we can't push down such join.

Thank you for the revision.

I've revised the patch myself.  I've replaced StringInfo with
additional conds into a list of strings as I proposed before.  I think
the code became much clearer.  Also, it gets rid of some unnecessary
allocations.

I think the code itself is not in bad shape.  But patch lacks some
high-level description of semi-joins processing as well as comments on
each manipulation with additional conds.  Could you please add this?

--
Regards,
Alexander Korotkov


0001-postgres_fdw-add-support-for-deparsing-semi-joins-v6.patch
Description: Binary data


Re: Add semi-join pushdown to postgres_fdw

2023-10-31 Thread Alexander Pyhalov

Alexander Korotkov писал 2023-10-30 19:05:

Hi, Alexander!

Thank you for working on this.  I believe this is a very interesting
patch, which significantly improves our FDW-based distributed
facilities.  This is why I decided to review this.



Hi. Thanks for reviewing.


+   /*
+* We can't push down join if its reltarget is not safe
+*/
+   if (!joinrel_target_ok(root, joinrel, jointype, outerrel,
innerrel))
return false;

As I get joinrel_target_ok() function do meaningful checks only for
semi join and always return false for all other kinds of joins.  I
think we should call this only for semi join and name the function
accordingly.


Done.



+   fpinfo->unknown_subquery_rels =
bms_union(fpinfo_o->unknown_subquery_rels,
+
fpinfo_i->unknown_subquery_rels);

Should the comment before this code block be revised?


Updated comment.



+   case JOIN_SEMI:
+   fpinfo->joinclauses = list_concat(fpinfo->joinclauses,
+ fpinfo_i->remote_conds);
+   fpinfo->joinclauses = list_concat(fpinfo->joinclauses,
+  fpinfo->remote_conds);
+   fpinfo->remote_conds = list_copy(fpinfo_o->remote_conds);
+   fpinfo->unknown_subquery_rels =
bms_union(fpinfo->unknown_subquery_rels,
+   innerrel->relids);
+   break;

I think that comment before switch() should be definitely revised.

+ Relids hidden_subquery_rels; /* relids, which can't be referred to
+ * from upper relations */

Could this definition contain the positive part?  Can't be referred to
from upper relations, but used internally for semi joins (or something
like that)?


Made comment a bit more verbose.



Also, I think the machinery around the append_conds could be somewhat
simpler if we turn them into a list (list of strings).  I think that
should make code clearer and also save us some memory allocations.



I've tried to rewrite it as managing lists.. to find out that these are 
not lists.
I mean, in deparseFromExprForRel() we replace lists from both side with 
one condition.
This allows us to preserve conditions hierarchy. We should merge these 
conditions
in the end of IS_JOIN_REL(foreignrel) branch, or we'll push them too 
high. And if we
deparse them in this place as StringInfo, I see no benefit to convert 
them to lists.




In [1] you've referenced the cases, when your patch can't push down
semi-joins.  It doesn't seem impossible to handle these cases, but
that would make the patch much more complicated.  I'm OK to continue
with a simpler patch to handle the majority of cases.  Could you
please add the cases, which can't be pushed down with the current
patch, to the test suite?



There are several cases when we can't push down semi-join in current 
patch.


1) When target list has attributes from inner relation, which are 
equivalent to some attributes of outer

relation, we fail to notice this.

2) When we examine A join B and decide that we can't push it down, this 
decision is final - we state it in fdw_private of joinrel,
and so if we consider joining these relations in another order, we don't 
reconsider.
This means that if later examine B join A, we don't try to push it down. 
As semi-join can be executed as JOIN_UNIQUE_INNER or JOIN_UNIQUE_OUTER,
this can be a problem - we look at some of these paths and remember that 
we can't push down such join.






--
Best regards,
Alexander Pyhalov,
Postgres ProfessionalFrom 91ae85ac735c9f109cd3ab3603693177011e5b94 Mon Sep 17 00:00:00 2001
From: Alexander Pyhalov 
Date: Mon, 7 Nov 2022 10:23:32 +0300
Subject: [PATCH] postgres_fdw: add support for deparsing semi joins

We deparse semi-joins as EXISTS subqueries. So, deparsing
semi-join leads to generating addl_conds condition,
which is then added to the uppermost JOIN's WHERE clause.
---
 contrib/postgres_fdw/deparse.c| 206 ---
 .../postgres_fdw/expected/postgres_fdw.out| 320 --
 contrib/postgres_fdw/postgres_fdw.c   |  94 -
 contrib/postgres_fdw/postgres_fdw.h   |   3 +
 contrib/postgres_fdw/sql/postgres_fdw.sql | 126 ++-
 5 files changed, 668 insertions(+), 81 deletions(-)

diff --git a/contrib/postgres_fdw/deparse.c b/contrib/postgres_fdw/deparse.c
index 09fd489a901..cb0e373055d 100644
--- a/contrib/postgres_fdw/deparse.c
+++ b/contrib/postgres_fdw/deparse.c
@@ -180,11 +180,14 @@ static void appendConditions(List *exprs, deparse_expr_cxt *context);
 static void deparseFromExprForRel(StringInfo buf, PlannerInfo *root,
   RelOptInfo *foreignrel, bool use_alias,
   Index ignore_rel, List **ignore_conds,
+  StringInfo additional_conds,
   List **params_list);
+static void appendWhereClause(List *exprs, StringInfo additional_conds, deparse_expr_cxt *context);
 static void deparseFromExpr(List *quals, deparse_expr_cxt *context);
 static void deparseRangeTblRef(StringInfo buf, 

Re: Add semi-join pushdown to postgres_fdw

2023-10-30 Thread Alexander Korotkov
Hi, Alexander!

Thank you for working on this.  I believe this is a very interesting patch,
which significantly improves our FDW-based distributed facilities.  This is
why I decided to review this.

On Fri, Jan 20, 2023 at 11:00 AM Alexander Pyhalov 
wrote:
> Tomas Vondra писал 2023-01-19 20:49:
> > I took a quick look at the patch. It needs a rebase, although it
> > applies
> > fine using patch.
> >
> > A couple minor comments:
> >
> > 1) addl_conds seems a bit hard to understand, I'd use either the full
> > wording (additional_conds) or maybe extra_conds
>
> Renamed to additional_conds.
>
> >
> > 2) some of the lines got quite long, and need a wrap
> Splitted some of them. Not sure if it's enough.
>
> >
> > 3) unknown_subquery_rels name is a bit misleading - AFAIK it's the rels
> > that can't be referenced from upper rels (per what the .h says). So
> > they
> > are known, but hidden. Is there a better name?
>
> Renamed to hidden_subquery_rels. These are rels, which can't be referred
> to from upper join levels.
>
> >
> > 4) joinrel_target_ok() needs a better comment, explaining *when* the
> > reltarget is safe for pushdown. The conditions are on the same row, but
> > the project style is to break after '&&'.
>
> Added comment. It seems to be a rephrasing of lower comment in
> joinrel_target_ok().

+   /*
+* We can't push down join if its reltarget is not safe
+*/
+   if (!joinrel_target_ok(root, joinrel, jointype, outerrel, innerrel))
return false;

As I get joinrel_target_ok() function do meaningful checks only for semi
join and always return false for all other kinds of joins.  I think we
should call this only for semi join and name the function accordingly.

+   fpinfo->unknown_subquery_rels =
bms_union(fpinfo_o->unknown_subquery_rels,
+
fpinfo_i->unknown_subquery_rels);

Should the comment before this code block be revised?

+   case JOIN_SEMI:
+   fpinfo->joinclauses = list_concat(fpinfo->joinclauses,
+ fpinfo_i->remote_conds);
+   fpinfo->joinclauses = list_concat(fpinfo->joinclauses,
+  fpinfo->remote_conds);
+   fpinfo->remote_conds = list_copy(fpinfo_o->remote_conds);
+   fpinfo->unknown_subquery_rels =
bms_union(fpinfo->unknown_subquery_rels,
+   innerrel->relids);
+   break;

I think that comment before switch() should be definitely revised.

+ Relids hidden_subquery_rels; /* relids, which can't be referred to
+ * from upper relations */

Could this definition contain the positive part?  Can't be referred to from
upper relations, but used internally for semi joins (or something like
that)?

Also, I think the machinery around the append_conds could be somewhat
simpler if we turn them into a list (list of strings).  I think that should
make code clearer and also save us some memory allocations.

In [1] you've referenced the cases, when your patch can't push down
semi-joins.  It doesn't seem impossible to handle these cases, but that
would make the patch much more complicated.  I'm OK to continue with a
simpler patch to handle the majority of cases.  Could you please add the
cases, which can't be pushed down with the current patch, to the test suite?

Links
1.
https://www.postgresql.org/message-id/816fa8b1bc2da09a87484d1ef239a332%40postgrespro.ru

--
Regards,
Alexander Korotkov


Re: Add semi-join pushdown to postgres_fdw

2023-01-20 Thread Alexander Pyhalov

Hi.

Tomas Vondra писал 2023-01-19 20:49:
I took a quick look at the patch. It needs a rebase, although it 
applies

fine using patch.

A couple minor comments:

1) addl_conds seems a bit hard to understand, I'd use either the full
wording (additional_conds) or maybe extra_conds


Renamed to additional_conds.



2) some of the lines got quite long, and need a wrap

Splitted some of them. Not sure if it's enough.



3) unknown_subquery_rels name is a bit misleading - AFAIK it's the rels
that can't be referenced from upper rels (per what the .h says). So 
they

are known, but hidden. Is there a better name?


Renamed to hidden_subquery_rels. These are rels, which can't be referred 
to from upper join levels.




4) joinrel_target_ok() needs a better comment, explaining *when* the
reltarget is safe for pushdown. The conditions are on the same row, but
the project style is to break after '&&'.


Added comment. It seems to be a rephrasing of lower comment in 
joinrel_target_ok().


--
Best regards,
Alexander Pyhalov,
Postgres ProfessionalFrom f37d26d9b622767f94e89034fa8e4fccc69e358d Mon Sep 17 00:00:00 2001
From: Alexander Pyhalov 
Date: Mon, 7 Nov 2022 10:23:32 +0300
Subject: [PATCH v4] postgres_fdw: add support for deparsing semi joins

We deparse semi-joins as EXISTS subqueries. So, deparsing
semi-join leads to generating addl_conds condition,
which is then added to the uppermost JOIN's WHERE clause.
---
 contrib/postgres_fdw/deparse.c| 206 +---
 .../postgres_fdw/expected/postgres_fdw.out| 297 --
 contrib/postgres_fdw/postgres_fdw.c   |  89 +-
 contrib/postgres_fdw/postgres_fdw.h   |   2 +
 contrib/postgres_fdw/sql/postgres_fdw.sql | 119 ++-
 5 files changed, 632 insertions(+), 81 deletions(-)

diff --git a/contrib/postgres_fdw/deparse.c b/contrib/postgres_fdw/deparse.c
index 473fa45bd43..1217d47050b 100644
--- a/contrib/postgres_fdw/deparse.c
+++ b/contrib/postgres_fdw/deparse.c
@@ -180,11 +180,14 @@ static void appendConditions(List *exprs, deparse_expr_cxt *context);
 static void deparseFromExprForRel(StringInfo buf, PlannerInfo *root,
   RelOptInfo *foreignrel, bool use_alias,
   Index ignore_rel, List **ignore_conds,
+  StringInfo additional_conds,
   List **params_list);
+static void appendWhereClause(List *exprs, StringInfo additional_conds, deparse_expr_cxt *context);
 static void deparseFromExpr(List *quals, deparse_expr_cxt *context);
 static void deparseRangeTblRef(StringInfo buf, PlannerInfo *root,
 			   RelOptInfo *foreignrel, bool make_subquery,
-			   Index ignore_rel, List **ignore_conds, List **params_list);
+			   Index ignore_rel, List **ignore_conds,
+			   StringInfo additional_conds, List **params_list);
 static void deparseAggref(Aggref *node, deparse_expr_cxt *context);
 static void appendGroupByClause(List *tlist, deparse_expr_cxt *context);
 static void appendOrderBySuffix(Oid sortop, Oid sortcoltype, bool nulls_first,
@@ -1370,23 +1373,21 @@ deparseFromExpr(List *quals, deparse_expr_cxt *context)
 {
 	StringInfo	buf = context->buf;
 	RelOptInfo *scanrel = context->scanrel;
+	StringInfoData additional_conds;
 
 	/* For upper relations, scanrel must be either a joinrel or a baserel */
 	Assert(!IS_UPPER_REL(context->foreignrel) ||
 		   IS_JOIN_REL(scanrel) || IS_SIMPLE_REL(scanrel));
 
+	initStringInfo(_conds);
 	/* Construct FROM clause */
 	appendStringInfoString(buf, " FROM ");
 	deparseFromExprForRel(buf, context->root, scanrel,
 		  (bms_membership(scanrel->relids) == BMS_MULTIPLE),
-		  (Index) 0, NULL, context->params_list);
-
-	/* Construct WHERE clause */
-	if (quals != NIL)
-	{
-		appendStringInfoString(buf, " WHERE ");
-		appendConditions(quals, context);
-	}
+		  (Index) 0, NULL, _conds,
+		  context->params_list);
+	appendWhereClause(quals, _conds, context);
+	pfree(additional_conds.data);
 }
 
 /*
@@ -1598,6 +1599,33 @@ appendConditions(List *exprs, deparse_expr_cxt *context)
 	reset_transmission_modes(nestlevel);
 }
 
+/*
+ * Append WHERE clause, containing conditions
+ * from exprs and additional_conds, to context->buf.
+ */
+static void
+appendWhereClause(List *exprs, StringInfo additional_conds, deparse_expr_cxt *context)
+{
+	StringInfo	buf = context->buf;
+	bool		need_and = false;
+
+	if (exprs != NIL || additional_conds->len > 0)
+		appendStringInfoString(buf, " WHERE ");
+
+	if (exprs != NIL)
+	{
+		appendConditions(exprs, context);
+		need_and = true;
+	}
+
+	if (additional_conds->len > 0)
+	{
+		if (need_and)
+			appendStringInfoString(buf, " AND ");
+		appendStringInfo(buf, "(%s)", additional_conds->data);
+	}
+}
+
 /* Output join name for given join type */
 const char *
 get_jointype_name(JoinType jointype)
@@ -1616,6 +1644,9 @@ get_jointype_name(JoinType jointype)
 		case JOIN_FULL:
 			return "FULL";
 
+		case JOIN_SEMI:
+			return "SEMI";
+
 		default:
 			/* Shouldn't come here, but protect from buggy code. */
 			

Re: Add semi-join pushdown to postgres_fdw

2023-01-19 Thread Tomas Vondra
Hi.

I took a quick look at the patch. It needs a rebase, although it applies
fine using patch.

A couple minor comments:

1) addl_conds seems a bit hard to understand, I'd use either the full
wording (additional_conds) or maybe extra_conds

2) some of the lines got quite long, and need a wrap

3) unknown_subquery_rels name is a bit misleading - AFAIK it's the rels
that can't be referenced from upper rels (per what the .h says). So they
are known, but hidden. Is there a better name?

4) joinrel_target_ok() needs a better comment, explaining *when* the
reltarget is safe for pushdown. The conditions are on the same row, but
the project style is to break after '&&'.

Also, I'd write

if (!IsA(var, Var))
continue;

which saves one level of nesting. IMHO that makes it more readable.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




RE: Add semi-join pushdown to postgres_fdw

2022-12-06 Thread fujii.y...@df.mitsubishielectric.co.jp
Hi Mr.Pyhalov.

Thank you for fixing it and giving more explanation.

> IIRC, planner can create semi-join, which targetlist references Vars 
> from inner join relation. However, it's deparsed as exists and so we 
> can't reference it from SQL. So, there's this check - if Var is 
> referenced in semi-join target list, it can't be pushed down.
> You can see this if comment out this check.
> 
> EXPLAIN (verbose, costs off)
>  SELECT ft2.*, ft4.* FROM ft2 INNER JOIN
>  (SELECT * FROM ft4 WHERE EXISTS (
>  SELECT 1 FROM ft2 WHERE ft2.c2=ft4.c2)) ft4
>  ON ft2.c2 = ft4.c1
>  INNER JOIN
>  (SELECT * FROM ft2 WHERE EXISTS (
>  SELECT 1 FROM ft4 WHERE ft2.c2=ft4.c2)) ft21
>  ON ft2.c2 = ft21.c2
>  WHERE ft2.c1 > 900
>  ORDER BY ft2.c1 LIMIT 10;
> 
> will fail with
> EXPLAIN SELECT r8.c2, r9.c2 FROM "S 1"."T 1" r8 WHERE (EXISTS (SELECT 
> NULL FROM "S 1"."T 3" r9 WHERE ((r8.c2 = r9.c2
> 
> Here you can see that
> SELECT * FROM ft2 WHERE EXISTS (
>  SELECT 1 FROM ft4 WHERE ft2.c2=ft4.c2)
> 
> was transformed to
> SELECT r8.c2, r9.c2 FROM "S 1"."T 1" r8 WHERE (EXISTS (SELECT NULL 
> FROM "S 1"."T 3" r9 WHERE ((r8.c2 = r9.c2
> 
> where our exists subquery is referenced from tlist. It's fine for plan 
> (relations, participating in semi-join, can be referenced in tlist), 
> but is not going to work with EXISTS subquery.
> BTW, there's a comment in joinrel_target_ok(). It tells exactly that -
> 
> 5535 if (jointype == JOIN_SEMI &&
> bms_is_member(var->varno,
> innerrel->relids) && !bms_is_member(var->varno, outerrel->relids))
> 5536 {
> 5537 /* We deparse semi-join as exists() subquery, and
> so can't deparse references to inner rel in join target list. */
> 5538 ok = false;
> 5539 break;
> 5540 }
> 
> Expanded comment.
Thank you for expanding your comment and giving examples. 
Thanks to the above examples, I understood in what case planner wolud create 
semi-join, 
which targetlist references Vars from inner join relation.

> > question2) In foreign_join_ok
> >   > * Constructing queries representing ANTI joins is hard, hence
> >   Is this true? Is it hard to expand your approach to ANTI join 
> > pushdown?
> 
> I haven't tried, so don't know.
I understand the situation.

> The naming means additional conditions (for WHERE clause, by analogy 
> with ignore_conds and remote_conds). Not sure if subquery_expr sounds 
> better, but if you come with better idea, I'm fine with renaming them.
Sure.

> > question4) Although really detail, there is expression making space 
> > such as
> >   "ft4.c2 = ft2.c2" and one making no space such as "c1=ftupper.c1".
> >   Is there reason for this difference? If not, need we use same 
> > policy for making space?
Thank you.

Later, I'm going to look at other part of your patch.

Sincerely yours,
Yuuki Fujii

--
Yuuki Fujii
Information Technology R Center Mitsubishi Electric Corporation


Re: Add semi-join pushdown to postgres_fdw

2022-12-06 Thread Alexander Pyhalov

Hi, Yuki.

Thanks for looking at this patch.

fujii.y...@df.mitsubishielectric.co.jp писал 2022-12-03 06:02:


question1)
  > + if (jointype == JOIN_SEMI && bms_is_member(var->varno,
innerrel->relids) && !bms_is_member(var->varno, outerrel->relids))
  It takes time for me to find in what case this condition is true.
  There is cases in which this condition is true for semi-join of two 
baserels

  when running query which joins more than two relations such as
query2 and query3.
  Running queries such as query2, you maybe want to pushdown of only
semi-join path of
  joinrel(outerrel) defined by (f_t1 a1 join f_t3 a2 on a1.c1 = a2.c1)
and baserel(innerrel) f_t3
  because of safety deparse. So you add this condition.
  Becouase of this limitation, your patch can't push down subquery 
expression

  "exists (select null from f_t2 where c1 = a1.c1)" in query3.
  I think, it is one of difficulty points for semi-join pushdown.
  This is my understanding of the intent of this condition and the
restrictions imposed by this condition.
  Is my understanding right?


IIRC, planner can create semi-join, which targetlist references Vars 
from inner join relation. However, it's deparsed as exists and so we 
can't reference it from SQL. So, there's this check - if Var is 
referenced in semi-join target list, it can't be pushed down.

You can see this if comment out this check.

EXPLAIN (verbose, costs off)
SELECT ft2.*, ft4.* FROM ft2 INNER JOIN
(SELECT * FROM ft4 WHERE EXISTS (
SELECT 1 FROM ft2 WHERE ft2.c2=ft4.c2)) ft4
ON ft2.c2 = ft4.c1
INNER JOIN
(SELECT * FROM ft2 WHERE EXISTS (
SELECT 1 FROM ft4 WHERE ft2.c2=ft4.c2)) ft21
ON ft2.c2 = ft21.c2
WHERE ft2.c1 > 900
ORDER BY ft2.c1 LIMIT 10;

will fail with
EXPLAIN SELECT r8.c2, r9.c2 FROM "S 1"."T 1" r8 WHERE (EXISTS (SELECT 
NULL FROM "S 1"."T 3" r9 WHERE ((r8.c2 = r9.c2


Here you can see that
SELECT * FROM ft2 WHERE EXISTS (
SELECT 1 FROM ft4 WHERE ft2.c2=ft4.c2)

was transformed to
SELECT r8.c2, r9.c2 FROM "S 1"."T 1" r8 WHERE (EXISTS (SELECT NULL FROM 
"S 1"."T 3" r9 WHERE ((r8.c2 = r9.c2


where our exists subquery is referenced from tlist. It's fine for plan 
(relations, participating in semi-join, can be referenced in tlist),

but is not going to work with EXISTS subquery.
BTW, there's a comment in joinrel_target_ok(). It tells exactly that -

5535 if (jointype == JOIN_SEMI && bms_is_member(var->varno, 
innerrel->relids) && !bms_is_member(var->varno, outerrel->relids))

5536 {
5537 /* We deparse semi-join as exists() subquery, and 
so can't deparse references to inner rel in join target list. */

5538 ok = false;
5539 break;
5540 }

Expanded comment.


question2) In foreign_join_ok
  > * Constructing queries representing ANTI joins is hard, hence
  Is this true? Is it hard to expand your approach to ANTI join 
pushdown?


I haven't tried, so don't know.

question3) You use variables whose name is "addl_condXXX" in the 
following code.

  > appendStringInfo(addl_conds, "EXISTS (SELECT NULL FROM %s",
join_sql_i.data);
  Does this naming mean additional literal?
  Is there more complehensive naming, such as "subquery_exprXXX"?


The naming means additional conditions (for WHERE clause, by analogy 
with ignore_conds and remote_conds). Not sure if subquery_expr sounds 
better, but if you come with better idea, I'm fine with renaming them.


question4) Although really detail, there is expression making space 
such as

  "ft4.c2 = ft2.c2" and one making no space such as "c1=ftupper.c1".
  Is there reason for this difference? If not, need we use same policy
for making space?



Fixed.

--
Best regards,
Alexander Pyhalov,
Postgres ProfessionalFrom 0f26b0841cb095b4e114984deac2b1b001368c15 Mon Sep 17 00:00:00 2001
From: Alexander Pyhalov 
Date: Mon, 7 Nov 2022 10:23:32 +0300
Subject: [PATCH v3] postgres_fdw: add support for deparsing semi joins

We deparse semi-joins as EXISTS subqueries. So, deparsing
semi-join leads to generating addl_conds condition,
which is then added to the uppermost JOIN's WHERE clause.
---
 contrib/postgres_fdw/deparse.c| 201 +---
 .../postgres_fdw/expected/postgres_fdw.out| 297 --
 contrib/postgres_fdw/postgres_fdw.c   |  82 -
 contrib/postgres_fdw/postgres_fdw.h   |   3 +-
 contrib/postgres_fdw/sql/postgres_fdw.sql | 119 ++-
 5 files changed, 620 insertions(+), 82 deletions(-)

diff --git a/contrib/postgres_fdw/deparse.c b/contrib/postgres_fdw/deparse.c
index 95247656504..10d82d9f2ab 100644
--- a/contrib/postgres_fdw/deparse.c
+++ b/contrib/postgres_fdw/deparse.c
@@ -179,12 +179,13 @@ static void appendLimitClause(deparse_expr_cxt *context);
 static void appendConditions(List *exprs, deparse_expr_cxt *context);
 static void deparseFromExprForRel(StringInfo buf, PlannerInfo *root,
   RelOptInfo *foreignrel, bool use_alias,

RE: Add semi-join pushdown to postgres_fdw

2022-12-02 Thread fujii.y...@df.mitsubishielectric.co.jp
Hi Mr.Pyhalov.

Thank you for work on this useful patch.
I'm starting to review v2 patch.
I have cheked we can apply v2 patch to commit 
ec386948948c1708c0c28c48ef08b9c4dd9d47cc
(Date:Thu Dec 1 12:56:21 2022 +0100).
I briefly looked at this whole thing and did step execute this
by running simple queries such as the followings.

query1) select * from f_t1 a1 where a1.c1 in (select c1 from f_t2);
query2) select * from f_t1 a1 join f_t3 a2 on a1.c1 = a2.c1 where a1.c1 in 
(select c1 from f_t3) ;
query3) update f_t2 set c1 = 1 from f_t1 a1 where a1.c2 = f_t2.c2 and exists 
(select null from f_t2 where c1 = a1.c1);

Although I haven't seen all of v2 patch, for now I have the following questions.

question1) 
  > + if (jointype == JOIN_SEMI && bms_is_member(var->varno, innerrel->relids) 
&& !bms_is_member(var->varno, outerrel->relids))
  It takes time for me to find in what case this condition is true.
  There is cases in which this condition is true for semi-join of two baserels 
  when running query which joins more than two relations such as query2 and 
query3.
  Running queries such as query2, you maybe want to pushdown of only semi-join 
path of 
  joinrel(outerrel) defined by (f_t1 a1 join f_t3 a2 on a1.c1 = a2.c1) and 
baserel(innerrel) f_t3 
  because of safety deparse. So you add this condition.
  Becouase of this limitation, your patch can't push down subquery expression 
  "exists (select null from f_t2 where c1 = a1.c1)" in query3.
  I think, it is one of difficulty points for semi-join pushdown.
  This is my understanding of the intent of this condition and the restrictions 
imposed by this condition.
  Is my understanding right?
  I think if there are comments for the intent of this condition and the 
restrictions imposed by this condition  
  then they help PostgreSQL developper. What do you think?

question2) In foreign_join_ok
  > * Constructing queries representing ANTI joins is hard, hence
  Is this true? Is it hard to expand your approach to ANTI join pushdown?

question3) You use variables whose name is "addl_condXXX" in the following code.
  > appendStringInfo(addl_conds, "EXISTS (SELECT NULL FROM %s", 
join_sql_i.data);
  Does this naming mean additional literal?
  Is there more complehensive naming, such as "subquery_exprXXX"?

question4) Although really detail, there is expression making space such as
  "ft4.c2 = ft2.c2" and one making no space such as "c1=ftupper.c1".
  Is there reason for this difference? If not, need we use same policy for 
making space?

Later, I'm going to look at part of your patch which is used when running more 
complex query.

Sincerely yours,
Yuuki Fujii

--
Yuuki Fujii
Information Technology R Center Mitsubishi Electric Corporation


Re: Add semi-join pushdown to postgres_fdw

2022-11-06 Thread Alexander Pyhalov

Ian Lawrence Barwick писал 2022-11-04 02:21:


This entry was marked as "Needs review" in the CommitFest app but cfbot
reports the patch no longer applies.

We've marked it as "Waiting on Author". As CommitFest 2022-11 is
currently underway, this would be an excellent time update the patch.

Once you think the patchset is ready for review again, you (or any
interested party) can  move the patch entry forward by visiting

https://commitfest.postgresql.org/40/3838/

and changing the status to "Needs review".



Hi. I've rebased the patch.
--
Best regards,
Alexander Pyhalov,
Postgres ProfessionalFrom 1ac2a9e3611f716da688c04a4ec36888f62078ce Mon Sep 17 00:00:00 2001
From: Alexander Pyhalov 
Date: Mon, 7 Nov 2022 10:23:32 +0300
Subject: [PATCH] postgres_fdw: add support for deparsing semi joins

We deparse semi-joins as EXISTS subqueries. So, deparsing
semi-join leads to generating addl_conds condition,
which is then added to the uppermost JOIN's WHERE clause.
---
 contrib/postgres_fdw/deparse.c| 198 +---
 .../postgres_fdw/expected/postgres_fdw.out| 297 --
 contrib/postgres_fdw/postgres_fdw.c   |  78 -
 contrib/postgres_fdw/postgres_fdw.h   |   3 +-
 contrib/postgres_fdw/sql/postgres_fdw.sql | 119 ++-
 5 files changed, 613 insertions(+), 82 deletions(-)

diff --git a/contrib/postgres_fdw/deparse.c b/contrib/postgres_fdw/deparse.c
index 95247656504..45885442418 100644
--- a/contrib/postgres_fdw/deparse.c
+++ b/contrib/postgres_fdw/deparse.c
@@ -179,12 +179,13 @@ static void appendLimitClause(deparse_expr_cxt *context);
 static void appendConditions(List *exprs, deparse_expr_cxt *context);
 static void deparseFromExprForRel(StringInfo buf, PlannerInfo *root,
   RelOptInfo *foreignrel, bool use_alias,
-  Index ignore_rel, List **ignore_conds,
+  Index ignore_rel, List **ignore_conds, StringInfo addl_conds,
   List **params_list);
+static void appendWhereClause(List *exprs, StringInfo addl_conds, deparse_expr_cxt *context);
 static void deparseFromExpr(List *quals, deparse_expr_cxt *context);
 static void deparseRangeTblRef(StringInfo buf, PlannerInfo *root,
 			   RelOptInfo *foreignrel, bool make_subquery,
-			   Index ignore_rel, List **ignore_conds, List **params_list);
+			   Index ignore_rel, List **ignore_conds, StringInfo addl_conds, List **params_list);
 static void deparseAggref(Aggref *node, deparse_expr_cxt *context);
 static void appendGroupByClause(List *tlist, deparse_expr_cxt *context);
 static void appendOrderBySuffix(Oid sortop, Oid sortcoltype, bool nulls_first,
@@ -1370,23 +1371,20 @@ deparseFromExpr(List *quals, deparse_expr_cxt *context)
 {
 	StringInfo	buf = context->buf;
 	RelOptInfo *scanrel = context->scanrel;
+	StringInfoData addl_conds;
 
 	/* For upper relations, scanrel must be either a joinrel or a baserel */
 	Assert(!IS_UPPER_REL(context->foreignrel) ||
 		   IS_JOIN_REL(scanrel) || IS_SIMPLE_REL(scanrel));
 
+	initStringInfo(_conds);
 	/* Construct FROM clause */
 	appendStringInfoString(buf, " FROM ");
 	deparseFromExprForRel(buf, context->root, scanrel,
 		  (bms_membership(scanrel->relids) == BMS_MULTIPLE),
-		  (Index) 0, NULL, context->params_list);
-
-	/* Construct WHERE clause */
-	if (quals != NIL)
-	{
-		appendStringInfoString(buf, " WHERE ");
-		appendConditions(quals, context);
-	}
+		  (Index) 0, NULL, _conds, context->params_list);
+	appendWhereClause(quals, _conds, context);
+	pfree(addl_conds.data);
 }
 
 /*
@@ -1598,6 +1596,33 @@ appendConditions(List *exprs, deparse_expr_cxt *context)
 	reset_transmission_modes(nestlevel);
 }
 
+/*
+ * Append WHERE clause, containing conditions
+ * from exprs and addl_conds, to context->buf.
+ */
+static void
+appendWhereClause(List *exprs, StringInfo addl_conds, deparse_expr_cxt *context)
+{
+	StringInfo	buf = context->buf;
+	bool		need_and = false;
+
+	if (exprs != NIL || addl_conds->len > 0)
+		appendStringInfoString(buf, " WHERE ");
+
+	if (exprs != NIL)
+	{
+		appendConditions(exprs, context);
+		need_and = true;
+	}
+
+	if (addl_conds->len > 0)
+	{
+		if (need_and)
+			appendStringInfoString(buf, " AND ");
+		appendStringInfo(buf, "(%s)", addl_conds->data);
+	}
+}
+
 /* Output join name for given join type */
 const char *
 get_jointype_name(JoinType jointype)
@@ -1616,6 +1641,9 @@ get_jointype_name(JoinType jointype)
 		case JOIN_FULL:
 			return "FULL";
 
+		case JOIN_SEMI:
+			return "SEMI";
+
 		default:
 			/* Shouldn't come here, but protect from buggy code. */
 			elog(ERROR, "unsupported join type %d", jointype);
@@ -1715,7 +1743,7 @@ deparseSubqueryTargetList(deparse_expr_cxt *context)
  */
 static void
 deparseFromExprForRel(StringInfo buf, PlannerInfo *root, RelOptInfo *foreignrel,
-	  bool use_alias, Index ignore_rel, List **ignore_conds,
+	  bool use_alias, Index ignore_rel, List **ignore_conds, StringInfo addl_conds,
 	  List **params_list)
 {
 	PgFdwRelationInfo 

Re: Add semi-join pushdown to postgres_fdw

2022-11-03 Thread Ian Lawrence Barwick
2022年8月30日(火) 15:58 Alexander Pyhalov :
>
> Ashutosh Bapat писал 2022-08-29 17:12:
> > Hi Alexander,
> > Thanks for working on this. It's great to see FDW join pushdown scope
> > being expanded to more complex cases.
> >
> > I am still figuring out the implementation. It's been a while I have
> > looked at join push down code.
> >
> > But following change strikes me odd
> >  -- subquery using immutable function (can be sent to remote)
> >  PREPARE st3(int) AS SELECT * FROM ft1 t1 WHERE t1.c1 < $2 AND t1.c3
> > IN (SELECT c3 FROM ft2 t2 WHERE c1 > $1 AND date(c5) =
> > '1970-01-17'::date) ORDER BY c1;
> >  EXPLAIN (VERBOSE, COSTS OFF) EXECUTE st3(10, 20);
> > -  QUERY PLAN
> > 
> > - Sort
> > +
> >
> > QUERY PLAN
> > +---
> > + Foreign Scan
> > Output: t1.c1, t1.c2, t1.c3, t1.c4, t1.c5, t1.c6, t1.c7, t1.c8
> > -   Sort Key: t1.c1
> > -   ->  Nested Loop Semi Join
> > - Output: t1.c1, t1.c2, t1.c3, t1.c4, t1.c5, t1.c6, t1.c7,
> > t1.c8
> > - Join Filter: (t1.c3 = t2.c3)
> > - ->  Foreign Scan on public.ft1 t1
> > -   Output: t1.c1, t1.c2, t1.c3, t1.c4, t1.c5, t1.c6,
> > t1.c7, t1.c8
> > -   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8
> > FROM "S 1"."T 1" WHERE (("C 1" < 20))
> > - ->  Materialize
> > -   Output: t2.c3
> > -   ->  Foreign Scan on public.ft2 t2
> > - Output: t2.c3
> > - Remote SQL: SELECT c3 FROM "S 1"."T 1" WHERE
> > (("C 1" > 10)) AND ((date(c5) = '1970-01-17'::date))
> > -(14 rows)
> > +   Relations: (public.ft1 t1) SEMI JOIN (public.ft2 t2)
> > +   Remote SQL: SELECT r1."C 1", r1.c2, r1.c3, r1.c4, r1.c5, r1.c6,
> > r1.c7, r1.c8 FROM "S 1"."T 1" r1 WHERE ((r1."C 1" < 20)) AND (EXISTS
> > (SELECT NULL FROM "S 1"."T 1" r3 WHERE ((r3."C 1" > 10)) AND
> > ((date(r3.c5) = '1970-01-17'::date)) AND ((r1.c3 = r3.c3 ORDER BY
> > r1."C 1" ASC NULLS LAST
> > +(4 rows)
> >
> > date_in  | s   |1 | [0:0]={cstring}
> > date_in which will be used to cast a test to date is not immutable. So
> > the query should't be pushed down. May not be a problem with your
> > patch. Can you please check?
>
> Hi.
>
> It is not related to my change and works as expected. As I see, we have
> expression FuncExprdate(oid = 2029, args=Var ) = Const(type date)
> (date(r3.c5) = '1970-01-17'::date).
> Function is
>
> # select proname, provolatile from pg_proc where oid=2029;
>   proname | provolatile
> -+-
>   date| i
>
> So it's shippable.

This entry was marked as "Needs review" in the CommitFest app but cfbot
reports the patch no longer applies.

We've marked it as "Waiting on Author". As CommitFest 2022-11 is
currently underway, this would be an excellent time update the patch.

Once you think the patchset is ready for review again, you (or any
interested party) can  move the patch entry forward by visiting

https://commitfest.postgresql.org/40/3838/

and changing the status to "Needs review".


Thanks

Ian Barwick




Re: Add semi-join pushdown to postgres_fdw

2022-08-30 Thread Alexander Pyhalov

Ashutosh Bapat писал 2022-08-29 17:12:

Hi Alexander,
Thanks for working on this. It's great to see FDW join pushdown scope
being expanded to more complex cases.

I am still figuring out the implementation. It's been a while I have
looked at join push down code.

But following change strikes me odd
 -- subquery using immutable function (can be sent to remote)
 PREPARE st3(int) AS SELECT * FROM ft1 t1 WHERE t1.c1 < $2 AND t1.c3
IN (SELECT c3 FROM ft2 t2 WHERE c1 > $1 AND date(c5) =
'1970-01-17'::date) ORDER BY c1;
 EXPLAIN (VERBOSE, COSTS OFF) EXECUTE st3(10, 20);
-  QUERY PLAN

- Sort
+

QUERY PLAN
+---
+ Foreign Scan
Output: t1.c1, t1.c2, t1.c3, t1.c4, t1.c5, t1.c6, t1.c7, t1.c8
-   Sort Key: t1.c1
-   ->  Nested Loop Semi Join
- Output: t1.c1, t1.c2, t1.c3, t1.c4, t1.c5, t1.c6, t1.c7, 
t1.c8

- Join Filter: (t1.c3 = t2.c3)
- ->  Foreign Scan on public.ft1 t1
-   Output: t1.c1, t1.c2, t1.c3, t1.c4, t1.c5, t1.c6, 
t1.c7, t1.c8

-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8
FROM "S 1"."T 1" WHERE (("C 1" < 20))
- ->  Materialize
-   Output: t2.c3
-   ->  Foreign Scan on public.ft2 t2
- Output: t2.c3
- Remote SQL: SELECT c3 FROM "S 1"."T 1" WHERE
(("C 1" > 10)) AND ((date(c5) = '1970-01-17'::date))
-(14 rows)
+   Relations: (public.ft1 t1) SEMI JOIN (public.ft2 t2)
+   Remote SQL: SELECT r1."C 1", r1.c2, r1.c3, r1.c4, r1.c5, r1.c6,
r1.c7, r1.c8 FROM "S 1"."T 1" r1 WHERE ((r1."C 1" < 20)) AND (EXISTS
(SELECT NULL FROM "S 1"."T 1" r3 WHERE ((r3."C 1" > 10)) AND
((date(r3.c5) = '1970-01-17'::date)) AND ((r1.c3 = r3.c3 ORDER BY
r1."C 1" ASC NULLS LAST
+(4 rows)

date_in  | s   |1 | [0:0]={cstring}
date_in which will be used to cast a test to date is not immutable. So
the query should't be pushed down. May not be a problem with your
patch. Can you please check?


Hi.

It is not related to my change and works as expected. As I see, we have 
expression FuncExprdate(oid = 2029, args=Var ) = Const(type date)

(date(r3.c5) = '1970-01-17'::date).
Function is

# select proname, provolatile from pg_proc where oid=2029;
 proname | provolatile
-+-
 date| i

So it's shippable.
--
Best regards,
Alexander Pyhalov,
Postgres Professional




Re: Add semi-join pushdown to postgres_fdw

2022-08-29 Thread Ashutosh Bapat
Hi Alexander,
Thanks for working on this. It's great to see FDW join pushdown scope
being expanded to more complex cases.

I am still figuring out the implementation. It's been a while I have
looked at join push down code.

But following change strikes me odd
 -- subquery using immutable function (can be sent to remote)
 PREPARE st3(int) AS SELECT * FROM ft1 t1 WHERE t1.c1 < $2 AND t1.c3
IN (SELECT c3 FROM ft2 t2 WHERE c1 > $1 AND date(c5) =
'1970-01-17'::date) ORDER BY c1;
 EXPLAIN (VERBOSE, COSTS OFF) EXECUTE st3(10, 20);
-  QUERY PLAN

- Sort
+

QUERY PLAN
+---
+ Foreign Scan
Output: t1.c1, t1.c2, t1.c3, t1.c4, t1.c5, t1.c6, t1.c7, t1.c8
-   Sort Key: t1.c1
-   ->  Nested Loop Semi Join
- Output: t1.c1, t1.c2, t1.c3, t1.c4, t1.c5, t1.c6, t1.c7, t1.c8
- Join Filter: (t1.c3 = t2.c3)
- ->  Foreign Scan on public.ft1 t1
-   Output: t1.c1, t1.c2, t1.c3, t1.c4, t1.c5, t1.c6, t1.c7, t1.c8
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8
FROM "S 1"."T 1" WHERE (("C 1" < 20))
- ->  Materialize
-   Output: t2.c3
-   ->  Foreign Scan on public.ft2 t2
- Output: t2.c3
- Remote SQL: SELECT c3 FROM "S 1"."T 1" WHERE
(("C 1" > 10)) AND ((date(c5) = '1970-01-17'::date))
-(14 rows)
+   Relations: (public.ft1 t1) SEMI JOIN (public.ft2 t2)
+   Remote SQL: SELECT r1."C 1", r1.c2, r1.c3, r1.c4, r1.c5, r1.c6,
r1.c7, r1.c8 FROM "S 1"."T 1" r1 WHERE ((r1."C 1" < 20)) AND (EXISTS
(SELECT NULL FROM "S 1"."T 1" r3 WHERE ((r3."C 1" > 10)) AND
((date(r3.c5) = '1970-01-17'::date)) AND ((r1.c3 = r3.c3 ORDER BY
r1."C 1" ASC NULLS LAST
+(4 rows)

date_in  | s   |1 | [0:0]={cstring}
date_in which will be used to cast a test to date is not immutable. So
the query should't be pushed down. May not be a problem with your
patch. Can you please check?

On Wed, Aug 24, 2022 at 12:55 PM Alexander Pyhalov
 wrote:
>
> Hi.
>
> It's possible to extend deparsing in postgres_fdw, so that we can push
> down semi-joins, which doesn't refer to inner reltarget. This allows
> us to push down joins in queries like
>
> SELECT * FROM ft1 t1 WHERE t1.c1 < 10 AND t1.c3 IN (SELECT c3 FROM ft2
> t2 WHERE date(c5) = '1970-01-17'::date);
>
>
> EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE t1.c1 < 10 AND
> t1.c3 IN (SELECT c3 FROM ft2 t2 WHERE date(c5) = '1970-01-17'::date);
>
>  QUERY PLAN
> ---
> Foreign Scan
> Output: t1.c1, t1.c2, t1.c3, t1.c4, t1.c5, t1.c6, t1.c7, t1.c8
> Relations: (public.ft1 t1) SEMI JOIN (public.ft2 t2)
> Remote SQL: SELECT r1."C 1", r1.c2, r1.c3, r1.c4, r1.c5, r1.c6,
> r1.c7, r1.c8 FROM "S 1"."T 1" r1 WHERE ((r1."C 1" < 10)) AND (EXISTS
> (SELECT NULL FROM "S 1"."T 1" r3 WHERE ((date(r3.c5) =
> '1970-01-17'::date)) AND ((r1.c3 = r3.c3
>

Thanks for working on this. It's great to see FDW join pushdown scope
being expanded to more complex cases.

I am still figuring out the implementation. It's been a while I have
looked at join push down code.

But following change strikes me odd
 -- subquery using immutable function (can be sent to remote)
 PREPARE st3(int) AS SELECT * FROM ft1 t1 WHERE t1.c1 < $2 AND t1.c3
IN (SELECT c3 FROM ft2 t2 WHERE c1 > $1 AND date(c5) =
'1970-01-17'::date) ORDER BY c1;
 EXPLAIN (VERBOSE, COSTS OFF) EXECUTE st3(10, 20);
-  QUERY PLAN

- Sort
+

QUERY PLAN
+---
+ Foreign Scan
Output: t1.c1, t1.c2, t1.c3, t1.c4, t1.c5, t1.c6, t1.c7, t1.c8
-   Sort Key: t1.c1
-   ->  Nested Loop Semi Join
- Output: t1.c1, t1.c2, t1.c3, t1.c4, t1.c5, t1.c6, t1.c7, t1.c8
- Join Filter: (t1.c3 = t2.c3)
- ->  Foreign Scan on public.ft1 t1
-   Output: t1.c1, t1.c2, t1.c3, t1.c4, t1.c5, t1.c6, t1.c7, t1.c8
-   Remote SQL: SELECT "C