Re: [HACKERS] Postgres_fdw join pushdown - INNER - FULL OUTER join combination generating wrong result
On Tue, Mar 29, 2016 at 10:20 AM, Ashutosh Bapatwrote: >> I think the reason for that is in foreign_join_ok. This in that function: >> >> wrongly pulls up remote_conds from joining relations in the FULL JOIN >> case. I think we should not pull up such conditions in the FULL JOIN case. >> > > Right. For a full outer join, since each joining relation acts as outer for > the other, we can not pull up the quals to either join clauses or other > clauses. So, in such a case, we will need to encapsulate the joining > relation with conditions into a subquery. Unfortunately, the current deparse > logic does not handle this encapsulation. Adding that functionality so close > to the feature freeze might be risky given the amount of code changes > required. > > PFA patch with a quick fix. A full outer join with either of the joining > relations having WHERE conditions (or other clauses) is not pushed down. In > the particular case that was reported, the bug triggered because of the way > conditions are handled for an inner join. For an inner join, all the > conditions in ON as well as WHERE clause are treated like they are part of > WHERE clause. This allows pushing down a join even if there are unpushable > join clauses. But the pushable conditions can be put back into the ON > clause. This avoids using subqueries while deparsing. Committed. I think we should introduce subquery-based deparsing for 9.7, but I agree it's better not to do it now. I think we should try to handle SEMI and ANTI joins that way, too. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Postgres_fdw join pushdown - INNER - FULL OUTER join combination generating wrong result
On 2016/04/14 15:20, Ashutosh Bapat wrote: On Thu, Apr 14, 2016 at 8:42 AM, Etsuro Fujita> wrote: As you mentioned, we could support FULL JOIN fully, by encapsulating a joining relation with conditions into a subquery. And ISTM that it is relatively easy to do that, by borrowing some ideas from Hanada-san's original join pushdown patch. If it's OK, I'll create a patch for that in a few days. In his patch the deparsing targetlist and conditions required that the FROM clause entries were ready with the columns from base relations and joins realiased. That's no more true. We deparse every Var node as . where relation alias is nothing but rN; N being index of RangeTblEntry. So, Hanada-san's method to deparse recursively can not be applied as such now. I think so, too. I don't think his ideas could be applied as is. Here's what needs to be done: When we identify that certain relation (base or join) needs a subquery to be deparsed (because the join relation above it could not pull the quals up), we remember it in the upper join relation. Before deparsing 1. we walk the join tree and collect targetlists of all such relations, 2. associate column aliases with those targetlists (save the column alises in resname?) and craft a relation alias 3. associate the relations alias, column aliases and targetlists with the base relations involved in such relations (may be creating a list similar to rtable). While deparsing a Var node, we check if corresponding base relation is itself or part of a relation deparsed as a subquery. If it is then we lookup that Var in the targetlist associated with the base relation and use corresponding relation and column alias for deparsing it. Otherwise, deparse it as . usually. Good to know. That is what I have in mind, except for the way of collecting subqueries' columns and associating those columns with relation and column aliases, which I think can be done more easily. Please find attached a WIP patch. That patch works well at least for queries in your patch. Maybe I'm missing something, though. That looks like a big code change to go after feature freeze. So, we will leave it for 9.7, unless RMT allows us to introduce that change. OK Best regards, Etsuro Fujita *** a/contrib/postgres_fdw/deparse.c --- b/contrib/postgres_fdw/deparse.c *** *** 88,93 typedef struct foreign_loc_cxt --- 88,105 } foreign_loc_cxt; /* + * Structure for information on subqueries' column aliases + */ + typedef struct ColumnAliases + { + List *exprs; /* subqueries' columns */ + int max_exprs; /* maximum number of columns stored */ + int *ssno; /* table alias numbers of columns */ + int *sscolno; /* column alias numbers of columns */ + int next_ssno; /* next subquery's table alias number */ + } ColumnAliases; + + /* * Context for deparseExpr */ typedef struct deparse_expr_cxt *** *** 96,101 typedef struct deparse_expr_cxt --- 108,114 RelOptInfo *foreignrel; /* the foreign relation we are planning for */ StringInfo buf; /* output buffer to append to */ List **params_list; /* exprs that will become remote Params */ + ColumnAliases *colaliases; /* subqueries' column aliases */ } deparse_expr_cxt; #define REL_ALIAS_PREFIX "r" *** *** 103,108 typedef struct deparse_expr_cxt --- 116,124 #define ADD_REL_QUALIFIER(buf, varno) \ appendStringInfo((buf), "%s%d.", REL_ALIAS_PREFIX, (varno)) + #define SS_ALIAS_PREFIX "ss" + #define SSCOL_ALIAS_PREFIX "c" + /* * Functions to determine whether an expression can be evaluated safely on * remote server. *** *** 152,164 static void printRemoteParam(int paramindex, Oid paramtype, int32 paramtypmod, deparse_expr_cxt *context); static void printRemotePlaceholder(Oid paramtype, int32 paramtypmod, deparse_expr_cxt *context); ! static void deparseSelectSql(List *tlist, List **retrieved_attrs, deparse_expr_cxt *context); static void deparseLockingClause(deparse_expr_cxt *context); static void appendOrderByClause(List *pathkeys, deparse_expr_cxt *context); static void appendConditions(List *exprs, deparse_expr_cxt *context); static void deparseFromExprForRel(StringInfo buf, PlannerInfo *root, ! RelOptInfo *joinrel, bool use_alias, List **params_list); /* --- 168,188 deparse_expr_cxt *context); static void printRemotePlaceholder(Oid paramtype, int32 paramtypmod, deparse_expr_cxt *context); ! static void deparseSelectSql(List *tlist, ! List *remote_conds, ! List **retrieved_attrs, deparse_expr_cxt *context); static void deparseLockingClause(deparse_expr_cxt *context); static void appendOrderByClause(List *pathkeys, deparse_expr_cxt *context); static void appendConditions(List *exprs, deparse_expr_cxt *context);
Re: [HACKERS] Postgres_fdw join pushdown - INNER - FULL OUTER join combination generating wrong result
On Thu, Apr 14, 2016 at 8:42 AM, Etsuro Fujitawrote: > On 2016/03/29 23:20, Ashutosh Bapat wrote: > >> I think the reason for that is in foreign_join_ok. This in that >> function: >> >> wrongly pulls up remote_conds from joining relations in the FULL >> JOIN case. I think we should not pull up such conditions in the >> FULL JOIN case. >> > > Right. For a full outer join, since each joining relation acts as outer >> for the other, we can not pull up the quals to either join clauses or >> other clauses. So, in such a case, we will need to encapsulate the >> joining relation with conditions into a subquery. Unfortunately, the >> current deparse logic does not handle this encapsulation. Adding that >> functionality so close to the feature freeze might be risky given the >> amount of code changes required. >> >> PFA patch with a quick fix. A full outer join with either of the joining >> relations having WHERE conditions (or other clauses) is not pushed down. >> In the particular case that was reported, the bug triggered because of >> the way conditions are handled for an inner join. For an inner join, all >> the conditions in ON as well as WHERE clause are treated like they are >> part of WHERE clause. This allows pushing down a join even if there are >> unpushable join clauses. But the pushable conditions can be put back >> into the ON clause. This avoids using subqueries while deparsing. >> > > I'm not sure that is a good idea. As you mentioned, we could support FULL > JOIN fully, by encapsulating a joining relation with conditions into a > subquery. And ISTM that it is relatively easy to do that, by borrowing > some ideas from Hanada-san's original join pushdown patch. If it's OK, > I'll create a patch for that in a few days. > In his patch the deparsing targetlist and conditions required that the FROM clause entries were ready with the columns from base relations and joins realiased. That's no more true. We deparse every Var node as . where relation alias is nothing but rN; N being index of RangeTblEntry. So, Hanada-san's method to deparse recursively can not be applied as such now. Here's what needs to be done: When we identify that certain relation (base or join) needs a subquery to be deparsed (because the join relation above it could not pull the quals up), we remember it in the upper join relation. Before deparsing 1. we walk the join tree and collect targetlists of all such relations, 2. associate column aliases with those targetlists (save the column alises in resname?) and craft a relation alias 3. associate the relations alias, column aliases and targetlists with the base relations involved in such relations (may be creating a list similar to rtable). While deparsing a Var node, we check if corresponding base relation is itself or part of a relation deparsed as a subquery. If it is then we lookup that Var in the targetlist associated with the base relation and use corresponding relation and column alias for deparsing it. Otherwise, deparse it as . usually. That looks like a big code change to go after feature freeze. So, we will leave it for 9.7, unless RMT allows us to introduce that change. -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company
Re: [HACKERS] Postgres_fdw join pushdown - INNER - FULL OUTER join combination generating wrong result
On 2016/03/29 23:20, Ashutosh Bapat wrote: I think the reason for that is in foreign_join_ok. This in that function: wrongly pulls up remote_conds from joining relations in the FULL JOIN case. I think we should not pull up such conditions in the FULL JOIN case. Right. For a full outer join, since each joining relation acts as outer for the other, we can not pull up the quals to either join clauses or other clauses. So, in such a case, we will need to encapsulate the joining relation with conditions into a subquery. Unfortunately, the current deparse logic does not handle this encapsulation. Adding that functionality so close to the feature freeze might be risky given the amount of code changes required. PFA patch with a quick fix. A full outer join with either of the joining relations having WHERE conditions (or other clauses) is not pushed down. In the particular case that was reported, the bug triggered because of the way conditions are handled for an inner join. For an inner join, all the conditions in ON as well as WHERE clause are treated like they are part of WHERE clause. This allows pushing down a join even if there are unpushable join clauses. But the pushable conditions can be put back into the ON clause. This avoids using subqueries while deparsing. I'm not sure that is a good idea. As you mentioned, we could support FULL JOIN fully, by encapsulating a joining relation with conditions into a subquery. And ISTM that it is relatively easy to do that, by borrowing some ideas from Hanada-san's original join pushdown patch. If it's OK, I'll create a patch for that in a few days. Sorry for speaking up late. Best regards, Etsuro Fujita -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Postgres_fdw join pushdown - INNER - FULL OUTER join combination generating wrong result
Thanks Ashutosh for the patch. I have applied and tested it. Now getting proper result for reported issue. Thanks & Regards, Rajkumar Raghuwanshi QMG, EnterpriseDB Corporation On Tue, Mar 29, 2016 at 7:50 PM, Ashutosh Bapat < ashutosh.ba...@enterprisedb.com> wrote: > > Observation:_ Inner join and full outer join combination on a table >>> >>> generating wrong result. >>> >>> SELECT * FROM lt; >>> c1 >>> >>>1 >>>2 >>> (2 rows) >>> >>> SELECT * FROM ft; >>> c1 >>> >>>1 >>>2 >>> (2 rows) >>> >>> \d+ ft >>> Foreign table "public.ft" >>> Column | Type | Modifiers | FDW Options | Storage | Stats target | >>> Description >>> >>> +-+---+-+-+--+- >>> c1 | integer | | | plain | | >>> Server: link_server >>> FDW Options: (table_name 'lt') >>> >>> --inner join and full outer join on local tables >>> SELECT t1.c1,t2.c1,t3.c1 FROM lt t1 INNER JOIN lt t2 ON (t1.c1 = t2.c1) >>> FULL JOIN lt t3 ON (t2.c1 = t3.c1); >>> c1 | c1 | c1 >>> ++ >>>1 | 1 | 1 >>>2 | 2 | 2 >>> (2 rows) >>> >>> --inner join and full outer join on corresponding foreign tables >>> SELECT t1.c1,t2.c1,t3.c1 FROM ft t1 INNER JOIN ft t2 ON (t1.c1 = t2.c1) >>> FULL JOIN ft t3 ON (t2.c1 = t3.c1); >>> c1 | c1 | c1 >>> ++ >>>1 | 1 | 1 >>>1 | 2 | >>>2 | 1 | >>>2 | 2 | 2 >>> (4 rows) >>> >> > Thanks Rajkumar for the detailed report. > > >> >> I think the reason for that is in foreign_join_ok. This in that function: >> >> wrongly pulls up remote_conds from joining relations in the FULL JOIN >> case. I think we should not pull up such conditions in the FULL JOIN case. >> >> > Right. For a full outer join, since each joining relation acts as outer > for the other, we can not pull up the quals to either join clauses or other > clauses. So, in such a case, we will need to encapsulate the joining > relation with conditions into a subquery. Unfortunately, the current > deparse logic does not handle this encapsulation. Adding that functionality > so close to the feature freeze might be risky given the amount of code > changes required. > > PFA patch with a quick fix. A full outer join with either of the joining > relations having WHERE conditions (or other clauses) is not pushed down. In > the particular case that was reported, the bug triggered because of the way > conditions are handled for an inner join. For an inner join, all the > conditions in ON as well as WHERE clause are treated like they are part of > WHERE clause. This allows pushing down a join even if there are unpushable > join clauses. But the pushable conditions can be put back into the ON > clause. This avoids using subqueries while deparsing. > > -- > Best Wishes, > Ashutosh Bapat > EnterpriseDB Corporation > The Postgres Database Company >
Re: [HACKERS] Postgres_fdw join pushdown - INNER - FULL OUTER join combination generating wrong result
> Observation:_ Inner join and full outer join combination on a table >> >> generating wrong result. >> >> SELECT * FROM lt; >> c1 >> >>1 >>2 >> (2 rows) >> >> SELECT * FROM ft; >> c1 >> >>1 >>2 >> (2 rows) >> >> \d+ ft >> Foreign table "public.ft" >> Column | Type | Modifiers | FDW Options | Storage | Stats target | >> Description >> >> +-+---+-+-+--+- >> c1 | integer | | | plain | | >> Server: link_server >> FDW Options: (table_name 'lt') >> >> --inner join and full outer join on local tables >> SELECT t1.c1,t2.c1,t3.c1 FROM lt t1 INNER JOIN lt t2 ON (t1.c1 = t2.c1) >> FULL JOIN lt t3 ON (t2.c1 = t3.c1); >> c1 | c1 | c1 >> ++ >>1 | 1 | 1 >>2 | 2 | 2 >> (2 rows) >> >> --inner join and full outer join on corresponding foreign tables >> SELECT t1.c1,t2.c1,t3.c1 FROM ft t1 INNER JOIN ft t2 ON (t1.c1 = t2.c1) >> FULL JOIN ft t3 ON (t2.c1 = t3.c1); >> c1 | c1 | c1 >> ++ >>1 | 1 | 1 >>1 | 2 | >>2 | 1 | >>2 | 2 | 2 >> (4 rows) >> > Thanks Rajkumar for the detailed report. > > I think the reason for that is in foreign_join_ok. This in that function: > > wrongly pulls up remote_conds from joining relations in the FULL JOIN > case. I think we should not pull up such conditions in the FULL JOIN case. > > Right. For a full outer join, since each joining relation acts as outer for the other, we can not pull up the quals to either join clauses or other clauses. So, in such a case, we will need to encapsulate the joining relation with conditions into a subquery. Unfortunately, the current deparse logic does not handle this encapsulation. Adding that functionality so close to the feature freeze might be risky given the amount of code changes required. PFA patch with a quick fix. A full outer join with either of the joining relations having WHERE conditions (or other clauses) is not pushed down. In the particular case that was reported, the bug triggered because of the way conditions are handled for an inner join. For an inner join, all the conditions in ON as well as WHERE clause are treated like they are part of WHERE clause. This allows pushing down a join even if there are unpushable join clauses. But the pushable conditions can be put back into the ON clause. This avoids using subqueries while deparsing. -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out index 50f1261..5c4ebb6 100644 --- a/contrib/postgres_fdw/expected/postgres_fdw.out +++ b/contrib/postgres_fdw/expected/postgres_fdw.out @@ -441,31 +441,31 @@ SELECT t1.c1, t2."C 1" FROM ft2 t1 LEFT JOIN "S 1"."T 1" t2 ON (t1.c1 = t2."C 1" 107 | 107 108 | 108 109 | 109 110 | 110 (10 rows) -- A join between local table and foreign join. ORDER BY clause is added to the -- foreign join so that the local table can be joined using merge join strategy. EXPLAIN (COSTS false, VERBOSE) SELECT t1."C 1" FROM "S 1"."T 1" t1 left join ft1 t2 join ft2 t3 on (t2.c1 = t3.c1) on (t3.c1 = t1."C 1") OFFSET 100 LIMIT 10; - QUERY PLAN - + QUERY PLAN +- Limit Output: t1."C 1" -> Merge Right Join Output: t1."C 1" Merge Cond: (t3.c1 = t1."C 1") -> Foreign Scan Output: t3.c1 Relations: (public.ft1 t2) INNER JOIN (public.ft2 t3) - Remote SQL: SELECT r3."C 1" FROM ("S 1"."T 1" r2 INNER JOIN "S 1"."T 1" r3 ON (TRUE)) WHERE ((r2."C 1" = r3."C 1")) ORDER BY r2."C 1" ASC NULLS LAST + Remote SQL: SELECT r3."C 1" FROM ("S 1"."T 1" r2 INNER JOIN "S 1"."T 1" r3 ON (((r2."C 1" = r3."C 1" ORDER BY r2."C 1" ASC NULLS LAST -> Index Only Scan using t1_pkey on "S 1"."T 1" t1 Output: t1."C 1" (11 rows) SELECT t1."C 1" FROM "S 1"."T 1" t1 left join ft1 t2 join ft2 t3 on (t2.c1 = t3.c1) on (t3.c1 = t1."C 1") OFFSET 100 LIMIT 10; C 1 - 101 102 103 @@ -896,59 +896,59 @@ SELECT count(c3) FROM ft1 t1 WHERE t1.c1 === t1.c2; -- === -- JOIN queries --
Re: [HACKERS] Postgres_fdw join pushdown - INNER - FULL OUTER join combination generating wrong result
On 2016/03/28 18:17, Rajkumar Raghuwanshi wrote: I am testing postgres_fdw join pushdown feature for PostgreSQL 9.6 DB, and I observed below issue._ Observation:_ Inner join and full outer join combination on a table generating wrong result. SELECT * FROM lt; c1 1 2 (2 rows) SELECT * FROM ft; c1 1 2 (2 rows) \d+ ft Foreign table "public.ft" Column | Type | Modifiers | FDW Options | Storage | Stats target | Description +-+---+-+-+--+- c1 | integer | | | plain | | Server: link_server FDW Options: (table_name 'lt') --inner join and full outer join on local tables SELECT t1.c1,t2.c1,t3.c1 FROM lt t1 INNER JOIN lt t2 ON (t1.c1 = t2.c1) FULL JOIN lt t3 ON (t2.c1 = t3.c1); c1 | c1 | c1 ++ 1 | 1 | 1 2 | 2 | 2 (2 rows) --inner join and full outer join on corresponding foreign tables SELECT t1.c1,t2.c1,t3.c1 FROM ft t1 INNER JOIN ft t2 ON (t1.c1 = t2.c1) FULL JOIN ft t3 ON (t2.c1 = t3.c1); c1 | c1 | c1 ++ 1 | 1 | 1 1 | 2 | 2 | 1 | 2 | 2 | 2 (4 rows) I think the reason for that is in foreign_join_ok. This in that function: switch (jointype) { case JOIN_INNER: fpinfo->remote_conds = list_concat(fpinfo->remote_conds, list_copy(fpinfo_i->remote_conds)); fpinfo->remote_conds = list_concat(fpinfo->remote_conds, list_copy(fpinfo_o->remote_conds)); break; case JOIN_LEFT: fpinfo->joinclauses = list_concat(fpinfo->joinclauses, list_copy(fpinfo_i->remote_conds)); fpinfo->remote_conds = list_concat(fpinfo->remote_conds, list_copy(fpinfo_o->remote_conds)); break; case JOIN_RIGHT: fpinfo->joinclauses = list_concat(fpinfo->joinclauses, list_copy(fpinfo_o->remote_conds)); fpinfo->remote_conds = list_concat(fpinfo->remote_conds, list_copy(fpinfo_i->remote_conds)); break; case JOIN_FULL: fpinfo->joinclauses = list_concat(fpinfo->joinclauses, list_copy(fpinfo_i->remote_conds)); fpinfo->joinclauses = list_concat(fpinfo->joinclauses, list_copy(fpinfo_o->remote_conds)); break; default: /* Should not happen, we have just check this above */ elog(ERROR, "unsupported join type %d", jointype); } wrongly pulls up remote_conds from joining relations in the FULL JOIN case. I think we should not pull up such conditions in the FULL JOIN case. Best regards, Etsuro Fujita -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Postgres_fdw join pushdown - INNER - FULL OUTER join combination generating wrong result
Hi, I am testing postgres_fdw join pushdown feature for PostgreSQL 9.6 DB, and I observed below issue. *Observation:* Inner join and full outer join combination on a table generating wrong result. SELECT * FROM lt; c1 1 2 (2 rows) SELECT * FROM ft; c1 1 2 (2 rows) \d+ ft Foreign table "public.ft" Column | Type | Modifiers | FDW Options | Storage | Stats target | Description +-+---+-+-+--+- c1 | integer | | | plain | | Server: link_server FDW Options: (table_name 'lt') --inner join and full outer join on local tables SELECT t1.c1,t2.c1,t3.c1 FROM lt t1 INNER JOIN lt t2 ON (t1.c1 = t2.c1) FULL JOIN lt t3 ON (t2.c1 = t3.c1); c1 | c1 | c1 ++ 1 | 1 | 1 2 | 2 | 2 (2 rows) --inner join and full outer join on corresponding foreign tables SELECT t1.c1,t2.c1,t3.c1 FROM ft t1 INNER JOIN ft t2 ON (t1.c1 = t2.c1) FULL JOIN ft t3 ON (t2.c1 = t3.c1); c1 | c1 | c1 ++ 1 | 1 | 1 1 | 2 | 2 | 1 | 2 | 2 | 2 (4 rows) Thanks & Regards, Rajkumar Raghuwanshi QMG, EnterpriseDB Corporation >