Re: [HACKERS] Push down more UPDATEs/DELETEs in postgres_fdw
> On 08 Apr 2017, at 16:14, David Steelewrote: > > On 3/22/17 6:20 AM, Etsuro Fujita wrote: >> On 2017/02/22 19:57, Rushabh Lathia wrote: >>> Marked this as Ready for Committer. >> >> I noticed that this item in the CF app was incorrectly marked as >> Committed. This patch isn't committed, so I returned it to the previous >> status. I also rebased the patch. Attached is a new version of the patch. > > This submission has been moved to CF 2017-07. This patch has been marked Ready for Committer in the current commitfest without being committed or rejected. Moving to the next commitfest, but since it has bitrotted I’m moving it to Waiting for author. cheers ./daniel -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Push down more UPDATEs/DELETEs in postgres_fdw
On 3/22/17 6:20 AM, Etsuro Fujita wrote: > On 2017/02/22 19:57, Rushabh Lathia wrote: >> Marked this as Ready for Committer. > > I noticed that this item in the CF app was incorrectly marked as > Committed. This patch isn't committed, so I returned it to the previous > status. I also rebased the patch. Attached is a new version of the patch. This submission has been moved to CF 2017-07. -- -David da...@pgmasters.net -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Push down more UPDATEs/DELETEs in postgres_fdw
On Wed, Mar 22, 2017 at 6:20 AM, Etsuro Fujitawrote: > On 2017/02/22 19:57, Rushabh Lathia wrote: >> Marked this as Ready for Committer. > > I noticed that this item in the CF app was incorrectly marked as Committed. > This patch isn't committed, so I returned it to the previous status. I also > rebased the patch. Attached is a new version of the patch. Sorry, I marked the wrong patch as committed. Apologies for that. This doesn't apply any more because of recent changes. git diff --check complains: contrib/postgres_fdw/postgres_fdw.c:3653: space before tab in indent. +/* Shouldn't contain the target relation. */ +Assert(target_rel == 0); This comment should give a reason. void deparseDirectUpdateSql(StringInfo buf, PlannerInfo *root, Index rtindex, Relation rel, + RelOptInfo *foreignrel, List *targetlist, List *targetAttrs, List *remote_conds, Could you add a comment explaining the meaning of these various arguments? It takes rtindex, rel, and foreignrel, which apparently are all different things, but the meaning is not explained. /* + * Add a RETURNING clause, if needed, to an UPDATE/DELETE on a join. + */ +static void +deparseExplicitReturningList(List *rlist, + List **retrieved_attrs, + deparse_expr_cxt *context) +{ +deparseExplicitTargetList(rlist, true, retrieved_attrs, context); +} Do we really want to add a function for one line of code? +/* + * Look for conditions mentioning the target relation in the given join tree, + * which will be pulled up into the WHERE clause. Note that this is safe due + * to the same reason stated in comments in deparseFromExprForRel. + */ The comments for deparseFromExprForRel do not seem to address the topic of why this is safe. Also, the answer to the question "safe from what?" is not clear. -deparseReturningList(buf, root, rtindex, rel, false, - returningList, retrieved_attrs); +if (foreignrel->reloptkind == RELOPT_JOINREL) +deparseExplicitReturningList(returningList, retrieved_attrs, ); +else +deparseReturningList(buf, root, rtindex, rel, false, + returningList, retrieved_attrs); Why do these cases need to be handled differently? Maybe add a brief comment? +if ((outerrel->reloptkind == RELOPT_BASEREL && + outerrel->relid == target_rel) || +(innerrel->reloptkind == RELOPT_BASEREL && + innerrel->relid == target_rel)) 1. Surely it's redundant to check the RelOptKind if the RTI matches? 2. Generally, the tests in this patch against various RelOptKind values should be adapted to use the new macros introduced in 7a39b5e4d11229ece930a51fd7cb29e535db4494. The regression tests remove every remaining case where an update or delete gets fails to get pushed to the remote side. I think we should still test that path, because we've still got that code. Maybe use a non-pushable function in the join clause, or something. The new test cases could use some brief comments explaining their purpose. if (plan->returningLists) +{ returningList = (List *) list_nth(plan->returningLists, subplan_index); +/* + * If UPDATE/DELETE on a join, create a RETURNING list used in the + * remote query. + */ +if (fscan->scan.scanrelid == 0) +returningList = make_explicit_returning_list(resultRelation, + rel, + returningList); +} Again, the comment doesn't really explain why we're doing this. And initializing returningList twice in a row seems strange, too. I am unfortunately too tired to finish properly reviewing this tonight. :-( -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Push down more UPDATEs/DELETEs in postgres_fdw
On 2017/02/22 19:57, Rushabh Lathia wrote: Marked this as Ready for Committer. I noticed that this item in the CF app was incorrectly marked as Committed. This patch isn't committed, so I returned it to the previous status. I also rebased the patch. Attached is a new version of the patch. Best regards, Etsuro Fujita postgres-fdw-more-update-pushdown-v4.patch Description: binary/octet-stream -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Push down more full joins in postgres_fdw
On 2017/03/17 2:35, Robert Haas wrote: And ... I don't see anything to complain about, so, committed. Thanks for committing, Robert! Thanks for reviewing, Ashutosh and David! Best regards, Etsuro Fujita -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Push down more full joins in postgres_fdw
On Thu, Mar 16, 2017 at 12:48 PM, Robert Haaswrote: > On Thu, Mar 16, 2017 at 11:46 AM, David Steele wrote: >> $ patch -p1 < ../other/postgres-fdw-subquery-support-v15.patch >> patching file contrib/postgres_fdw/deparse.c >> Hunk #11 succeeded at 1371 (offset -3 lines). >> Hunk #12 succeeded at 1419 (offset -3 lines). >> Hunk #13 succeeded at 1486 (offset -3 lines). >> Hunk #14 succeeded at 2186 (offset -3 lines). >> Hunk #15 succeeded at 3082 (offset -3 lines). >> patching file contrib/postgres_fdw/expected/postgres_fdw.out >> patching file contrib/postgres_fdw/postgres_fdw.c >> Hunk #1 succeeded at 669 (offset 1 line). >> Hunk #2 succeeded at 1245 (offset -1 lines). >> Hunk #3 succeeded at 2557 (offset -1 lines). >> Hunk #4 succeeded at 4157 (offset 3 lines). >> Hunk #5 succeeded at 4183 (offset 3 lines). >> Hunk #6 succeeded at 4212 (offset 3 lines). >> Hunk #7 succeeded at 4315 (offset 3 lines). >> patching file contrib/postgres_fdw/postgres_fdw.h >> patching file contrib/postgres_fdw/sql/postgres_fdw.sql >> >> Since these are just offsets I'll leave the patch as "Needs review" but >> an updated patch would be appreciated. > > I don't think that's really needed. Offsets don't hurt anything. > Even fuzz is OK. As long as the hunks are applying, I think it's > fine. > > Incidentally, I'm reading through this one now. And ... I don't see anything to complain about, so, committed. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Push down more full joins in postgres_fdw
On Thu, Mar 16, 2017 at 11:46 AM, David Steelewrote: > $ patch -p1 < ../other/postgres-fdw-subquery-support-v15.patch > patching file contrib/postgres_fdw/deparse.c > Hunk #11 succeeded at 1371 (offset -3 lines). > Hunk #12 succeeded at 1419 (offset -3 lines). > Hunk #13 succeeded at 1486 (offset -3 lines). > Hunk #14 succeeded at 2186 (offset -3 lines). > Hunk #15 succeeded at 3082 (offset -3 lines). > patching file contrib/postgres_fdw/expected/postgres_fdw.out > patching file contrib/postgres_fdw/postgres_fdw.c > Hunk #1 succeeded at 669 (offset 1 line). > Hunk #2 succeeded at 1245 (offset -1 lines). > Hunk #3 succeeded at 2557 (offset -1 lines). > Hunk #4 succeeded at 4157 (offset 3 lines). > Hunk #5 succeeded at 4183 (offset 3 lines). > Hunk #6 succeeded at 4212 (offset 3 lines). > Hunk #7 succeeded at 4315 (offset 3 lines). > patching file contrib/postgres_fdw/postgres_fdw.h > patching file contrib/postgres_fdw/sql/postgres_fdw.sql > > Since these are just offsets I'll leave the patch as "Needs review" but > an updated patch would be appreciated. I don't think that's really needed. Offsets don't hurt anything. Even fuzz is OK. As long as the hunks are applying, I think it's fine. Incidentally, I'm reading through this one now. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Push down more full joins in postgres_fdw
On 1/30/17 6:30 AM, Etsuro Fujita wrote: > On 2017/01/27 21:25, Etsuro Fujita wrote: >> On 2017/01/27 20:04, Ashutosh Bapat wrote: >>> I think we should pick up your patch on >>> 27th December, update the comment per your mail on 5th Jan. I will >>> review it once and list down the things left to committer's judgement. > >> Sorry, I started thinking we went in the wrong direction. I added to >> deparseSelectStmtForRel build_subquery_tlists, which creates a tlist for >> each subquery present in a given join tree prior to deparsing a whole >> remote query. But that's nothing but an overhead; we need to create a >> tlist for the top-level query because we use it as fdw_scan_tlist, but >> not for subqueries, and we need to create retrieved_attrs for the >> top-level query while deparsing the targetlist in >> deparseExplicitTargetList, but not for subqueries. So, we should need >> some work to avoid such a useless overhead. > > I think we can avoid the useless overhead by adding a new function, > deparseSubqueryTargetList, that deparses expressions in the given > relation's reltarget, not the tlist, as a SELECT clause of the subquery > representing the given relation. That would also allow us to make the > 1-to-1 relationship between the subquery output columns and their > aliases more explicit, which was your original comment. Please find > attached the new version. (The patch doesn't need the patch to avoid > duplicate construction of the tlist, discussed upthread.) > > Other changes: > * I went back to make_outerrel_subquery and make_innerrel_subquery, > which are flags to indicate whether to deparse the input relations as > subqueries. is_subquery_rel would work well for handling the cases of > full joins with restrictions on the input relations, but I noticed that > that wouldn't work well when extending to handle the cases where we > deparse the input relations as subqueries to evaluate PHVs remotely. > * Since appendSubqueryAlias in the previous patch is pretty small, I > included the code into deparseRangeTblRef. This patch does not apply cleanly at cccbdde: $ patch -p1 < ../other/postgres-fdw-subquery-support-v15.patch patching file contrib/postgres_fdw/deparse.c Hunk #11 succeeded at 1371 (offset -3 lines). Hunk #12 succeeded at 1419 (offset -3 lines). Hunk #13 succeeded at 1486 (offset -3 lines). Hunk #14 succeeded at 2186 (offset -3 lines). Hunk #15 succeeded at 3082 (offset -3 lines). patching file contrib/postgres_fdw/expected/postgres_fdw.out patching file contrib/postgres_fdw/postgres_fdw.c Hunk #1 succeeded at 669 (offset 1 line). Hunk #2 succeeded at 1245 (offset -1 lines). Hunk #3 succeeded at 2557 (offset -1 lines). Hunk #4 succeeded at 4157 (offset 3 lines). Hunk #5 succeeded at 4183 (offset 3 lines). Hunk #6 succeeded at 4212 (offset 3 lines). Hunk #7 succeeded at 4315 (offset 3 lines). patching file contrib/postgres_fdw/postgres_fdw.h patching file contrib/postgres_fdw/sql/postgres_fdw.sql Since these are just offsets I'll leave the patch as "Needs review" but an updated patch would be appreciated. Thanks, -- -David da...@pgmasters.net -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Push down more UPDATEs/DELETEs in postgres_fdw
On Wed, Feb 22, 2017 at 12:15 PM, Etsuro Fujitawrote: > On 2017/02/21 19:31, Rushabh Lathia wrote: > >> On Mon, Feb 20, 2017 at 1:41 PM, Etsuro Fujita >> > wrote: >> > > On 2017/02/13 18:24, Rushabh Lathia wrote: >> Here are few comments: >> >> 1) >> >> @@ -211,6 +211,12 @@ typedef struct PgFdwDirectModifyState >> PGresult *result;/* result for query */ >> intnum_tuples;/* # of result tuples */ >> intnext_tuple;/* index of next one to >> return */ >> +RelationresultRel;/* relcache entry for the >> target table */ >> >> >> Why we need resultRel? Can't we directly use dmstate->rel ? >> >> >> The reason why we need that is because in get_returning_data, we >> pass dmstate->rel to make_tuple_from_result_row, which requires that >> dmstate->rel be NULL when the scan tuple is described by >> fdw_scan_tlist. So in that case we set dmstate->rel to NULL and >> have dmstate->resultRel that is the relcache entry for the target >> relation in postgresBeginDirectModify. >> > > Thanks for the explanation. We might do something here by using >> fdw_scan_tlist or changing the assumption of >> make_tuple_from_result_row(), and that way we can avoid two similar >> variable pointer in the PgFdwDirectModifyState. >> > > I agree that the two similar variables are annoying, to some extent, but > ISTM that is not that bad because that makes the handling of dmstate->rel > consistent with that of PgFdwScanState's rel. As you know, > PgFdwDirectModifyState is defined in a similar way as PgFdwScanState, in > which the rel is a relcache entry for the foreign table for a simple > foreign table scan and NULL for a foreign join scan (see comments for the > definition of PgFdwScanState). > > I am okay with currently also, but it adding a note somewhere about this >> would be great. Also let keep this point open for the committer, if >> committer feel this is good then lets go ahead with this. >> > > Agreed. > > Thanks. Marked this as Ready for Committer. > Here are few other cosmetic changes: >> >> 1) >> >> + * >> + * 'target_rel' is either zero or the rangetable index of a target >> relation. >> + * In the latter case this construncts FROM clause of UPDATE or USING >> clause >> + * of DELETE by simply ignoring the target relation while deparsing the >> given >> >> Spell correction: - construncts >> >> 2) >> >> +/* >> + * If either input is the target relation, get all the >> joinclauses. >> + * Otherwise extract conditions mentioning the target relation >> from >> + * the joinclauses. >> + */ >> >> >> space between joinclauses needed. >> >> 3) >> >> +/* >> + * If UPDATE/DELETE on a join, create a RETURINING list used in >> the >> + * remote query. >> + */ >> +if (fscan->scan.scanrelid == 0) >> +returningList = make_explicit_returning_list(resultRelation, >> + rel, >> + returningList); >> >> >> Spell correction: RETURINING >> > > Good catch! > > I did above changes in the attached patch. Please have a look once and >> > > I'm fine with that. Thanks for the patch! > > then I feel like this patch is ready for committer. >> > > Thanks for reviewing! > > Best regards, > Etsuro Fujita > > > -- Rushabh Lathia
Re: [HACKERS] Push down more UPDATEs/DELETEs in postgres_fdw
On 2017/02/21 19:31, Rushabh Lathia wrote: On Mon, Feb 20, 2017 at 1:41 PM, Etsuro Fujita> wrote: On 2017/02/13 18:24, Rushabh Lathia wrote: Here are few comments: 1) @@ -211,6 +211,12 @@ typedef struct PgFdwDirectModifyState PGresult *result;/* result for query */ intnum_tuples;/* # of result tuples */ intnext_tuple;/* index of next one to return */ +RelationresultRel;/* relcache entry for the target table */ Why we need resultRel? Can't we directly use dmstate->rel ? The reason why we need that is because in get_returning_data, we pass dmstate->rel to make_tuple_from_result_row, which requires that dmstate->rel be NULL when the scan tuple is described by fdw_scan_tlist. So in that case we set dmstate->rel to NULL and have dmstate->resultRel that is the relcache entry for the target relation in postgresBeginDirectModify. Thanks for the explanation. We might do something here by using fdw_scan_tlist or changing the assumption of make_tuple_from_result_row(), and that way we can avoid two similar variable pointer in the PgFdwDirectModifyState. I agree that the two similar variables are annoying, to some extent, but ISTM that is not that bad because that makes the handling of dmstate->rel consistent with that of PgFdwScanState's rel. As you know, PgFdwDirectModifyState is defined in a similar way as PgFdwScanState, in which the rel is a relcache entry for the foreign table for a simple foreign table scan and NULL for a foreign join scan (see comments for the definition of PgFdwScanState). I am okay with currently also, but it adding a note somewhere about this would be great. Also let keep this point open for the committer, if committer feel this is good then lets go ahead with this. Agreed. Here are few other cosmetic changes: 1) + * + * 'target_rel' is either zero or the rangetable index of a target relation. + * In the latter case this construncts FROM clause of UPDATE or USING clause + * of DELETE by simply ignoring the target relation while deparsing the given Spell correction: - construncts 2) +/* + * If either input is the target relation, get all the joinclauses. + * Otherwise extract conditions mentioning the target relation from + * the joinclauses. + */ space between joinclauses needed. 3) +/* + * If UPDATE/DELETE on a join, create a RETURINING list used in the + * remote query. + */ +if (fscan->scan.scanrelid == 0) +returningList = make_explicit_returning_list(resultRelation, + rel, + returningList); Spell correction: RETURINING Good catch! I did above changes in the attached patch. Please have a look once and I'm fine with that. Thanks for the patch! then I feel like this patch is ready for committer. Thanks for reviewing! Best regards, Etsuro Fujita -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Push down more UPDATEs/DELETEs in postgres_fdw
On Mon, Feb 20, 2017 at 1:41 PM, Etsuro Fujitawrote: > On 2017/02/13 18:24, Rushabh Lathia wrote: > >> I started reviewing the patch again. Patch applied cleanly on latest >> source >> as well as regression pass through with the patch. I also performed >> few manual testing and haven't found any regression. Patch look >> much cleaner the earlier version, and don't have any major concern as >> such. >> > > Thanks for the review! > > Here are few comments: >> >> 1) >> >> @@ -211,6 +211,12 @@ typedef struct PgFdwDirectModifyState >> PGresult *result;/* result for query */ >> intnum_tuples;/* # of result tuples */ >> intnext_tuple;/* index of next one to return */ >> +RelationresultRel;/* relcache entry for the target table >> */ >> > > Why we need resultRel? Can't we directly use dmstate->rel ? >> > > The reason why we need that is because in get_returning_data, we pass > dmstate->rel to make_tuple_from_result_row, which requires that > dmstate->rel be NULL when the scan tuple is described by fdw_scan_tlist. > So in that case we set dmstate->rel to NULL and have dmstate->resultRel > that is the relcache entry for the target relation in > postgresBeginDirectModify. > > Thanks for the explanation. We might do something here by using fdw_scan_tlist or changing the assumption of make_tuple_from_result_row(), and that way we can avoid two similar variable pointer in the PgFdwDirectModifyState. I am okay with currently also, but it adding a note somewhere about this would be great. Also let keep this point open for the committer, if committer feel this is good then lets go ahead with this. Here are few other cosmetic changes: 1) + * + * 'target_rel' is either zero or the rangetable index of a target relation. + * In the latter case this construncts FROM clause of UPDATE or USING clause + * of DELETE by simply ignoring the target relation while deparsing the given Spell correction: - construncts 2) +/* + * If either input is the target relation, get all the joinclauses. + * Otherwise extract conditions mentioning the target relation from + * the joinclauses. + */ space between joinclauses needed. 3) +/* + * If UPDATE/DELETE on a join, create a RETURINING list used in the + * remote query. + */ +if (fscan->scan.scanrelid == 0) +returningList = make_explicit_returning_list(resultRelation, + rel, + returningList); Spell correction: RETURINING I did above changes in the attached patch. Please have a look once and then I feel like this patch is ready for committer. Thanks, Rushabh Lathia postgres-fdw-more-update-pushdown-v3_cosmetic_changes.patch Description: application/download -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Push down more UPDATEs/DELETEs in postgres_fdw
On 2017/02/13 18:24, Rushabh Lathia wrote: I started reviewing the patch again. Patch applied cleanly on latest source as well as regression pass through with the patch. I also performed few manual testing and haven't found any regression. Patch look much cleaner the earlier version, and don't have any major concern as such. Thanks for the review! Here are few comments: 1) @@ -211,6 +211,12 @@ typedef struct PgFdwDirectModifyState PGresult *result;/* result for query */ intnum_tuples;/* # of result tuples */ intnext_tuple;/* index of next one to return */ +RelationresultRel;/* relcache entry for the target table */ Why we need resultRel? Can't we directly use dmstate->rel ? The reason why we need that is because in get_returning_data, we pass dmstate->rel to make_tuple_from_result_row, which requires that dmstate->rel be NULL when the scan tuple is described by fdw_scan_tlist. So in that case we set dmstate->rel to NULL and have dmstate->resultRel that is the relcache entry for the target relation in postgresBeginDirectModify. 2) In the patch somewhere scanrelid condition being used as fscan->scan.scanrelid == 0 where as some place its been used as fsplan->scan.scanrelid > 0. Infact in the same function its been used differently example postgresBeginDirectModify. Can make this consistent. Ok, done. 3) + * If UPDATE/DELETE on a join, create a RETURINING list used in the remote + * query. + */ +if (fscan->scan.scanrelid == 0) +returningList = make_explicit_returning_list(resultRelation, rel, + returningList); + Above block can be moved inside the if (plan->returningLists) condition above the block. Like this: /* * Extract the relevant RETURNING list if any. */ if (plan->returningLists) { returningList = (List *) list_nth(plan->returningLists, subplan_index); /* * If UPDATE/DELETE on a join, create a RETURINING list used in the remote * query. */ if (fscan->scan.scanrelid == 0) returningList = make_explicit_returning_list(resultRelation, rel, returningList); } Done that way. Another thing I noticed is duplicate work in apply_returning_filter; it initializes tableoid of an updated/deleted tuple if needed, but the core will do that (see ExecProcessReturning). I removed that work from apply_returning_filter. I am still doing few more testing with the patch, if I will found anything apart from this I will raise that into another mail. Thanks again! Attached is an updated version of the patch. Best regards, Etsuro Fujita *** a/contrib/postgres_fdw/deparse.c --- b/contrib/postgres_fdw/deparse.c *** *** 130,142 static void deparseTargetList(StringInfo buf, Bitmapset *attrs_used, bool qualify_col, List **retrieved_attrs); ! static void deparseExplicitTargetList(List *tlist, List **retrieved_attrs, deparse_expr_cxt *context); static void deparseReturningList(StringInfo buf, PlannerInfo *root, Index rtindex, Relation rel, bool trig_after_row, List *returningList, List **retrieved_attrs); static void deparseColumnRef(StringInfo buf, int varno, int varattno, PlannerInfo *root, bool qualify_col); static void deparseRelation(StringInfo buf, Relation rel); --- 130,151 Bitmapset *attrs_used, bool qualify_col, List **retrieved_attrs); ! static void deparseExplicitTargetList(List *tlist, ! bool is_returning, ! List **retrieved_attrs, deparse_expr_cxt *context); static void deparseReturningList(StringInfo buf, PlannerInfo *root, Index rtindex, Relation rel, bool trig_after_row, List *returningList, List **retrieved_attrs); + static void deparseExplicitReturningList(List *rlist, + List **retrieved_attrs, + deparse_expr_cxt *context); + static void pull_up_target_conditions(PlannerInfo *root, RelOptInfo *foreignrel, + Index target_rel, List **target_conds); + static void extract_target_conditions(List **joinclauses, Index target_rel, + List **target_conds); static void deparseColumnRef(StringInfo buf, int varno, int varattno, PlannerInfo *root, bool qualify_col); static void deparseRelation(StringInfo buf, Relation rel); *** *** 165,171 static void deparseLockingClause(deparse_expr_cxt *context); static void appendOrderByClause(List *pathkeys, deparse_expr_cxt *context); static void appendConditions(List *exprs, deparse_expr_cxt *context); static void deparseFromExprForRel(StringInfo buf, PlannerInfo *root, ! RelOptInfo *joinrel, bool use_alias, List **params_list); static
Re: [HACKERS] Push down more UPDATEs/DELETEs in postgres_fdw
Sorry for delay in the review. I started reviewing the patch again. Patch applied cleanly on latest source as well as regression pass through with the patch. I also performed few manual testing and haven't found any regression. Patch look much cleaner the earlier version, and don't have any major concern as such. Here are few comments: 1) @@ -211,6 +211,12 @@ typedef struct PgFdwDirectModifyState PGresult *result;/* result for query */ intnum_tuples;/* # of result tuples */ intnext_tuple;/* index of next one to return */ +RelationresultRel;/* relcache entry for the target table */ Why we need resultRel? Can't we directly use dmstate->rel ? 2) In the patch somewhere scanrelid condition being used as fscan->scan.scanrelid == 0 where as some place its been used as fsplan->scan.scanrelid > 0. Infact in the same function its been used differently example postgresBeginDirectModify. Can make this consistent. 3) + * If UPDATE/DELETE on a join, create a RETURINING list used in the remote + * query. + */ +if (fscan->scan.scanrelid == 0) +returningList = make_explicit_returning_list(resultRelation, rel, + returningList); + Above block can be moved inside the if (plan->returningLists) condition above the block. Like this: /* * Extract the relevant RETURNING list if any. */ if (plan->returningLists) { returningList = (List *) list_nth(plan->returningLists, subplan_index); /* * If UPDATE/DELETE on a join, create a RETURINING list used in the remote * query. */ if (fscan->scan.scanrelid == 0) returningList = make_explicit_returning_list(resultRelation, rel, returningList); } I am still doing few more testing with the patch, if I will found anything apart from this I will raise that into another mail. Thanks, On Wed, Feb 1, 2017 at 11:50 AM, Michael Paquierwrote: > On Wed, Jan 25, 2017 at 7:20 PM, Etsuro Fujita > wrote: > > Attached is the new version of the patch. I also addressed other > comments > > from you: moved rewriting the fdw_scan_tlist to postgres_fdw.c, > > added/revised comments, and added regression tests for the case where a > > pushed down UPDATE/DELETE on a join has RETURNING. > > > > My apologies for having been late to work on this. > > Moved to CF 2017-03. > -- > Michael > -- Rushabh Lathia
Re: [HACKERS] Push down more UPDATEs/DELETEs in postgres_fdw
On Wed, Jan 25, 2017 at 7:20 PM, Etsuro Fujitawrote: > Attached is the new version of the patch. I also addressed other comments > from you: moved rewriting the fdw_scan_tlist to postgres_fdw.c, > added/revised comments, and added regression tests for the case where a > pushed down UPDATE/DELETE on a join has RETURNING. > > My apologies for having been late to work on this. Moved to CF 2017-03. -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Push down more full joins in postgres_fdw
On Mon, Jan 30, 2017 at 8:30 PM, Etsuro Fujitawrote: > Other changes: > * I went back to make_outerrel_subquery and make_innerrel_subquery, which > are flags to indicate whether to deparse the input relations as subqueries. > is_subquery_rel would work well for handling the cases of full joins with > restrictions on the input relations, but I noticed that that wouldn't work > well when extending to handle the cases where we deparse the input relations > as subqueries to evaluate PHVs remotely. > * Since appendSubqueryAlias in the previous patch is pretty small, I > included the code into deparseRangeTblRef. The patch is very fresh, so moved to CF 2017-03. -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Push down more full joins in postgres_fdw
On 2017/01/30 21:05, Ashutosh Bapat wrote: On Mon, Jan 30, 2017 at 5:00 PM, Etsuro Fujitawrote: On 2017/01/27 21:25, Etsuro Fujita wrote: Sorry, I started thinking we went in the wrong direction. I added to deparseSelectStmtForRel build_subquery_tlists, which creates a tlist for each subquery present in a given join tree prior to deparsing a whole remote query. But that's nothing but an overhead; we need to create a tlist for the top-level query because we use it as fdw_scan_tlist, but not for subqueries, and we need to create retrieved_attrs for the top-level query while deparsing the targetlist in deparseExplicitTargetList, but not for subqueries. So, we should need some work to avoid such a useless overhead. I think we can avoid the useless overhead by adding a new function, deparseSubqueryTargetList, that deparses expressions in the given relation's reltarget, not the tlist, as a SELECT clause of the subquery representing the given relation. That would also allow us to make the 1-to-1 relationship between the subquery output columns and their aliases more explicit, which was your original comment. Please find attached the new version. (The patch doesn't need the patch to avoid duplicate construction of the tlist, discussed upthread.) I have not looked at the patch, but the reason we use a tlist instead of list of expressions is because fdw_scan_tlist is expected in that form Actually, we wouldn't need that to deparse a remote join query; a list of expressions would be enough. and we don't want two different representations one to deparse SELECT and one to interpret results from the foreign server. What you describe above seems to introduce exactly that hazard. I agree with you to some extent, BUT: * I don't think it's a good idea to create a tlist for each base/join relation that is deparsed as a subquery, to just avoid that hazard. As I said above, that's nothing but an overhead. * I think we would need to have two different representations for at least base relations; we use fpinfo->attrs_used to deparse a simple foreign table scan query for a base relation, but when deparsing the base relation as a subquery, we would need to use the list of expressions in the base relation's reltarget, to deparse a SELECT clause of the subquery, because we need to deparse a whole-row reference to the base relation as ROW, not all the user columns expanded, as shown in this extracted from the regression tests in the patch: + EXPLAIN (VERBOSE, COSTS OFF) + SELECT t1.c1, ss.a, ss.b FROM (SELECT c1 FROM "S 1"."T 3" WHERE c1 = 50) t1 INNER JOIN (SELECT t2.c1, t3.c1 FROM (SELECT c1 FROM ft4 WHERE c1 between 50 and 60) t2 FULL JOIN (SELECT c1 FROM ft5 WHERE c1 between 50 and 60) t3 ON (t2.c1 = t3.c1) WHERE t2.c1 IS NULL OR t2.c1 IS NOT NULL) ss(a, b) ON (TRUE) ORDER BY t1.c1, ss.a, ss.b FOR UPDATE OF t1; + QUERY PLAN + + LockRows +Output: "T 3".c1, ft4.c1, ft5.c1, "T 3".ctid, ft4.*, ft5.* +-> Nested Loop + Output: "T 3".c1, ft4.c1, ft5.c1, "T 3".ctid, ft4.*, ft5.* + -> Foreign Scan +Output: ft4.c1, ft4.*, ft5.c1, ft5.* +Relations: (public.ft4) FULL JOIN (public.ft5) +Remote SQL: SELECT s8.c1, s8.c2, s9.c1, s9.c2 FROM ((SELECT c1, ROW(c1, c2, c3) FROM "S 1"."T 3" WHERE ((c1 >= 50)) AND ((c1 <= 60))) s8(c1, c2) FULL JOIN (SELECT c1, ROW(c1, c2, c3) FROM "S 1"."T 4" WHERE ((c1 >= 50)) AND ((c1 <= 60))) s9(c1, c2) ON (((s8.c1 = s9.c1 WHERE (((s8.c1 IS NULL) OR (s8.c1 IS NOT NULL))) ORDER BY s8.c1 ASC NULLS LAST, s9.c1 ASC NULLS LAST +-> Hash Full Join + Output: ft4.c1, ft4.*, ft5.c1, ft5.* + Hash Cond: (ft4.c1 = ft5.c1) + Filter: ((ft4.c1 IS NULL) OR (ft4.c1 IS NOT NULL)) + -> Foreign Scan on public.ft4 +Output: ft4.c1, ft4.* +Remote SQL: SELECT c1, c2, c3 FROM "S 1"."T 3" WHERE ((c1 >= 50)) AND ((c1 <= 60)) + -> Hash +Output: ft5.c1, ft5.* +-> Foreign Scan on public.ft5 + Output: ft5.c1, ft5.* + Remote SQL: SELECT c1, c2, c3 FROM "S 1"."T 4" WHERE ((c1 >= 50)) AND ((c1 <= 60)) + -> Materialize +Output: "T 3".c1, "T 3".ctid +-> Seq Scan on "S 1"."T 3" + Output: "T 3".c1,
Re: [HACKERS] Push down more full joins in postgres_fdw
On Mon, Jan 30, 2017 at 5:00 PM, Etsuro Fujitawrote: > On 2017/01/27 21:25, Etsuro Fujita wrote: >> >> On 2017/01/27 20:04, Ashutosh Bapat wrote: >>> >>> I think we should pick up your patch on >>> 27th December, update the comment per your mail on 5th Jan. I will >>> review it once and list down the things left to committer's judgement. > > >> Sorry, I started thinking we went in the wrong direction. I added to >> deparseSelectStmtForRel build_subquery_tlists, which creates a tlist for >> each subquery present in a given join tree prior to deparsing a whole >> remote query. But that's nothing but an overhead; we need to create a >> tlist for the top-level query because we use it as fdw_scan_tlist, but >> not for subqueries, and we need to create retrieved_attrs for the >> top-level query while deparsing the targetlist in >> deparseExplicitTargetList, but not for subqueries. So, we should need >> some work to avoid such a useless overhead. > > > I think we can avoid the useless overhead by adding a new function, > deparseSubqueryTargetList, that deparses expressions in the given relation's > reltarget, not the tlist, as a SELECT clause of the subquery representing > the given relation. That would also allow us to make the 1-to-1 > relationship between the subquery output columns and their aliases more > explicit, which was your original comment. Please find attached the new > version. (The patch doesn't need the patch to avoid duplicate construction > of the tlist, discussed upthread.) I have not looked at the patch, but the reason we use a tlist instead of list of expressions is because fdw_scan_tlist is expected in that form and we don't want two different representations one to deparse SELECT and one to interpret results from the foreign server. What you describe above seems to introduce exactly that hazard. > > Other changes: > * I went back to make_outerrel_subquery and make_innerrel_subquery, which > are flags to indicate whether to deparse the input relations as subqueries. > is_subquery_rel would work well for handling the cases of full joins with > restrictions on the input relations, but I noticed that that wouldn't work > well when extending to handle the cases where we deparse the input relations > as subqueries to evaluate PHVs remotely. I had objected to using a single variable instead of two previously and you had argued against it in [1]. There you had mentioned that PHV doesn't need two variables, but now you are taking the other stand, without any apparent reason. Can you please clarify it? [1] https://www.postgresql.org/message-id/1eb58ee4-8ffa-7c40-1229-c8973e649...@lab.ntt.co.jp -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Push down more full joins in postgres_fdw
On 2017/01/27 21:25, Etsuro Fujita wrote: On 2017/01/27 20:04, Ashutosh Bapat wrote: I think we should pick up your patch on 27th December, update the comment per your mail on 5th Jan. I will review it once and list down the things left to committer's judgement. Sorry, I started thinking we went in the wrong direction. I added to deparseSelectStmtForRel build_subquery_tlists, which creates a tlist for each subquery present in a given join tree prior to deparsing a whole remote query. But that's nothing but an overhead; we need to create a tlist for the top-level query because we use it as fdw_scan_tlist, but not for subqueries, and we need to create retrieved_attrs for the top-level query while deparsing the targetlist in deparseExplicitTargetList, but not for subqueries. So, we should need some work to avoid such a useless overhead. I think we can avoid the useless overhead by adding a new function, deparseSubqueryTargetList, that deparses expressions in the given relation's reltarget, not the tlist, as a SELECT clause of the subquery representing the given relation. That would also allow us to make the 1-to-1 relationship between the subquery output columns and their aliases more explicit, which was your original comment. Please find attached the new version. (The patch doesn't need the patch to avoid duplicate construction of the tlist, discussed upthread.) Other changes: * I went back to make_outerrel_subquery and make_innerrel_subquery, which are flags to indicate whether to deparse the input relations as subqueries. is_subquery_rel would work well for handling the cases of full joins with restrictions on the input relations, but I noticed that that wouldn't work well when extending to handle the cases where we deparse the input relations as subqueries to evaluate PHVs remotely. * Since appendSubqueryAlias in the previous patch is pretty small, I included the code into deparseRangeTblRef. Best regards, Etsuro Fujita *** a/contrib/postgres_fdw/deparse.c --- b/contrib/postgres_fdw/deparse.c *** *** 109,114 typedef struct deparse_expr_cxt --- 109,116 /* Handy macro to add relation name qualification */ #define ADD_REL_QUALIFIER(buf, varno) \ appendStringInfo((buf), "%s%d.", REL_ALIAS_PREFIX, (varno)) + #define SUBQUERY_REL_ALIAS_PREFIX "s" + #define SUBQUERY_COL_ALIAS_PREFIX "c" /* * Functions to determine whether an expression can be evaluated safely on *** *** 132,137 static void deparseTargetList(StringInfo buf, --- 134,140 List **retrieved_attrs); static void deparseExplicitTargetList(List *tlist, List **retrieved_attrs, deparse_expr_cxt *context); + static void deparseSubqueryTargetList(deparse_expr_cxt *context); static void deparseReturningList(StringInfo buf, PlannerInfo *root, Index rtindex, Relation rel, bool trig_after_row, *** *** 159,165 static void printRemoteParam(int paramindex, Oid paramtype, int32 paramtypmod, deparse_expr_cxt *context); static void printRemotePlaceholder(Oid paramtype, int32 paramtypmod, deparse_expr_cxt *context); ! static void deparseSelectSql(List *tlist, List **retrieved_attrs, deparse_expr_cxt *context); static void deparseLockingClause(deparse_expr_cxt *context); static void appendOrderByClause(List *pathkeys, deparse_expr_cxt *context); --- 162,168 deparse_expr_cxt *context); static void printRemotePlaceholder(Oid paramtype, int32 paramtypmod, deparse_expr_cxt *context); ! static void deparseSelectSql(List *tlist, bool is_subquery, List **retrieved_attrs, deparse_expr_cxt *context); static void deparseLockingClause(deparse_expr_cxt *context); static void appendOrderByClause(List *pathkeys, deparse_expr_cxt *context); *** *** 167,172 static void appendConditions(List *exprs, deparse_expr_cxt *context); --- 170,178 static void deparseFromExprForRel(StringInfo buf, PlannerInfo *root, RelOptInfo *joinrel, bool use_alias, List **params_list); static void deparseFromExpr(List *quals, deparse_expr_cxt *context); + static void deparseRangeTblRef(StringInfo buf, PlannerInfo *root, + RelOptInfo *foreignrel, bool make_subquery, + List **params_list); static void deparseAggref(Aggref *node, deparse_expr_cxt *context); static void appendGroupByClause(List *tlist, deparse_expr_cxt *context); static void appendAggOrderBy(List *orderList, List *targetList, *** *** 175,180 static void appendFunctionName(Oid funcid, deparse_expr_cxt *context); --- 181,194 static Node *deparseSortGroupClause(Index ref, List *tlist, deparse_expr_cxt *context); + /* + * Helper functions + */ + static bool is_subquery_var(Var *node, RelOptInfo *foreignrel, + int *relno, int *colno); + static void get_relation_column_alias_ids(Var *node, RelOptInfo *foreignrel, + int *relno,
Re: [HACKERS] Push down more full joins in postgres_fdw
On 2017/01/27 20:04, Ashutosh Bapat wrote: On Fri, Jan 13, 2017 at 12:30 PM, Etsuro Fujitawrote: A more clean way I'm thinking is: (1) in postgresGetForeignJoinPaths(), create a tlist by build_tlist_to_deparse() and save it in fpinfo->tlist before estimate_path_cost_size() if use_remote_estimates=true, (2) in estimate_path_cost_size(), use the fpinfo->tlist if use_remote_estimates=true, and (3) in postgresGetForeignPlan(), use the fpinfo->tlist as the fdw_scan_tlist if use_remote_estimates=true, otherwise create a tlist as the fdw_scan_tlist by build_tlist_to_deparse(), like the attached. What do you think about that? Another change is: I simplified build_tlist_to_deparse() a bit and put that in postgres_fdw.c because that is used only in postgres_fdw.c. I still think we should discuss this separately because this is an existing issue and that makes it easy to review the patch. If the attached is the right way to go, I'll update the join-pushdown patch on top of the patch. I don't think it's right to assume that the targetlist will be available only when use_remote_estimate is true; for grouping it's certainly not. My explanation was not enough. Sorry about that. My proposal described above was for join relations, not upper relations. We build the targetlist of an upper relation during postgresGetForeignUpperPaths, so for grouping I think we should use that targetlist in estimate_path_cost_size and postgresGetForeignPlan. The patch was created that way. But I don't see this discussion getting anywhere. I will leave it to the committer's judgement. I'm fine with that. I think we should pick up your patch on 27th December, update the comment per your mail on 5th Jan. I will review it once and list down the things left to committer's judgement. Sorry, I started thinking we went in the wrong direction. I added to deparseSelectStmtForRel build_subquery_tlists, which creates a tlist for each subquery present in a given join tree prior to deparsing a whole remote query. But that's nothing but an overhead; we need to create a tlist for the top-level query because we use it as fdw_scan_tlist, but not for subqueries, and we need to create retrieved_attrs for the top-level query while deparsing the targetlist in deparseExplicitTargetList, but not for subqueries. So, we should need some work to avoid such a useless overhead. Best regards, Etsuro Fujita -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Push down more full joins in postgres_fdw
On Fri, Jan 13, 2017 at 12:30 PM, Etsuro Fujitawrote: > On 2017/01/12 18:25, Ashutosh Bapat wrote: >> >> On Wed, Jan 11, 2017 at 5:45 PM, Etsuro Fujita >> wrote: > > > On 2017/01/05 21:11, Ashutosh Bapat wrote: >> >> IIUC, for a relation with use_remote_estimates we will deparse the >> query twice and will build the targetlist twice. > > >>> While working on this, I noticed a weird case. Consider: >>> >>> postgres=# explain verbose select 1 from ft1 left join ft2 on (ft1.a = >>> ft2.a) inner join test on (true); >>>QUERY PLAN >>> >>> - >>> Nested Loop (cost=100.00..103.06 rows=1 width=4) >>>Output: 1 >>>-> Foreign Scan (cost=100.00..102.04 rows=1 width=0) >>> Relations: (public.ft1) LEFT JOIN (public.ft2) >>> Remote SQL: SELECT NULL FROM (public.t1 r1 LEFT JOIN public.t2 >>> r2 >>> ON (((r1.a = r2.a >>>-> Seq Scan on public.test (cost=0.00..1.01 rows=1 width=0) >>> Output: test.a, test.b >>> (7 rows) >>> >>> In this case the fpinfo->tlist of the foreign join is NIL, so whether or >>> not >>> the tlist is already built cannot be discriminated by the fpinfo->tlist. >>> We >>> might need another flag to show that the tlist has been built already. >>> Since this is an existing issue and we would need to give careful thought >>> to >>> this, so I'd like to leave this for another patch. > > >> I think in that case, relation's targetlist will also be NIL or >> contain no Var node. It wouldn't be expensive to build it again and >> again. That's better than maintaining a new flag. > > > I think that's ugly. A more clean way I'm thinking is: (1) in > postgresGetForeignJoinPaths(), create a tlist by build_tlist_to_deparse() > and save it in fpinfo->tlist before estimate_path_cost_size() if > use_remote_estimates=true, (2) in estimate_path_cost_size(), use the > fpinfo->tlist if use_remote_estimates=true, and (3) in > postgresGetForeignPlan(), use the fpinfo->tlist as the fdw_scan_tlist if > use_remote_estimates=true, otherwise create a tlist as the fdw_scan_tlist by > build_tlist_to_deparse(), like the attached. > > What do you think about that? > > Another change is: I simplified build_tlist_to_deparse() a bit and put that > in postgres_fdw.c because that is used only in postgres_fdw.c. > > I still think we should discuss this separately because this is an existing > issue and that makes it easy to review the patch. If the attached is the > right way to go, I'll update the join-pushdown patch on top of the patch. I don't think it's right to assume that the targetlist will be available only when use_remote_estimate is true; for grouping it's certainly not. But I don't see this discussion getting anywhere. I will leave it to the committer's judgement. I think we should pick up your patch on 27th December, update the comment per your mail on 5th Jan. I will review it once and list down the things left to committer's judgement. -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Push down more UPDATEs/DELETEs in postgres_fdw
On 2016/11/30 17:29, Etsuro Fujita wrote: On 2016/11/23 20:28, Rushabh Lathia wrote: I wrote: How about inserting that before the out param **retrieved_attrs, like this? static void deparseExplicitTargetList(List *tlist, bool is_returning, List **retrieved_attrs, deparse_expr_cxt *context); Yes, adding it before retrieved_attrs would be good. OK, will do. Done. You wrote: 5) make_explicit_returning_list() pull the var list from the returningList and build the targetentry for the returning list and at the end rewrites the fdw_scan_tlist. AFAIK, in case of DML - which is going to pushdown to the remote server ideally fdw_scan_tlist should be same as returning list, as final output for the query is query will be RETUNING list only. isn't that true? I wrote: That would be true. But the fdw_scan_tlist passed from the core would contain junk columns inserted by the rewriter and planner work, such as CTID for the target table and whole-row Vars for other tables specified in the FROM clause of an UPDATE or the USING clause of a DELETE. So, I created the patch so that the pushed-down UPDATE/DELETE retrieves only the data needed for the RETURNING calculation from the remote server, not the whole fdw_scan_tlist data. Yes, I noticed that fdw_scan_tlist have CTID for the target table and whole-raw vars for other tables specified in the FROM clause of the DML. But I was thinking isn't it possible to create new fdw_scan_tlist once we found that DML is direct pushable to foreign server? I tried quickly doing that - but later its was throwing "variable not found in subplan target list" from set_foreignscan_references(). We could probably avoid that error by replacing the targetlist of the subplan with fdw_scan_tlist, but that wouldn't be enough ... You wrote: If yes, then why can't we directly replace the fdw_scan_tlist with the returning list, rather then make_explicit_returning_list()? I wrote: I think we could do that if we modified the core (maybe the executor part). I'm not sure that's a good idea, though. Yes modifying core is not good idea. But just want to understand why you think that this need need to modify core? Sorry, I don't remember that. Will check. The reason why I think so is that the ModifyTable node on top of the ForeignScan node requires that the targetlist of the ForeignScan has (1) junk whole-row Vars for secondary relations in UPDATE/DELETE and (2) all attributes of the target relation to produce the new tuple for UPDATE. (So, it wouldn't be enough to just replace the ForeignScan's targetlist with fdw_scan_tlist!) For #1, see this (and the following code) in ExecInitModifyTable: /* * If we have any secondary relations in an UPDATE or DELETE, they need to * be treated like non-locked relations in SELECT FOR UPDATE, ie, the * EvalPlanQual mechanism needs to be told about them. Locate the * relevant ExecRowMarks. */ And for #2, see this (and the following code, especially where calling ExecCheckPlanOutput) in the same function: * This section of code is also a convenient place to verify that the * output of an INSERT or UPDATE matches the target table(s). What you proposed would be a good idea because the FDW could calculate the user-query RETURNING list more efficiently in some cases, but I'd like to leave that for future work. Attached is the new version of the patch. I also addressed other comments from you: moved rewriting the fdw_scan_tlist to postgres_fdw.c, added/revised comments, and added regression tests for the case where a pushed down UPDATE/DELETE on a join has RETURNING. My apologies for having been late to work on this. Best regards, Etsuro Fujita *** a/contrib/postgres_fdw/deparse.c --- b/contrib/postgres_fdw/deparse.c *** *** 130,142 static void deparseTargetList(StringInfo buf, Bitmapset *attrs_used, bool qualify_col, List **retrieved_attrs); ! static void deparseExplicitTargetList(List *tlist, List **retrieved_attrs, deparse_expr_cxt *context); static void deparseReturningList(StringInfo buf, PlannerInfo *root, Index rtindex, Relation rel, bool trig_after_row, List *returningList, List **retrieved_attrs); static void deparseColumnRef(StringInfo buf, int varno, int varattno, PlannerInfo *root, bool qualify_col); static void deparseRelation(StringInfo buf, Relation rel); --- 130,151 Bitmapset *attrs_used, bool qualify_col, List **retrieved_attrs); ! static void deparseExplicitTargetList(List *tlist, ! bool is_returning, ! List **retrieved_attrs, deparse_expr_cxt *context);
Re: [HACKERS] Push down more full joins in postgres_fdw
On 2017/01/12 18:25, Ashutosh Bapat wrote: On Wed, Jan 11, 2017 at 5:45 PM, Etsuro Fujitawrote: On 2017/01/05 21:11, Ashutosh Bapat wrote: IIUC, for a relation with use_remote_estimates we will deparse the query twice and will build the targetlist twice. While working on this, I noticed a weird case. Consider: postgres=# explain verbose select 1 from ft1 left join ft2 on (ft1.a = ft2.a) inner join test on (true); QUERY PLAN - Nested Loop (cost=100.00..103.06 rows=1 width=4) Output: 1 -> Foreign Scan (cost=100.00..102.04 rows=1 width=0) Relations: (public.ft1) LEFT JOIN (public.ft2) Remote SQL: SELECT NULL FROM (public.t1 r1 LEFT JOIN public.t2 r2 ON (((r1.a = r2.a -> Seq Scan on public.test (cost=0.00..1.01 rows=1 width=0) Output: test.a, test.b (7 rows) In this case the fpinfo->tlist of the foreign join is NIL, so whether or not the tlist is already built cannot be discriminated by the fpinfo->tlist. We might need another flag to show that the tlist has been built already. Since this is an existing issue and we would need to give careful thought to this, so I'd like to leave this for another patch. I think in that case, relation's targetlist will also be NIL or contain no Var node. It wouldn't be expensive to build it again and again. That's better than maintaining a new flag. I think that's ugly. A more clean way I'm thinking is: (1) in postgresGetForeignJoinPaths(), create a tlist by build_tlist_to_deparse() and save it in fpinfo->tlist before estimate_path_cost_size() if use_remote_estimates=true, (2) in estimate_path_cost_size(), use the fpinfo->tlist if use_remote_estimates=true, and (3) in postgresGetForeignPlan(), use the fpinfo->tlist as the fdw_scan_tlist if use_remote_estimates=true, otherwise create a tlist as the fdw_scan_tlist by build_tlist_to_deparse(), like the attached. What do you think about that? Another change is: I simplified build_tlist_to_deparse() a bit and put that in postgres_fdw.c because that is used only in postgres_fdw.c. I still think we should discuss this separately because this is an existing issue and that makes it easy to review the patch. If the attached is the right way to go, I'll update the join-pushdown patch on top of the patch. Best regards, Etsuro Fujita *** a/contrib/postgres_fdw/deparse.c --- b/contrib/postgres_fdw/deparse.c *** *** 843,883 deparse_type_name(Oid type_oid, int32 typemod) } /* - * Build the targetlist for given relation to be deparsed as SELECT clause. - * - * The output targetlist contains the columns that need to be fetched from the - * foreign server for the given relation. If foreignrel is an upper relation, - * then the output targetlist can also contain expressions to be evaluated on - * foreign server. - */ - List * - build_tlist_to_deparse(RelOptInfo *foreignrel) - { - List *tlist = NIL; - PgFdwRelationInfo *fpinfo = (PgFdwRelationInfo *) foreignrel->fdw_private; - - /* - * For an upper relation, we have already built the target list while - * checking shippability, so just return that. - */ - if (foreignrel->reloptkind == RELOPT_UPPER_REL) - return fpinfo->grouped_tlist; - - /* - * We require columns specified in foreignrel->reltarget->exprs and those - * required for evaluating the local conditions. - */ - tlist = add_to_flat_tlist(tlist, - pull_var_clause((Node *) foreignrel->reltarget->exprs, - PVC_RECURSE_PLACEHOLDERS)); - tlist = add_to_flat_tlist(tlist, - pull_var_clause((Node *) fpinfo->local_conds, - PVC_RECURSE_PLACEHOLDERS)); - - return tlist; - } - - /* * Deparse SELECT statement for given relation into buf. * * tlist contains the list of desired columns to be fetched from foreign server. --- 843,848 *** a/contrib/postgres_fdw/postgres_fdw.c --- b/contrib/postgres_fdw/postgres_fdw.c *** *** 406,411 static void conversion_error_callback(void *arg); --- 406,412 static bool foreign_join_ok(PlannerInfo *root, RelOptInfo *joinrel, JoinType jointype, RelOptInfo *outerrel, RelOptInfo *innerrel, JoinPathExtraData *extra); + static List *build_tlist_to_deparse(RelOptInfo *foreignrel); static bool foreign_grouping_ok(PlannerInfo *root, RelOptInfo *grouped_rel); static List *get_useful_pathkeys_for_relation(PlannerInfo *root, RelOptInfo *rel); *** *** 1190,1197 postgresGetForeignPlan(PlannerInfo *root, remote_conds = fpinfo->remote_conds; local_exprs = fpinfo->local_conds; ! /* Build the list of columns to be fetched from the foreign server. */ ! fdw_scan_tlist = build_tlist_to_deparse(foreignrel); /* * Ensure that the outer plan produces a tuple whose descriptor --- 1191,1209
Re: [HACKERS] Push down more full joins in postgres_fdw
On Wed, Jan 11, 2017 at 5:45 PM, Etsuro Fujitawrote: > On 2017/01/05 21:38, Ashutosh Bapat wrote: >> >> On Thu, Jan 5, 2017 at 5:51 PM, Etsuro Fujita >> wrote: >>> >>> On 2017/01/05 21:11, Ashutosh Bapat wrote: > > > On 2017/01/03 17:28, Ashutosh Bapat wrote: >> >> Also, in this function, if fpinfo->tlist is already set, why do we >> want >> to >> build it again? > > IIUC, for a relation with use_remote_estimates we will deparse the query twice and will build the targetlist twice. > > >>> That's right. We could avoid the duplicate work the way you proposed, >>> but I >>> was thinking to leave that for another patch. Should we do that in this >>> patch? > > >> If you are agree that the change is needed, it's better to do it in >> this patch itself if we can, instead of a one liner patch. > > > While working on this, I noticed a weird case. Consider: > > postgres=# explain verbose select 1 from ft1 left join ft2 on (ft1.a = > ft2.a) inner join test on (true); >QUERY PLAN > - > Nested Loop (cost=100.00..103.06 rows=1 width=4) >Output: 1 >-> Foreign Scan (cost=100.00..102.04 rows=1 width=0) > Relations: (public.ft1) LEFT JOIN (public.ft2) > Remote SQL: SELECT NULL FROM (public.t1 r1 LEFT JOIN public.t2 r2 > ON (((r1.a = r2.a >-> Seq Scan on public.test (cost=0.00..1.01 rows=1 width=0) > Output: test.a, test.b > (7 rows) > > In this case the fpinfo->tlist of the foreign join is NIL, so whether or not > the tlist is already built cannot be discriminated by the fpinfo->tlist. We > might need another flag to show that the tlist has been built already. > Since this is an existing issue and we would need to give careful thought to > this, so I'd like to leave this for another patch. I think in that case, relation's targetlist will also be NIL or contain no Var node. It wouldn't be expensive to build it again and again. That's better than maintaining a new flag. This is just a suggestion, for an additional check, you might want to check rel->reltarget->exprs for NIL. But I think we don't need it. -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Push down more full joins in postgres_fdw
On 2017/01/05 21:38, Ashutosh Bapat wrote: On Thu, Jan 5, 2017 at 5:51 PM, Etsuro Fujitawrote: On 2017/01/05 21:11, Ashutosh Bapat wrote: On 2017/01/03 17:28, Ashutosh Bapat wrote: Also, in this function, if fpinfo->tlist is already set, why do we want to build it again? IIUC, for a relation with use_remote_estimates we will deparse the query twice and will build the targetlist twice. That's right. We could avoid the duplicate work the way you proposed, but I was thinking to leave that for another patch. Should we do that in this patch? If you are agree that the change is needed, it's better to do it in this patch itself if we can, instead of a one liner patch. While working on this, I noticed a weird case. Consider: postgres=# explain verbose select 1 from ft1 left join ft2 on (ft1.a = ft2.a) inner join test on (true); QUERY PLAN - Nested Loop (cost=100.00..103.06 rows=1 width=4) Output: 1 -> Foreign Scan (cost=100.00..102.04 rows=1 width=0) Relations: (public.ft1) LEFT JOIN (public.ft2) Remote SQL: SELECT NULL FROM (public.t1 r1 LEFT JOIN public.t2 r2 ON (((r1.a = r2.a -> Seq Scan on public.test (cost=0.00..1.01 rows=1 width=0) Output: test.a, test.b (7 rows) In this case the fpinfo->tlist of the foreign join is NIL, so whether or not the tlist is already built cannot be discriminated by the fpinfo->tlist. We might need another flag to show that the tlist has been built already. Since this is an existing issue and we would need to give careful thought to this, so I'd like to leave this for another patch. Best regards, Etsuro Fujita -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Push down more full joins in postgres_fdw
On Thu, Jan 5, 2017 at 5:51 PM, Etsuro Fujitawrote: > On 2017/01/05 21:11, Ashutosh Bapat wrote: >> >> On Thu, Jan 5, 2017 at 5:14 PM, Etsuro Fujita >> wrote: >>> >>> On 2017/01/03 17:28, Ashutosh Bapat wrote: In build_subquery_tlists(), why don't we handle base relations? + if (foreignrel->reloptkind != RELOPT_JOINREL) + return; > > >>> The reason for that is we don't need to handle the baserel cases; the >>> tlist >>> for a base relation, if needed, would be created while recursing into a >>> join >>> relation that joins the base relation to other base/join relation. > > >> Right. Sorry, I misunderstood the code. May be a comment would help. > > > Will add the comment. > Also, in this function, if fpinfo->tlist is already set, why do we want to build it again? > > >>> When this function gets called, fpinfo->tlist isn't set for any base or >>> join >>> relation that needs to build the tlist, so we always need to build it for >>> each such relation. > > >> IIUC, for a relation with use_remote_estimates we will deparse the >> query twice and will build the targetlist twice. > > > That's right. We could avoid the duplicate work the way you proposed, but I > was thinking to leave that for another patch. Should we do that in this > patch? If you are agree that the change is needed, it's better to do it in this patch itself if we can, instead of a one liner patch. > In build_tlist_to_deparse(), if fpinfo->tlist for the given relation is set, we should just return it rather than constructing it again. > > >>> In that function we wouldn't have such cases for base or join relations >>> needing the tlist. > > >> Same explanation as above. > > > Will revise if it's better to do that in this patch. Thanks. -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Push down more full joins in postgres_fdw
On 2017/01/05 21:11, Ashutosh Bapat wrote: On Thu, Jan 5, 2017 at 5:14 PM, Etsuro Fujitawrote: On 2017/01/03 17:28, Ashutosh Bapat wrote: In build_subquery_tlists(), why don't we handle base relations? + if (foreignrel->reloptkind != RELOPT_JOINREL) + return; The reason for that is we don't need to handle the baserel cases; the tlist for a base relation, if needed, would be created while recursing into a join relation that joins the base relation to other base/join relation. Right. Sorry, I misunderstood the code. May be a comment would help. Will add the comment. Also, in this function, if fpinfo->tlist is already set, why do we want to build it again? When this function gets called, fpinfo->tlist isn't set for any base or join relation that needs to build the tlist, so we always need to build it for each such relation. IIUC, for a relation with use_remote_estimates we will deparse the query twice and will build the targetlist twice. That's right. We could avoid the duplicate work the way you proposed, but I was thinking to leave that for another patch. Should we do that in this patch? In build_tlist_to_deparse(), if fpinfo->tlist for the given relation is set, we should just return it rather than constructing it again. In that function we wouldn't have such cases for base or join relations needing the tlist. Same explanation as above. Will revise if it's better to do that in this patch. Best regards, Etsuro Fujita -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Push down more full joins in postgres_fdw
On Thu, Jan 5, 2017 at 5:14 PM, Etsuro Fujitawrote: > On 2017/01/03 17:28, Ashutosh Bapat wrote: > > I wrote: >>> >>> I updated the patch a bit further: simplified the function name >>> (s/build_subquery_rel_tlists/build_subquery_tlists/), and revised >>> comments a >>> little bit. Attached is an updated version >>> (postgres-fdw-subquery-support-v14.patch). > > >> Few comments > > > Thanks for the comments! > >> In build_subquery_tlists(), why don't we handle base relations? >> + if (foreignrel->reloptkind != RELOPT_JOINREL) >> + return; > > > The reason for that is we don't need to handle the baserel cases; the tlist > for a base relation, if needed, would be created while recursing into a join > relation that joins the base relation to other base/join relation. Right. Sorry, I misunderstood the code. May be a comment would help. > >> Also, in this function, if fpinfo->tlist is already set, why do we want to >> build it again? > > > When this function gets called, fpinfo->tlist isn't set for any base or join > relation that needs to build the tlist, so we always need to build it for > each such relation. IIUC, for a relation with use_remote_estimates we will deparse the query twice and will build the targetlist twice. > >> In build_tlist_to_deparse(), if fpinfo->tlist for the given relation is >> set, we >> should just return it rather than constructing it again. > > > In that function we wouldn't have such cases for base or join relations > needing the tlist. Same explanation as above. -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Push down more full joins in postgres_fdw
On 2017/01/03 17:28, Ashutosh Bapat wrote: I wrote: I updated the patch a bit further: simplified the function name (s/build_subquery_rel_tlists/build_subquery_tlists/), and revised comments a little bit. Attached is an updated version (postgres-fdw-subquery-support-v14.patch). Few comments Thanks for the comments! In build_subquery_tlists(), why don't we handle base relations? + if (foreignrel->reloptkind != RELOPT_JOINREL) + return; The reason for that is we don't need to handle the baserel cases; the tlist for a base relation, if needed, would be created while recursing into a join relation that joins the base relation to other base/join relation. Also, in this function, if fpinfo->tlist is already set, why do we want to build it again? When this function gets called, fpinfo->tlist isn't set for any base or join relation that needs to build the tlist, so we always need to build it for each such relation. In build_tlist_to_deparse(), if fpinfo->tlist for the given relation is set, we should just return it rather than constructing it again. In that function we wouldn't have such cases for base or join relations needing the tlist. Best regards, Etsuro Fujita -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Push down more full joins in postgres_fdw
> > I updated the patch a bit further: simplified the function name > (s/build_subquery_rel_tlists/build_subquery_tlists/), and revised comments a > little bit. Attached is an updated version > (postgres-fdw-subquery-support-v14.patch). And I rebased another patch for > PHVs against that patch, which is also attached > (postgres-fdw-phv-pushdown-v14.patch). > Few comments In build_subquery_tlists(), why don't we handle base relations? + if (foreignrel->reloptkind != RELOPT_JOINREL) + return; Also, in this function, if fpinfo->tlist is already set, why do we want to build it again? In build_tlist_to_deparse(), if fpinfo->tlist for the given relation is set, we should just return it rather than constructing it again. -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Push down more full joins in postgres_fdw
On 2016/12/08 21:08, Etsuro Fujita wrote: On 2016/12/07 20:23, Etsuro Fujita wrote: My proposal here would be a bit different from what you proposed; right before deparseSelectSql in deparseSelectStmtForRel, build the tlists for relations present in the given jointree that will be deparsed as subqueries, by (1) traversing the given jointree and (2) applying build_tlist_to_deparse to each relation to be deparsed as a subquery and saving the result in that relation's fpinfo. I think that would be implemented easily, and the overhead would be small. I implemented that to address your concern: * deparseRangeTblRef uses the tlist created as above, to build the SELECT clause of the subquery. (You said "Then in deparseRangeTblRef() assert that there's a tlist in fpinfo", but the tlist may be empty, so I didn't add any assertion to that function.) * get_relation_column_alias_ids uses tlist_member with the tlist. I also addressed the comments #1, #2 and #3 in [1]. For #1, I added "LIMIT 10" to the query. Attached is the new version of the patch. Other changes: * As discussed before, renamed grouped_tlist in fpinfo to "tlist" and used it to store the tlist created as above. * Also, renamed SS_REL_ALIAS_PREFIX to SUBQUERY_REL_ALIAS_PREFIX (Likewise for SS_COL_ALIAS_PREFIX). * Relocated some functions. * Revised comments a little bit. I updated the patch a bit further: simplified the function name (s/build_subquery_rel_tlists/build_subquery_tlists/), and revised comments a little bit. Attached is an updated version (postgres-fdw-subquery-support-v14.patch). And I rebased another patch for PHVs against that patch, which is also attached (postgres-fdw-phv-pushdown-v14.patch). Best regards, Etsuro Fujita *** a/contrib/postgres_fdw/deparse.c --- b/contrib/postgres_fdw/deparse.c *** *** 109,114 typedef struct deparse_expr_cxt --- 109,116 /* Handy macro to add relation name qualification */ #define ADD_REL_QUALIFIER(buf, varno) \ appendStringInfo((buf), "%s%d.", REL_ALIAS_PREFIX, (varno)) + #define SUBQUERY_REL_ALIAS_PREFIX "s" + #define SUBQUERY_COL_ALIAS_PREFIX "c" /* * Functions to determine whether an expression can be evaluated safely on *** *** 159,165 static void printRemoteParam(int paramindex, Oid paramtype, int32 paramtypmod, deparse_expr_cxt *context); static void printRemotePlaceholder(Oid paramtype, int32 paramtypmod, deparse_expr_cxt *context); ! static void deparseSelectSql(List *tlist, List **retrieved_attrs, deparse_expr_cxt *context); static void deparseLockingClause(deparse_expr_cxt *context); static void appendOrderByClause(List *pathkeys, deparse_expr_cxt *context); --- 161,167 deparse_expr_cxt *context); static void printRemotePlaceholder(Oid paramtype, int32 paramtypmod, deparse_expr_cxt *context); ! static void deparseSelectSql(List *tlist, bool is_subquery, List **retrieved_attrs, deparse_expr_cxt *context); static void deparseLockingClause(deparse_expr_cxt *context); static void appendOrderByClause(List *pathkeys, deparse_expr_cxt *context); *** *** 167,172 static void appendConditions(List *exprs, deparse_expr_cxt *context); --- 169,177 static void deparseFromExprForRel(StringInfo buf, PlannerInfo *root, RelOptInfo *joinrel, bool use_alias, List **params_list); static void deparseFromExpr(List *quals, deparse_expr_cxt *context); + static void deparseRangeTblRef(StringInfo buf, PlannerInfo *root, + RelOptInfo *foreignrel, List **params_list); + static void appendSubqueryAlias(StringInfo buf, int relno, int ncols); static void deparseAggref(Aggref *node, deparse_expr_cxt *context); static void appendGroupByClause(List *tlist, deparse_expr_cxt *context); static void appendAggOrderBy(List *orderList, List *targetList, *** *** 175,180 static void appendFunctionName(Oid funcid, deparse_expr_cxt *context); --- 180,194 static Node *deparseSortGroupClause(Index ref, List *tlist, deparse_expr_cxt *context); + /* + * Helper functions + */ + static void build_subquery_tlists(RelOptInfo *foreignrel); + static bool is_subquery_var(Var *node, RelOptInfo *foreignrel, + int *relno, int *colno); + static void get_relation_column_alias_ids(Var *node, RelOptInfo *foreignrel, + int *relno, int *colno); + /* * Examine each qual clause in input_conds, and classify them into two groups, *** *** 861,867 build_tlist_to_deparse(RelOptInfo *foreignrel) * checking shippability, so just return that. */ if (foreignrel->reloptkind == RELOPT_UPPER_REL) ! return fpinfo->grouped_tlist; /* * We require columns specified in foreignrel->reltarget->exprs and those --- 875,881 * checking shippability, so just return that. */ if (foreignrel->reloptkind == RELOPT_UPPER_REL) ! return fpinfo->tlist; /* * We require
Re: [HACKERS] Push down more full joins in postgres_fdw
On 2016/12/07 20:23, Etsuro Fujita wrote: On 2016/12/07 15:39, Ashutosh Bapat wrote: On 2016/11/22 18:28, Ashutosh Bapat wrote: I guess, the reason why you are doing it this way, is SELECT clause for the outermost query gets deparsed before FROM clause. For later we call deparseRangeTblRef(), which builds the tlist. So, while deparsing SELECT clause, we do not have tlist to build from. In that case, I guess, we have to build the tlist in get_subselect_alias_id() if it's not available and stick it in fpinfo. Subsequent calls to get_subselect_alias_id() should find tlist there. Then in deparseRangeTblRef() assert that there's a tlist in fpinfo and use it to build the SELECT clause of subquery. That way, we don't build tlist unless it's needed and also use the same tlist for all searches. Please use tlist_member() to search into the tlist. I wrote: This would probably work, but seems to me a bit complicated. Instead, I'd like to propose that we build the tlist for each relation being deparsed as a subquery in a given join tree, right before deparsing the SELECT clause in deparseSelectStmtForRel, if is_subquery is false and lower_subquery_rels isn't NULL, and store the tlist into the relation's fpinfo. That would allow us to build the tlist only when we need it, and to use tlist_member for the exact comparison. I think it would be much easier to implement that. IIRC, this is inline with my original proposal of creating tlists before deparsing anything. Along-with that I also proposed to bubble this array of tlist up the join hierarchy to avoid recursion [1] point #15, further elaborated in [2] [1] https://www.postgresql.org/message-id/ad449b25-66ee-4c06-568c-0eb2e1bef9f9%40lab.ntt.co.jp [2] https://www.postgresql.org/message-id/CAFjFpRcn7%3DDNOXm-PJ_jVDqAmghKVf6tApQC%2B_RgMZ8tOPExcA%40mail.gmail.com My proposal here would be a bit different from what you proposed; right before deparseSelectSql in deparseSelectStmtForRel, build the tlists for relations present in the given jointree that will be deparsed as subqueries, by (1) traversing the given jointree and (2) applying build_tlist_to_deparse to each relation to be deparsed as a subquery and saving the result in that relation's fpinfo. I think that would be implemented easily, and the overhead would be small. I implemented that to address your concern: * deparseRangeTblRef uses the tlist created as above, to build the SELECT clause of the subquery. (You said "Then in deparseRangeTblRef() assert that there's a tlist in fpinfo", but the tlist may be empty, so I didn't add any assertion to that function.) * get_relation_column_alias_ids uses tlist_member with the tlist. I also addressed the comments #1, #2 and #3 in [1]. For #1, I added "LIMIT 10" to the query. Attached is the new version of the patch. Other changes: * As discussed before, renamed grouped_tlist in fpinfo to "tlist" and used it to store the tlist created as above. * Also, renamed SS_REL_ALIAS_PREFIX to SUBQUERY_REL_ALIAS_PREFIX (Likewise for SS_COL_ALIAS_PREFIX). * Relocated some functions. * Revised comments a little bit. Best regards, Etsuro Fujita [1] https://www.postgresql.org/message-id/CAFjFpRfU4-gxqZ8ahoKM1ZtDJEThe3K8Lb_6beRKa8mmP%3Dv%3DfA%40mail.gmail.com *** a/contrib/postgres_fdw/deparse.c --- b/contrib/postgres_fdw/deparse.c *** *** 109,114 typedef struct deparse_expr_cxt --- 109,116 /* Handy macro to add relation name qualification */ #define ADD_REL_QUALIFIER(buf, varno) \ appendStringInfo((buf), "%s%d.", REL_ALIAS_PREFIX, (varno)) + #define SUBQUERY_REL_ALIAS_PREFIX "s" + #define SUBQUERY_COL_ALIAS_PREFIX "c" /* * Functions to determine whether an expression can be evaluated safely on *** *** 159,165 static void printRemoteParam(int paramindex, Oid paramtype, int32 paramtypmod, deparse_expr_cxt *context); static void printRemotePlaceholder(Oid paramtype, int32 paramtypmod, deparse_expr_cxt *context); ! static void deparseSelectSql(List *tlist, List **retrieved_attrs, deparse_expr_cxt *context); static void deparseLockingClause(deparse_expr_cxt *context); static void appendOrderByClause(List *pathkeys, deparse_expr_cxt *context); --- 161,167 deparse_expr_cxt *context); static void printRemotePlaceholder(Oid paramtype, int32 paramtypmod, deparse_expr_cxt *context); ! static void deparseSelectSql(List *tlist, bool is_subquery, List **retrieved_attrs, deparse_expr_cxt *context); static void deparseLockingClause(deparse_expr_cxt *context); static void appendOrderByClause(List *pathkeys, deparse_expr_cxt *context); *** *** 167,172 static void appendConditions(List *exprs, deparse_expr_cxt *context); --- 169,177 static void deparseFromExprForRel(StringInfo buf, PlannerInfo *root, RelOptInfo *joinrel, bool use_alias, List **params_list); static void deparseFromExpr(List *quals, deparse_expr_cxt
Re: [HACKERS] Push down more full joins in postgres_fdw
On 2016/12/07 21:11, Ashutosh Bapat wrote: On Wed, Dec 7, 2016 at 4:51 PM, Etsuro Fujitawrote: On 2016/12/05 20:20, Ashutosh Bapat wrote: 3. Why is foreignrel variable changed to rel? -extern void deparseSelectStmtForRel(StringInfo buf, PlannerInfo *root, -RelOptInfo *foreignrel, List *tlist, -List *remote_conds, List *pathkeys, -List **retrieved_attrs, List **params_list); +extern void deparseSelectStmtForRel(StringInfo buf, PlannerInfo *root, RelOptInfo *rel, +List *tlist, List *remote_conds, List *pathkeys, +bool is_subquery, List **retrieved_attrs, +List **params_list); I changed the name that way to match with the function definition in deparse.c. That's something good to do but it's unrelated to this patch. I have repeated this line several times in this thread :) OK. Best regards, Etsuro Fujita -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Push down more full joins in postgres_fdw
On Wed, Dec 7, 2016 at 4:51 PM, Etsuro Fujitawrote: > On 2016/12/05 20:20, Ashutosh Bapat wrote: >> >> On Mon, Nov 28, 2016 at 3:52 PM, Etsuro Fujita >> wrote: >>> >>> On 2016/11/24 21:45, Etsuro Fujita wrote: * I removed make_outerrel_subquery/make_innerrel_subquery from fpinfo and added a new member "is_subquery_rel" to fpinfo, as a flag on whether to deparse the relation as a subquery. > > >> Replacing make_outerrel_subquery/make_innerrel_subquery with >> is_subquery_rel >> seems to be a bad idea. Whether a relation needs to be deparsed as a >> subquery >> or not depends upon the relation which joins it with another relation. >> Take for >> example a case when a join ABC can pull up the conditions on AB, but ABD >> can >> not. In this case we will mark that AB in ABD needs to be deparsed as a >> subquery but will not mark so in ABC. So, if we choose to join ABCD as >> (ABC)D >> we don't need AB to be deparsed as a subquery. If we choose to join ABCD >> as >> (ABD)C, we will need AB to deparsed as a subquery. > > > This is not true; since (1) currently, a relation needs to be deparsed as a > subquery only when the relation is full-joined with any other relation, and > (2) the join ordering for full joins is preserved, we wouldn't have such an > example. (You might think we would have that when extending the patch to > handle PHVs, but the answer would be no, because the patch for PHVs would > determine whether a relation emitting PHVs needs to be deparsed as a > subquery, depending on the relation itself. See the patch for PHVs.) I like > is_subquery_rel more than make_outerrel_subquery/make_innerrel_subquery, > because it reduces the number of members in fpinfo. But the choice would be > arbitrary, so I wouldn't object to going back to > make_outerrel_subquery/make_innerrel_subquery. I think you are right. And even if we were to deparse a relation as subquery, when it shouldn't be, we will only loose a syntactic optimization. So, it shouldn't result in a bug. >> 3. Why is foreignrel variable changed to rel? >> -extern void deparseSelectStmtForRel(StringInfo buf, PlannerInfo *root, >> -RelOptInfo *foreignrel, List *tlist, >> -List *remote_conds, List *pathkeys, >> -List **retrieved_attrs, List **params_list); >> +extern void deparseSelectStmtForRel(StringInfo buf, PlannerInfo >> *root, RelOptInfo *rel, >> +List *tlist, List *remote_conds, List *pathkeys, >> +bool is_subquery, List **retrieved_attrs, >> +List **params_list); > > > I changed the name that way to match with the function definition in > deparse.c. > That's something good to do but it's unrelated to this patch. I have repeated this line several times in this thread :) -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Push down more full joins in postgres_fdw
On 2016/12/07 15:39, Ashutosh Bapat wrote: On 2016/11/22 18:28, Ashutosh Bapat wrote: I guess, the reason why you are doing it this way, is SELECT clause for the outermost query gets deparsed before FROM clause. For later we call deparseRangeTblRef(), which builds the tlist. So, while deparsing SELECT clause, we do not have tlist to build from. In that case, I guess, we have to build the tlist in get_subselect_alias_id() if it's not available and stick it in fpinfo. Subsequent calls to get_subselect_alias_id() should find tlist there. Then in deparseRangeTblRef() assert that there's a tlist in fpinfo and use it to build the SELECT clause of subquery. That way, we don't build tlist unless it's needed and also use the same tlist for all searches. Please use tlist_member() to search into the tlist. I wrote: This would probably work, but seems to me a bit complicated. Instead, I'd like to propose that we build the tlist for each relation being deparsed as a subquery in a given join tree, right before deparsing the SELECT clause in deparseSelectStmtForRel, if is_subquery is false and lower_subquery_rels isn't NULL, and store the tlist into the relation's fpinfo. That would allow us to build the tlist only when we need it, and to use tlist_member for the exact comparison. I think it would be much easier to implement that. IIRC, this is inline with my original proposal of creating tlists before deparsing anything. Along-with that I also proposed to bubble this array of tlist up the join hierarchy to avoid recursion [1] point #15, further elaborated in [2] [1] https://www.postgresql.org/message-id/ad449b25-66ee-4c06-568c-0eb2e1bef9f9%40lab.ntt.co.jp [2] https://www.postgresql.org/message-id/CAFjFpRcn7%3DDNOXm-PJ_jVDqAmghKVf6tApQC%2B_RgMZ8tOPExcA%40mail.gmail.com We seem to be moving towards that solution, but as discussed we have left it to committer's judgement. My proposal here would be a bit different from what you proposed; right before deparseSelectSql in deparseSelectStmtForRel, build the tlists for relations present in the given jointree that will be deparsed as subqueries, by (1) traversing the given jointree and (2) applying build_tlist_to_deparse to each relation to be deparsed as a subquery and saving the result in that relation's fpinfo. I think that would be implemented easily, and the overhead would be small. Best regards, Etsuro Fujita -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Push down more full joins in postgres_fdw
On 2016/12/05 20:20, Ashutosh Bapat wrote: On Mon, Nov 28, 2016 at 3:52 PM, Etsuro Fujitawrote: On 2016/11/24 21:45, Etsuro Fujita wrote: * I removed make_outerrel_subquery/make_innerrel_subquery from fpinfo and added a new member "is_subquery_rel" to fpinfo, as a flag on whether to deparse the relation as a subquery. Replacing make_outerrel_subquery/make_innerrel_subquery with is_subquery_rel seems to be a bad idea. Whether a relation needs to be deparsed as a subquery or not depends upon the relation which joins it with another relation. Take for example a case when a join ABC can pull up the conditions on AB, but ABD can not. In this case we will mark that AB in ABD needs to be deparsed as a subquery but will not mark so in ABC. So, if we choose to join ABCD as (ABC)D we don't need AB to be deparsed as a subquery. If we choose to join ABCD as (ABD)C, we will need AB to deparsed as a subquery. This is not true; since (1) currently, a relation needs to be deparsed as a subquery only when the relation is full-joined with any other relation, and (2) the join ordering for full joins is preserved, we wouldn't have such an example. (You might think we would have that when extending the patch to handle PHVs, but the answer would be no, because the patch for PHVs would determine whether a relation emitting PHVs needs to be deparsed as a subquery, depending on the relation itself. See the patch for PHVs.) I like is_subquery_rel more than make_outerrel_subquery/make_innerrel_subquery, because it reduces the number of members in fpinfo. But the choice would be arbitrary, so I wouldn't object to going back to make_outerrel_subquery/make_innerrel_subquery. Some more comments 1. The output of the following query +SELECT 1 FROM (SELECT c1 FROM ft4 WHERE c1 between 50 and 60) t1 FULL JOIN (SELECT c1 FROM ft5 WHERE c1 between 50 and 60) t2 ON (TRUE); produces 21 rows all containing a "1". That's not quite good use of expected output space, given that all that the test tests is the empty targetlists are deparsed correctly. You might want to either just test EXPLAIN output or make "between 50 and 60" condition more stringent to reduce the number of rows output. Will do. 2. Got lost here. I guess, all you need to say is "test deparsing FOR UPDATE clause with subqueries.". Anyway, the sentence is very hard to read and needs simplification. +-- d. test deparsing the remote queries for simple foreign table scans in +-- an EPQ subplan of a foreign join in which the foreign tables are full +-- joined and in the remote join query the foreign tables are deparsed as +-- subqueries Will do. 3. Why is foreignrel variable changed to rel? -extern void deparseSelectStmtForRel(StringInfo buf, PlannerInfo *root, -RelOptInfo *foreignrel, List *tlist, -List *remote_conds, List *pathkeys, -List **retrieved_attrs, List **params_list); +extern void deparseSelectStmtForRel(StringInfo buf, PlannerInfo *root, RelOptInfo *rel, +List *tlist, List *remote_conds, List *pathkeys, +bool is_subquery, List **retrieved_attrs, +List **params_list); I changed the name that way to match with the function definition in deparse.c. Thanks for the comments! Best regards, Etsuro Fujita -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Push down more full joins in postgres_fdw
> > > Ashutosh proposed this to do the comparison: > > On 2016/11/22 18:28, Ashutosh Bapat wrote: >> >> I guess, the reason why you are doing it this way, is SELECT clause for >> the >> outermost query gets deparsed before FROM clause. For later we call >> deparseRangeTblRef(), which builds the tlist. So, while deparsing SELECT >> clause, we do not have tlist to build from. In that case, I guess, we have >> to >> build the tlist in get_subselect_alias_id() if it's not available and >> stick it >> in fpinfo. Subsequent calls to get_subselect_alias_id() should find tlist >> there. Then in deparseRangeTblRef() assert that there's a tlist in fpinfo >> and use it to build the SELECT clause of subquery. That way, we don't >> build >> tlist unless it's needed and also use the same tlist for all searches. >> Please >> use tlist_member() to search into the tlist. > > > This would probably work, but seems to me a bit complicated. Instead, I'd > like to propose that we build the tlist for each relation being deparsed as > a subquery in a given join tree, right before deparsing the SELECT clause in > deparseSelectStmtForRel, if is_subquery is false and lower_subquery_rels > isn't NULL, and store the tlist into the relation's fpinfo. That would > allow us to build the tlist only when we need it, and to use tlist_member > for the exact comparison. I think it would be much easier to implement > that. > IIRC, this is inline with my original proposal of creating tlists before deparsing anything. Along-with that I also proposed to bubble this array of tlist up the join hierarchy to avoid recursion [1] point #15, further elaborated in [2] [1] https://www.postgresql.org/message-id/ad449b25-66ee-4c06-568c-0eb2e1bef9f9%40lab.ntt.co.jp [2] https://www.postgresql.org/message-id/CAFjFpRcn7%3DDNOXm-PJ_jVDqAmghKVf6tApQC%2B_RgMZ8tOPExcA%40mail.gmail.com We seem to be moving towards that solution, but as discussed we have left it to committer's judgement. -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Push down more full joins in postgres_fdw
On Wed, Dec 7, 2016 at 1:57 AM, Robert Haaswrote: > On Mon, Dec 5, 2016 at 6:20 AM, Ashutosh Bapat > wrote: >> 4. I am still not happy with this change >> +/* >> + * Since (1) the expressions in foreignrel's reltarget doesn't >> contain >> + * any PHVs and (2) foreignrel's local_conds is empty, the tlist >> + * created by build_tlist_to_deparse must be one-to-one with the >> + * expressions. >> + */ >> +Assert(list_length(tlist) == >> list_length(foreignrel->reltarget->exprs)); >> the assertion only checks that the number of elements in both the lists are >> same but does not check whether those lists are same i.e. they contain the >> same >> elements in the same order. This equality is crucial to deparsing logic. If >> somehow build_tlist_to_deparse() breaks that assumption in future, we have no >> way to detect it, unless a regression test fails. > > If there's an easy way to do a more exact comparison, great. But if > we can't get an awesome Assert(), a helpful Assert() is still better > than a kick in the head. The assert is not a problem in itself, but the reason we have to add the assert. The problem is explained in [1], point #9. [1] https://www.postgresql.org/message-id/CAFjFpRfwoSsJr9418b2jA7g0nwagjZSWhPQPUmK9M6z5XSOAqQ%40mail.gmail.com -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Push down more full joins in postgres_fdw
On 2016/12/07 5:27, Robert Haas wrote: On Mon, Dec 5, 2016 at 6:20 AM, Ashutosh Bapatwrote: 4. I am still not happy with this change +/* + * Since (1) the expressions in foreignrel's reltarget doesn't contain + * any PHVs and (2) foreignrel's local_conds is empty, the tlist + * created by build_tlist_to_deparse must be one-to-one with the + * expressions. + */ +Assert(list_length(tlist) == list_length(foreignrel->reltarget->exprs)); the assertion only checks that the number of elements in both the lists are same but does not check whether those lists are same i.e. they contain the same elements in the same order. This equality is crucial to deparsing logic. If somehow build_tlist_to_deparse() breaks that assumption in future, we have no way to detect it, unless a regression test fails. If there's an easy way to do a more exact comparison, great. Ashutosh proposed this to do the comparison: On 2016/11/22 18:28, Ashutosh Bapat wrote: I guess, the reason why you are doing it this way, is SELECT clause for the outermost query gets deparsed before FROM clause. For later we call deparseRangeTblRef(), which builds the tlist. So, while deparsing SELECT clause, we do not have tlist to build from. In that case, I guess, we have to build the tlist in get_subselect_alias_id() if it's not available and stick it in fpinfo. Subsequent calls to get_subselect_alias_id() should find tlist there. Then in deparseRangeTblRef() assert that there's a tlist in fpinfo and use it to build the SELECT clause of subquery. That way, we don't build tlist unless it's needed and also use the same tlist for all searches. Please use tlist_member() to search into the tlist. This would probably work, but seems to me a bit complicated. Instead, I'd like to propose that we build the tlist for each relation being deparsed as a subquery in a given join tree, right before deparsing the SELECT clause in deparseSelectStmtForRel, if is_subquery is false and lower_subquery_rels isn't NULL, and store the tlist into the relation's fpinfo. That would allow us to build the tlist only when we need it, and to use tlist_member for the exact comparison. I think it would be much easier to implement that. Best regards, Etsuro Fujita -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Push down more full joins in postgres_fdw
On Mon, Dec 5, 2016 at 6:20 AM, Ashutosh Bapatwrote: > 4. I am still not happy with this change > +/* > + * Since (1) the expressions in foreignrel's reltarget doesn't > contain > + * any PHVs and (2) foreignrel's local_conds is empty, the tlist > + * created by build_tlist_to_deparse must be one-to-one with the > + * expressions. > + */ > +Assert(list_length(tlist) == > list_length(foreignrel->reltarget->exprs)); > the assertion only checks that the number of elements in both the lists are > same but does not check whether those lists are same i.e. they contain the > same > elements in the same order. This equality is crucial to deparsing logic. If > somehow build_tlist_to_deparse() breaks that assumption in future, we have no > way to detect it, unless a regression test fails. If there's an easy way to do a more exact comparison, great. But if we can't get an awesome Assert(), a helpful Assert() is still better than a kick in the head. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Push down more full joins in postgres_fdw
On Mon, Nov 28, 2016 at 3:52 PM, Etsuro Fujitawrote: > On 2016/11/24 21:45, Etsuro Fujita wrote: >> >> On 2016/11/22 18:28, Ashutosh Bapat wrote: >>> >>> The comments should explain why is the assertion true. >>> +/* Shouldn't be NIL */ >>> +Assert(tlist != NIL); > > >> I noticed that I was wrong; in the Assertion the tlist can be empty. An >> example for such a case is: >> >> SELECT 1 FROM (SELECT c1 FROM ft4 WHERE c1 between 50 and 60) t1 FULL >> JOIN (SELECT c1 FROM ft5 WHERE c1 between 50 and 60) t2 ON (TRUE); >> >> In this example any columns of the relations ft4 and ft5 wouldn't be >> needed for the join or the final output, so both the tlists for the >> relations created in deparseRangeTblRef would be empty. Attached is an >> updated version fixing this bug. Changes are: >> >> * I removed make_outerrel_subquery/make_innerrel_subquery from fpinfo >> and added a new member "is_subquery_rel" to fpinfo, as a flag on whether >> to deparse the relation as a subquery. Replacing make_outerrel_subquery/make_innerrel_subquery with is_subquery_rel seems to be a bad idea. Whether a relation needs to be deparsed as a subquery or not depends upon the relation which joins it with another relation. Take for example a case when a join ABC can pull up the conditions on AB, but ABD can not. In this case we will mark that AB in ABD needs to be deparsed as a subquery but will not mark so in ABC. So, if we choose to join ABCD as (ABC)D we don't need AB to be deparsed as a subquery. If we choose to join ABCD as (ABD)C, we will need AB to deparsed as a subquery. Regression doesn't have a testcase like that hence the code seems to be working. Some more comments 1. The output of the following query +SELECT 1 FROM (SELECT c1 FROM ft4 WHERE c1 between 50 and 60) t1 FULL JOIN (SELECT c1 FROM ft5 WHERE c1 between 50 and 60) t2 ON (TRUE); produces 21 rows all containing a "1". That's not quite good use of expected output space, given that all that the test tests is the empty targetlists are deparsed correctly. You might want to either just test EXPLAIN output or make "between 50 and 60" condition more stringent to reduce the number of rows output. 2. Got lost here. I guess, all you need to say is "test deparsing FOR UPDATE clause with subqueries.". Anyway, the sentence is very hard to read and needs simplification. +-- d. test deparsing the remote queries for simple foreign table scans in +-- an EPQ subplan of a foreign join in which the foreign tables are full +-- joined and in the remote join query the foreign tables are deparsed as +-- subqueries 3. Why is foreignrel variable changed to rel? -extern void deparseSelectStmtForRel(StringInfo buf, PlannerInfo *root, -RelOptInfo *foreignrel, List *tlist, -List *remote_conds, List *pathkeys, -List **retrieved_attrs, List **params_list); +extern void deparseSelectStmtForRel(StringInfo buf, PlannerInfo *root, RelOptInfo *rel, +List *tlist, List *remote_conds, List *pathkeys, +bool is_subquery, List **retrieved_attrs, +List **params_list); 4. I am still not happy with this change +/* + * Since (1) the expressions in foreignrel's reltarget doesn't contain + * any PHVs and (2) foreignrel's local_conds is empty, the tlist + * created by build_tlist_to_deparse must be one-to-one with the + * expressions. + */ +Assert(list_length(tlist) == list_length(foreignrel->reltarget->exprs)); the assertion only checks that the number of elements in both the lists are same but does not check whether those lists are same i.e. they contain the same elements in the same order. This equality is crucial to deparsing logic. If somehow build_tlist_to_deparse() breaks that assumption in future, we have no way to detect it, unless a regression test fails. -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Push down more UPDATEs/DELETEs in postgres_fdw
On 2016/11/23 20:28, Rushabh Lathia wrote: On Tue, Nov 22, 2016 at 6:24 PM, Etsuro Fujita> wrote: 1) -static void deparseExplicitTargetList(List *tlist, List **retrieved_attrs, +static void deparseExplicitTargetList(bool is_returning, + List *tlist, + List **retrieved_attrs, deparse_expr_cxt *context); Any particular reason of inserting new argument as 1st argunment? In general we add new flags at the end or before the out param for the existing function. No, I don't. I think the *context should be the last argument as in other functions in deparse.c. How about inserting that before the out param **retrieved_attrs, like this? static void deparseExplicitTargetList(List *tlist, bool is_returning, List **retrieved_attrs, deparse_expr_cxt *context); Yes, adding it before retrieved_attrs would be good. OK, will do. 5) make_explicit_returning_list() pull the var list from the returningList and build the targetentry for the returning list and at the end rewrites the fdw_scan_tlist. AFAIK, in case of DML - which is going to pushdown to the remote server ideally fdw_scan_tlist should be same as returning list, as final output for the query is query will be RETUNING list only. isn't that true? That would be true. But the fdw_scan_tlist passed from the core would contain junk columns inserted by the rewriter and planner work, such as CTID for the target table and whole-row Vars for other tables specified in the FROM clause of an UPDATE or the USING clause of a DELETE. So, I created the patch so that the pushed-down UPDATE/DELETE retrieves only the data needed for the RETURNING calculation from the remote server, not the whole fdw_scan_tlist data. Yes, I noticed that fdw_scan_tlist have CTID for the target table and whole-raw vars for other tables specified in the FROM clause of the DML. But I was thinking isn't it possible to create new fdw_scan_tlist once we found that DML is direct pushable to foreign server? I tried quickly doing that - but later its was throwing "variable not found in subplan target list" from set_foreignscan_references(). If yes, then why can't we directly replace the fdw_scan_tlist with the returning list, rather then make_explicit_returning_list()? I think we could do that if we modified the core (maybe the executor part). I'm not sure that's a good idea, though. Yes modifying core is not good idea. But just want to understand why you think that this need need to modify core? Sorry, I don't remember that. Will check. I'd like to move this to the next CF. Thank you for the comments! Best regards, Etsuro Fujita -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Push down more full joins in postgres_fdw
On 2016/11/24 21:45, Etsuro Fujita wrote: On 2016/11/22 18:28, Ashutosh Bapat wrote: The comments should explain why is the assertion true. +/* Shouldn't be NIL */ +Assert(tlist != NIL); I noticed that I was wrong; in the Assertion the tlist can be empty. An example for such a case is: SELECT 1 FROM (SELECT c1 FROM ft4 WHERE c1 between 50 and 60) t1 FULL JOIN (SELECT c1 FROM ft5 WHERE c1 between 50 and 60) t2 ON (TRUE); In this example any columns of the relations ft4 and ft5 wouldn't be needed for the join or the final output, so both the tlists for the relations created in deparseRangeTblRef would be empty. Attached is an updated version fixing this bug. Changes are: * I removed make_outerrel_subquery/make_innerrel_subquery from fpinfo and added a new member "is_subquery_rel" to fpinfo, as a flag on whether to deparse the relation as a subquery. * I modified deparseRangeTblRef, deparseSelectSql, and is_subquery_var to see the is_subquery_rel, not make_outerrel_subquery/make_innerrel_subquery. * I modified appendSubselectAlias to handle the case where ncols = 0 properly. * I renamed subquery_rels in fpinfo to lower_subquery_rels, to make it easy to distinguish this from is_subquery_rel clearly. * I added regression tests for that. I noticed that the above changes are not adequate; the updated patch breaks EXPLAIN outputs for EvalPlanQual subplans of foreign joins in which foreign tables are full joined and in the remote join query the foreign tables are deparsed as subqueries. Consider: postgres=# explain verbose select * from test, ((select * from ft1 where b between 1 and 3) t1 full join (select * from ft2 where b between 1 and 3) t2 on (t1.a = t2.a)) ss for update of test; QUERY PLA N --- LockRows (cost=100.00..287.33 rows=6780 width=94) Output: test.a, test.b, ft1.a, ft1.b, ft2.a, ft2.b, test.ctid, ft1.*, ft2.* -> Nested Loop (cost=100.00..219.53 rows=6780 width=94) Output: test.a, test.b, ft1.a, ft1.b, ft2.a, ft2.b, test.ctid, ft1.*, ft2.* -> Seq Scan on public.test (cost=0.00..32.60 rows=2260 width=14) Output: test.a, test.b, test.ctid -> Materialize (cost=100.00..102.19 rows=3 width=80) Output: ft1.a, ft1.b, ft1.*, ft2.a, ft2.b, ft2.* -> Foreign Scan (cost=100.00..102.17 rows=3 width=80) Output: ft1.a, ft1.b, ft1.*, ft2.a, ft2.b, ft2.* Relations: (public.ft1) FULL JOIN (public.ft2) Remote SQL: SELECT s5.c1, s5.c2, s5.c3, s6.c1, s6.c2, s6.c3 FROM ((SELECT a, b, ROW(a, b) FROM public.t1 WHERE ((b >= 1)) AND ((b <= 3))) s5(c1, c2, c3) FULL JOIN (SELECT a, b, ROW(a, b) FROM public.t2 WHERE ((b >= 1)) AND ((b <= 3))) s6(c1, c2, c3) ON (((s5.c1 = s6.c1 -> Hash Full Join (cost=201.14..202.29 rows=3 width=80) Output: ft1.a, ft1.b, ft1.*, ft2.a, ft2.b, ft2.* Hash Cond: (ft1.a = ft2.a) -> Foreign Scan on public.ft1 (cost=100.00..101.11 rows=3 width=40) Output: ft1.a, ft1.b, ft1.* Remote SQL: SELECT NULL FROM public.t1 WHERE ((b >= 1)) AND ((b <= 3)) -> Hash (cost=101.11..101.11 rows=3 width=40) Output: ft2.a, ft2.b, ft2.* -> Foreign Scan on public.ft2 (cost=100.00..101.11 rows=3 width=40) Output: ft2.a, ft2.b, ft2.* Remote SQL: SELECT NULL FROM public.t2 WHERE ((b >= 1)) AND ((b <= 3)) (23 rows) The SELECT clauses of the remote queries for the Foreign Scans in the EPQ subplan are deparsed incorrectly into "SELECT NULL". The reason for that is that since the foreign tables' fpinfo->is_subquery_rel has been set true at the time those clauses are deparsed (due to that the foreign tables have been deparsed as subqueries in the main remote join query), deparseSelectSql calls deparseExplicitTargetList with an empty tlist, to construct the clauses, so it deparses the tlist into "NULL". This is harmless because we don't use the remote queries for the Foreign Scans during an EPQ recheck, but that would be confusing for users, so I modified the patch a bit further to make the EXPLAIN output as-is. Attached is a new version of the patch. I also revised code and comments a bit, just for consistency. Also, I renamed appendSubselectAlias to appendSubqueryAlias, because we use the term "subquery" for other places. I will update another patch for PHVs on top of the
Re: [HACKERS] Push down more full joins in postgres_fdw
On 2016/11/22 18:28, Ashutosh Bapat wrote: The comments should explain why is the assertion true. +/* Shouldn't be NIL */ +Assert(tlist != NIL); I noticed that I was wrong; in the Assertion the tlist can be empty. An example for such a case is: SELECT 1 FROM (SELECT c1 FROM ft4 WHERE c1 between 50 and 60) t1 FULL JOIN (SELECT c1 FROM ft5 WHERE c1 between 50 and 60) t2 ON (TRUE); In this example any columns of the relations ft4 and ft5 wouldn't be needed for the join or the final output, so both the tlists for the relations created in deparseRangeTblRef would be empty. Attached is an updated version fixing this bug. Changes are: * I removed make_outerrel_subquery/make_innerrel_subquery from fpinfo and added a new member "is_subquery_rel" to fpinfo, as a flag on whether to deparse the relation as a subquery. * I modified deparseRangeTblRef, deparseSelectSql, and is_subquery_var to see the is_subquery_rel, not make_outerrel_subquery/make_innerrel_subquery. * I modified appendSubselectAlias to handle the case where ncols = 0 properly. * I renamed subquery_rels in fpinfo to lower_subquery_rels, to make it easy to distinguish this from is_subquery_rel clearly. * I added regression tests for that. The rest of the patch is the same as the previous version, but: +/* Should be same length */ +Assert(list_length(tlist) == list_length(foreignrel->reltarget->exprs)); I added comments explaining the reason. The name get_subselect_alias_id() seems to suggest that the function returns identifier for subselect alias, which isn't correct. It returns the alias identifiers for deparsing a Var node. But I guess, we have left this to the committer's judgement. I agree that the name isn't good, so I changed it to get_relation_column_alias_ids(). Let me know if it may be better. I also revised code and comments a bit, just for consistency. I will update another patch for PHVs on top of the attached one. Best regards, Etsuro Fujita *** a/contrib/postgres_fdw/deparse.c --- b/contrib/postgres_fdw/deparse.c *** *** 109,114 typedef struct deparse_expr_cxt --- 109,116 /* Handy macro to add relation name qualification */ #define ADD_REL_QUALIFIER(buf, varno) \ appendStringInfo((buf), "%s%d.", REL_ALIAS_PREFIX, (varno)) + #define SS_REL_ALIAS_PREFIX "s" + #define SS_COL_ALIAS_PREFIX "c" /* * Functions to determine whether an expression can be evaluated safely on *** *** 167,172 static void appendConditions(List *exprs, deparse_expr_cxt *context); --- 169,181 static void deparseFromExprForRel(StringInfo buf, PlannerInfo *root, RelOptInfo *joinrel, bool use_alias, List **params_list); static void deparseFromExpr(List *quals, deparse_expr_cxt *context); + static void deparseRangeTblRef(StringInfo buf, PlannerInfo *root, + RelOptInfo *foreignrel, List **params_list); + static void appendSubselectAlias(StringInfo buf, int relno, int ncols); + static void get_relation_column_alias_ids(Var *node, RelOptInfo *foreignrel, + int *relno, int *colno); + static bool is_subquery_var(Var *node, RelOptInfo *foreignrel, int *relno, + int *colno); static void deparseAggref(Aggref *node, deparse_expr_cxt *context); static void appendGroupByClause(List *tlist, deparse_expr_cxt *context); static void appendAggOrderBy(List *orderList, List *targetList, *** *** 990,1001 deparseSelectSql(List *tlist, List **retrieved_attrs, deparse_expr_cxt *context) */ appendStringInfoString(buf, "SELECT "); if (foreignrel->reloptkind == RELOPT_JOINREL || ! foreignrel->reloptkind == RELOPT_UPPER_REL) ! { ! /* For a join relation use the input tlist */ deparseExplicitTargetList(tlist, retrieved_attrs, context); - } else { /* --- 999,1013 */ appendStringInfoString(buf, "SELECT "); + /* + * For a join relation or an upper relation, use deparseExplicitTargetList. + * Likewise, for a relation that is being deparsed as a subquery, use that + * function. Otherwise, use deparseTargetList. + */ if (foreignrel->reloptkind == RELOPT_JOINREL || ! foreignrel->reloptkind == RELOPT_UPPER_REL || ! fpinfo->is_subquery_rel) deparseExplicitTargetList(tlist, retrieved_attrs, context); else { /* *** *** 1155,1165 deparseLockingClause(deparse_expr_cxt *context) --- 1167,1185 StringInfo buf = context->buf; PlannerInfo *root = context->root; RelOptInfo *rel = context->scanrel; + PgFdwRelationInfo *fpinfo = (PgFdwRelationInfo *) rel->fdw_private; int relid = -1; while ((relid = bms_next_member(rel->relids, relid)) >= 0) { /* + * Ignore relation if it appears in a lower subquery. Locking clause + * for such a relation, if needed, is included in the subquery. + */ + if (bms_is_member(relid, fpinfo->lower_subquery_rels)) + continue; + + /* * Add FOR UPDATE/SHARE if
Re: [HACKERS] Push down more full joins in postgres_fdw
On 2016/11/24 18:20, Ashutosh Bapat wrote: I wrote: You missed the point; the foreignrel->reltarget->exprs doesn't contain any PHVs, so the tlist created by build_tlist_to_depase will be guaranteed to be one-to-one with the foreignrel->reltarget->exprs. You wrote: It's guaranteed now, but can not be forever. This means anybody who supports PHV tomorrow, will have to "remember" to update this code as well. If s/he misses it, a bug will be introduced. That's the avenue for bug, I am talking about. I wrote: It *can* be guaranteed. See another patch for supporting evaluating PHVs remotely. We can't base assumptions in this patch on something in the other patch, esp. when that's not even reviewed once. PHV is just one case, subqueries involving aggregates is other and there are others. Let's discuss this together with the patch for PHVs. That was one of the reasons why I had merged the two patches into one. I'd like to leave enhancements such as subqueries involving aggregates for future work, though. But that's not really the point. The point is add_to_flat_tlist(pull_var_clause(rel->reltarget->exprs) can not be proved to be same as rel->reltarget->exprs in general. Really? As mentioned in comments for RelOptInfo in relation.h shown below, the rel->reltarget->exprs for a base/join relation only contains Vars and PHVs, so the two things would be proved to be one-to-one. (Note: we don't need to care about the appendrel-child-relation case because appendrel child relations wouldn't appear in the main join tree.) * reltarget - Default Path output tlist for this rel; normally contains * Var and PlaceHolderVar nodes for the values we need to * output from this relation. * List is in no particular order, but all rels of an * appendrel set must use corresponding orders. * NOTE: in an appendrel child relation, may contain * arbitrary expressions pulled up from a subquery! Best regards, Etsuro Fujita -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Push down more full joins in postgres_fdw
> > build_tlist_to_depase() calls pull_var_nodes() before creating the tlist, whereas the code that searches does not do that. Code-wise those are not the same things. > > >>> You missed the point; the foreignrel->reltarget->exprs doesn't contain >>> any >>> PHVs, so the tlist created by build_tlist_to_depase will be guaranteed to >>> be >>> one-to-one with the foreignrel->reltarget->exprs. > > >> It's guaranteed now, but can not be forever. This means anybody who >> supports PHV tomorrow, will have to "remember" to update this code as >> well. If s/he misses it, a bug will be introduced. That's the avenue >> for bug, I am talking about. > > > It *can* be guaranteed. See another patch for supporting evaluating PHVs > remotely. We can't base assumptions in this patch on something in the other patch, esp. when that's not even reviewed once. PHV is just one case, subqueries involving aggregates is other and there are others. But that's not really the point. The point is add_to_flat_tlist(pull_var_clause(rel->reltarget->exprs) can not be proved to be same as rel->reltarget->exprs in general. So, we should base our code on an assumption that can not be proved. -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Push down more full joins in postgres_fdw
On 2016/11/24 17:39, Ashutosh Bapat wrote: On Thu, Nov 24, 2016 at 1:27 PM, Etsuro Fujitawrote: On 2016/11/24 16:46, Ashutosh Bapat wrote: table will be misleading as subquery can represent a join and corresponding alias would represent the join. Relation is better term. But the documentation uses the word "table" for a join relation. See 7.2.1.2. Table and Column Aliases: https://www.postgresql.org/docs/9.6/static/queries-table-expressions.html#QUERIES-TABLE-ALIASES The definition there says "A temporary name can be given to tables and complex table references to be used for references to the derived table in the rest of the query. This is called a table alias.", please note the usage "complex table references". Within the code, we do not refer to a relation as table. So, the comment in this code should not refer to a relation as table either. OK, will keep using the term "relation". I wrote: I don't think so; in the current version, we essentially deparse from and search into the same object, ie, foreignrel->reltarget->exprs, since the tlist created by build_tlist_to_deparse is build from the foreignrel->reltarget->exprs. Also, since it is just created for deparsing the relation as a subquery in deparseRangeTblRef and isn't stored into fpinfo or anywhere alse, we would need no concern about creating such avenues. IMO I think adding comments would be enough for this. build_tlist_to_depase() calls pull_var_nodes() before creating the tlist, whereas the code that searches does not do that. Code-wise those are not the same things. You missed the point; the foreignrel->reltarget->exprs doesn't contain any PHVs, so the tlist created by build_tlist_to_depase will be guaranteed to be one-to-one with the foreignrel->reltarget->exprs. It's guaranteed now, but can not be forever. This means anybody who supports PHV tomorrow, will have to "remember" to update this code as well. If s/he misses it, a bug will be introduced. That's the avenue for bug, I am talking about. It *can* be guaranteed. See another patch for supporting evaluating PHVs remotely. Best regards, Etsuro Fujita -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Push down more full joins in postgres_fdw
On Thu, Nov 24, 2016 at 1:27 PM, Etsuro Fujitawrote: > On 2016/11/24 16:46, Ashutosh Bapat wrote: Sorry. I think the current version is better than previous one. The term "subselect alias" is confusing in the previous version. In the current version, "Get the relation and column alias for a given Var node," we need to add word "identifiers" like "Get the relation and column identifiers for a given Var node". > > > I wrote: >>> >>> OK, but one thing I'm concerned about is the term "relation alias" seems >>> a >>> bit confusing because we already used the term for the alias of the form >>> "rN". To avoid that, how about saying "table alias", not "relation >>> alias"? >>> (in which case, the comment would be something like "Get the table and >>> column identifiers for a given Var node".) > > >> table will be misleading as subquery can represent a join and >> corresponding alias would represent the join. Relation is better term. > > > But the documentation uses the word "table" for a join relation. See > 7.2.1.2. Table and Column Aliases: > https://www.postgresql.org/docs/9.6/static/queries-table-expressions.html#QUERIES-TABLE-ALIASES The definition there says "A temporary name can be given to tables and complex table references to be used for references to the derived table in the rest of the query. This is called a table alias.", please note the usage "complex table references". Within the code, we do not refer to a relation as table. So, the comment in this code should not refer to a relation as table either. > >>> I don't think so; in the current version, we essentially deparse from and >>> search into the same object, ie, foreignrel->reltarget->exprs, since the >>> tlist created by build_tlist_to_deparse is build from the >>> foreignrel->reltarget->exprs. Also, since it is just created for >>> deparsing >>> the relation as a subquery in deparseRangeTblRef and isn't stored into >>> fpinfo or anywhere alse, we would need no concern about creating such >>> avenues. IMO I think adding comments would be enough for this. > > >> build_tlist_to_depase() calls pull_var_nodes() before creating the >> tlist, whereas the code that searches does not do that. Code-wise >> those are not the same things. > > > You missed the point; the foreignrel->reltarget->exprs doesn't contain any > PHVs, so the tlist created by build_tlist_to_depase will be guaranteed to be > one-to-one with the foreignrel->reltarget->exprs. It's guaranteed now, but can not be forever. This means anybody who supports PHV tomorrow, will have to "remember" to update this code as well. If s/he misses it, a bug will be introduced. That's the avenue for bug, I am talking about. -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Push down more full joins in postgres_fdw
On 2016/11/24 16:46, Ashutosh Bapat wrote: Sorry. I think the current version is better than previous one. The term "subselect alias" is confusing in the previous version. In the current version, "Get the relation and column alias for a given Var node," we need to add word "identifiers" like "Get the relation and column identifiers for a given Var node". I wrote: OK, but one thing I'm concerned about is the term "relation alias" seems a bit confusing because we already used the term for the alias of the form "rN". To avoid that, how about saying "table alias", not "relation alias"? (in which case, the comment would be something like "Get the table and column identifiers for a given Var node".) table will be misleading as subquery can represent a join and corresponding alias would represent the join. Relation is better term. But the documentation uses the word "table" for a join relation. See 7.2.1.2. Table and Column Aliases: https://www.postgresql.org/docs/9.6/static/queries-table-expressions.html#QUERIES-TABLE-ALIASES I don't think so; in the current version, we essentially deparse from and search into the same object, ie, foreignrel->reltarget->exprs, since the tlist created by build_tlist_to_deparse is build from the foreignrel->reltarget->exprs. Also, since it is just created for deparsing the relation as a subquery in deparseRangeTblRef and isn't stored into fpinfo or anywhere alse, we would need no concern about creating such avenues. IMO I think adding comments would be enough for this. build_tlist_to_depase() calls pull_var_nodes() before creating the tlist, whereas the code that searches does not do that. Code-wise those are not the same things. You missed the point; the foreignrel->reltarget->exprs doesn't contain any PHVs, so the tlist created by build_tlist_to_depase will be guaranteed to be one-to-one with the foreignrel->reltarget->exprs. Best regards, Etsuro Fujita -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Push down more full joins in postgres_fdw
> >> Sorry. I think the current version is better than previous one. The >> term "subselect alias" is confusing in the previous version. In the >> current version, "Get the relation and column alias for a given Var >> node," we need to add word "identifiers" like "Get the relation and >> column identifiers for a given Var node". > > > OK, but one thing I'm concerned about is the term "relation alias" seems a > bit confusing because we already used the term for the alias of the form > "rN". To avoid that, how about saying "table alias", not "relation alias"? > (in which case, the comment would be something like "Get the table and > column identifiers for a given Var node".) table will be misleading as subquery can represent a join and corresponding alias would represent the join. Relation is better term. > > >> If we deparse from and search into different objects, that's not very >> maintainable code. Changes to any one of them require changes to the >> other, thus creating avenues for bugs. Even if slighly expensive, we >> should search into and deparse from the same targetlist. > > > I don't think so; in the current version, we essentially deparse from and > search into the same object, ie, foreignrel->reltarget->exprs, since the > tlist created by build_tlist_to_deparse is build from the > foreignrel->reltarget->exprs. Also, since it is just created for deparsing > the relation as a subquery in deparseRangeTblRef and isn't stored into > fpinfo or anywhere alse, we would need no concern about creating such > avenues. IMO I think adding comments would be enough for this. build_tlist_to_depase() calls pull_var_nodes() before creating the tlist, whereas the code that searches does not do that. Code-wise those are not the same things. -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Push down more full joins in postgres_fdw
On 2016/11/23 0:30, Ashutosh Bapat wrote: You wrote: The comment below says "get the aliases", but what the function really returns is the identifiers used for creating aliases. Please correct the comment. +/* + * Get the relation and column alias for a given Var node, which belongs to + * input foreignrel. They are returned in *tabno and *colno respectively. + */ I wrote: Actually, this was rewritten into the above by *you*. The original comment I added was: + /* + * Get info about the subselect alias to given expression. + * + * The subselect table and column numbers are returned to *tabno and *colno, + * respectively. + */ I'd like to change the comment into something like the original one. Sorry. I think the current version is better than previous one. The term "subselect alias" is confusing in the previous version. In the current version, "Get the relation and column alias for a given Var node," we need to add word "identifiers" like "Get the relation and column identifiers for a given Var node". OK, but one thing I'm concerned about is the term "relation alias" seems a bit confusing because we already used the term for the alias of the form "rN". To avoid that, how about saying "table alias", not "relation alias"? (in which case, the comment would be something like "Get the table and column identifiers for a given Var node".) We discussed that we have to deparse and search from the same targetlist. And that the targetlist should be saved in fpinfo, the first time it gets created. But the patch seems to be searching in foreignrel->reltarget->exprs and deparsing from the tlist returned by add_to_flat_tlist(tlist, foreignrel->reltarget->exprs). +foreach(lc, foreignrel->reltarget->exprs) +{ +if (equal(lfirst(lc), (Node *) node)) +{ +*colno = i; +return; +} +i++; +} I guess, the reason why you are doing it this way, is SELECT clause for the outermost query gets deparsed before FROM clause. For later we call deparseRangeTblRef(), which builds the tlist. So, while deparsing SELECT clause, we do not have tlist to build from. That's right. In that case, I guess, we have to build the tlist in get_subselect_alias_id() if it's not available and stick it in fpinfo. Subsequent calls to get_subselect_alias_id() should find tlist there. Then in deparseRangeTblRef() assert that there's a tlist in fpinfo and use it to build the SELECT clause of subquery. That way, we don't build tlist unless it's needed and also use the same tlist for all searches. Please use tlist_member() to search into the tlist. I don't think that's a good idea because that would make the code unnecessarily complicated and inefficient. I think the direct search into the foreignrel->reltarget->exprs shown above would be better because that's more simple and efficient than what you proposed. Note that since (1) the foreignrel->reltarget->exprs doesn't contain any PHVs and (2) the local_conds is empty, the tlist created for the subquery by build_tlist_to_deparse is guaranteed to have one-to-one correspondence with the foreignrel->reltarget->exprs, so the direct search works well. If we deparse from and search into different objects, that's not very maintainable code. Changes to any one of them require changes to the other, thus creating avenues for bugs. Even if slighly expensive, we should search into and deparse from the same targetlist. I don't think so; in the current version, we essentially deparse from and search into the same object, ie, foreignrel->reltarget->exprs, since the tlist created by build_tlist_to_deparse is build from the foreignrel->reltarget->exprs. Also, since it is just created for deparsing the relation as a subquery in deparseRangeTblRef and isn't stored into fpinfo or anywhere alse, we would need no concern about creating such avenues. IMO I think adding comments would be enough for this. Anyway, I think this is an optional issue, so I'd like to leave this for the committer's judge. I think I have explained this before. My apologies for having misunderstood your words. Best regards, Etsuro Fujita -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Push down more UPDATEs/DELETEs in postgres_fdw
On Tue, Nov 22, 2016 at 6:24 PM, Etsuro Fujitawrote: > Hi Rushabh, > > On 2016/11/22 19:05, Rushabh Lathia wrote: > >> I started reviewing the patch and here are few initial review points and >> questions for you. >> > > Thanks for the review! > > 1) >> -static void deparseExplicitTargetList(List *tlist, List >> **retrieved_attrs, >> +static void deparseExplicitTargetList(bool is_returning, >> + List *tlist, >> + List **retrieved_attrs, >>deparse_expr_cxt *context); >> >> Any particular reason of inserting new argument as 1st argunment? In >> general >> we add new flags at the end or before the out param for the existing >> function. >> > > No, I don't. I think the *context should be the last argument as in other > functions in deparse.c. How about inserting that before the out param > **retrieved_attrs, like this? > > static void > deparseExplicitTargetList(List *tlist, > bool is_returning, > List **retrieved_attrs, > deparse_expr_cxt *context); > > Yes, adding it before retrieved_attrs would be good. > 2) >> -static void deparseFromExprForRel(StringInfo buf, PlannerInfo *root, >> -RelOptInfo *joinrel, bool use_alias, List >> **params_list); >> +static void deparseFromExprForRel(StringInfo buf, PlannerInfo *root, >> RelOptInfo *foreignrel, >> + bool use_alias, Index target_rel, List >> **params_list); >> > > Going beyond 80 character length >> > > Will fix. > > 3) >> static void >> -deparseExplicitTargetList(List *tlist, List **retrieved_attrs, >> +deparseExplicitTargetList(bool is_returning, >> + List *tlist, >> + List **retrieved_attrs, >>deparse_expr_cxt *context) >> >> This looks bit wired to me - as comment for the >> deparseExplicitTargetList() >> function name and comments says different then what its trying to do when >> is_returning flag is passed. Either function comment need to be change or >> may be separate function fo deparse returning list for join relation? >> >> Is it possible to teach deparseReturningList() to deparse the returning >> list for the joinrel? >> > > I don't think it's possible to do that because deparseReturningList uses > deparseTargetList internally, which only allows us to emit a target list > for a baserel. For example, deparseReturningList can't handle a returning > list that contains join outputs like this: > > postgres=# explain verbose delete from ft1 using ft2 where ft1.a = ft2.a > returning ft1.*, ft2.*; >QUERY PLAN > > > Delete on public.ft1 (cost=100.00..102.06 rows=1 width=46) >Output: ft1.a, ft1.b, ft2.a, ft2.b >-> Foreign Delete (cost=100.00..102.06 rows=1 width=46) > Remote SQL: DELETE FROM public.t1 r1 USING public.t2 r2 WHERE > ((r1.a = r2.a)) RETURNING r1.a, r1.b, r2.a, r2.b > (4 rows) > > I'd like to change the comment for deparseExplicitTargetList. > > 4) >> >> +if (foreignrel->reloptkind == RELOPT_JOINREL) >> +{ >> +List *rlist = NIL; >> + >> +/* Create a RETURNING list */ >> +rlist = make_explicit_returning_list(rtindex, rel, >> + returningList, >> + fdw_scan_tlist); >> + >> +deparseExplicitReturningList(rlist, retrieved_attrs, ); >> +} >> +else >> +deparseReturningList(buf, root, rtindex, rel, false, >> + returningList, retrieved_attrs); >> >> The code will be more centralized if we add the logic of extracting >> returninglist >> for JOINREL inside deparseReturningList() only, isn't it? >> > > You are right. Will do. > > 5) make_explicit_returning_list() pull the var list from the >> returningList and >> build the targetentry for the returning list and at the end rewrites the >> fdw_scan_tlist. >> >> AFAIK, in case of DML - which is going to pushdown to the remote server >> ideally fdw_scan_tlist should be same as returning list, as final output >> for the query is query will be RETUNING list only. isn't that true? >> > > That would be true. But the fdw_scan_tlist passed from the core would > contain junk columns inserted by the rewriter and planner work, such as > CTID for the target table and whole-row Vars for other tables specified in > the FROM clause of an UPDATE or the USING clause of a DELETE. So, I > created the patch so that the pushed-down UPDATE/DELETE retrieves only the > data needed for the RETURNING calculation from the remote server, not the > whole fdw_scan_tlist data. Yes, I noticed that fdw_scan_tlist have CTID for the target table and whole-raw
Re: [HACKERS] Push down more full joins in postgres_fdw
> > >> There is a already a function to build targetlist for a given relation >> build_tlist_to_deparse(), which does the same thing as this code for a >> join or >> base relation and when there are no local conditions. Why don't we use >> that >> function instead of duplicating that logic? If tomorrow, the logic to >> build the >> targetlist changes, we will need to duplicate those changes. We should >> avoid >> that. >> +/* Build a tlist from the subquery. */ >> +tlist = add_to_flat_tlist(tlist, foreignrel->reltarget->exprs); > > > Will modify the patch to use build_tlist_to_deparse. Actually, the early > versions of the patch used that function, but it looks like I changed not to > use that function, as I misunderstood about your comments on this part at > some point. Sorry for that. > >> The comment below says "get the aliases", but what the function really >> returns >> is the identifiers used for creating aliases. Please correct the comment. >> +/* >> + * Get the relation and column alias for a given Var node, which belongs >> to >> + * input foreignrel. They are returned in *tabno and *colno respectively. >> + */ > > > Actually, this was rewritten into the above by *you*. The original comment > I added was: > > + /* > + * Get info about the subselect alias to given expression. > + * > + * The subselect table and column numbers are returned to *tabno and > *colno, > + * respectively. > + */ > > I'd like to change the comment into something like the original one. Sorry. I think the current version is better than previous one. The term "subselect alias" is confusing in the previous version. In the current version, "Get the relation and column alias for a given Var node," we need to add word "identifiers" like "Get the relation and column identifiers for a given Var node". > >> We discussed that we have to deparse and search from the same targetlist. >> And >> that the targetlist should be saved in fpinfo, the first time it gets >> created. >> But the patch seems to be searching in foreignrel->reltarget->exprs and >> deparsing from the tlist returned by add_to_flat_tlist(tlist, >> foreignrel->reltarget->exprs). >> +foreach(lc, foreignrel->reltarget->exprs) >> +{ >> +if (equal(lfirst(lc), (Node *) node)) >> +{ >> +*colno = i; >> +return; >> +} >> +i++; >> +} >> I guess, the reason why you are doing it this way, is SELECT clause for >> the >> outermost query gets deparsed before FROM clause. For later we call >> deparseRangeTblRef(), which builds the tlist. So, while deparsing SELECT >> clause, we do not have tlist to build from. > > > That's right. > >> In that case, I guess, we have to >> build the tlist in get_subselect_alias_id() if it's not available and >> stick it >> in fpinfo. Subsequent calls to get_subselect_alias_id() should find tlist >> there. Then in deparseRangeTblRef() assert that there's a tlist in fpinfo >> and use it to build the SELECT clause of subquery. That way, we don't >> build >> tlist unless it's needed and also use the same tlist for all searches. >> Please >> use tlist_member() to search into the tlist. > > > I don't think that's a good idea because that would make the code > unnecessarily complicated and inefficient. I think the direct search into > the foreignrel->reltarget->exprs shown above would be better because that's > more simple and efficient than what you proposed. Note that since (1) the > foreignrel->reltarget->exprs doesn't contain any PHVs and (2) the > local_conds is empty, the tlist created for the subquery by > build_tlist_to_deparse is guaranteed to have one-to-one correspondence with > the foreignrel->reltarget->exprs, so the direct search works well. If we deparse from and search into different objects, that's not very maintainable code. Changes to any one of them require changes to the other, thus creating avenues for bugs. Even if slighly expensive, we should search into and deparse from the same targetlist. I think I have explained this before. -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Push down more UPDATEs/DELETEs in postgres_fdw
Hi Rushabh, On 2016/11/22 19:05, Rushabh Lathia wrote: I started reviewing the patch and here are few initial review points and questions for you. Thanks for the review! 1) -static void deparseExplicitTargetList(List *tlist, List **retrieved_attrs, +static void deparseExplicitTargetList(bool is_returning, + List *tlist, + List **retrieved_attrs, deparse_expr_cxt *context); Any particular reason of inserting new argument as 1st argunment? In general we add new flags at the end or before the out param for the existing function. No, I don't. I think the *context should be the last argument as in other functions in deparse.c. How about inserting that before the out param **retrieved_attrs, like this? static void deparseExplicitTargetList(List *tlist, bool is_returning, List **retrieved_attrs, deparse_expr_cxt *context); 2) -static void deparseFromExprForRel(StringInfo buf, PlannerInfo *root, -RelOptInfo *joinrel, bool use_alias, List **params_list); +static void deparseFromExprForRel(StringInfo buf, PlannerInfo *root, RelOptInfo *foreignrel, + bool use_alias, Index target_rel, List **params_list); Going beyond 80 character length Will fix. 3) static void -deparseExplicitTargetList(List *tlist, List **retrieved_attrs, +deparseExplicitTargetList(bool is_returning, + List *tlist, + List **retrieved_attrs, deparse_expr_cxt *context) This looks bit wired to me - as comment for the deparseExplicitTargetList() function name and comments says different then what its trying to do when is_returning flag is passed. Either function comment need to be change or may be separate function fo deparse returning list for join relation? Is it possible to teach deparseReturningList() to deparse the returning list for the joinrel? I don't think it's possible to do that because deparseReturningList uses deparseTargetList internally, which only allows us to emit a target list for a baserel. For example, deparseReturningList can't handle a returning list that contains join outputs like this: postgres=# explain verbose delete from ft1 using ft2 where ft1.a = ft2.a returning ft1.*, ft2.*; QUERY PLAN Delete on public.ft1 (cost=100.00..102.06 rows=1 width=46) Output: ft1.a, ft1.b, ft2.a, ft2.b -> Foreign Delete (cost=100.00..102.06 rows=1 width=46) Remote SQL: DELETE FROM public.t1 r1 USING public.t2 r2 WHERE ((r1.a = r2.a)) RETURNING r1.a, r1.b, r2.a, r2.b (4 rows) I'd like to change the comment for deparseExplicitTargetList. 4) +if (foreignrel->reloptkind == RELOPT_JOINREL) +{ +List *rlist = NIL; + +/* Create a RETURNING list */ +rlist = make_explicit_returning_list(rtindex, rel, + returningList, + fdw_scan_tlist); + +deparseExplicitReturningList(rlist, retrieved_attrs, ); +} +else +deparseReturningList(buf, root, rtindex, rel, false, + returningList, retrieved_attrs); The code will be more centralized if we add the logic of extracting returninglist for JOINREL inside deparseReturningList() only, isn't it? You are right. Will do. 5) make_explicit_returning_list() pull the var list from the returningList and build the targetentry for the returning list and at the end rewrites the fdw_scan_tlist. AFAIK, in case of DML - which is going to pushdown to the remote server ideally fdw_scan_tlist should be same as returning list, as final output for the query is query will be RETUNING list only. isn't that true? That would be true. But the fdw_scan_tlist passed from the core would contain junk columns inserted by the rewriter and planner work, such as CTID for the target table and whole-row Vars for other tables specified in the FROM clause of an UPDATE or the USING clause of a DELETE. So, I created the patch so that the pushed-down UPDATE/DELETE retrieves only the data needed for the RETURNING calculation from the remote server, not the whole fdw_scan_tlist data. If yes, then why can't we directly replace the fdw_scan_tlist with the returning list, rather then make_explicit_returning_list()? I think we could do that if we modified the core (maybe the executor part). I'm not sure that's a good idea, though. Also I think rewriting the fdw_scan_tlist should happen into postgres_fdw.c - rather then deparse.c. Will try. 6) Test coverage for the returning list is missing. Will add. Best regards, Etsuro Fujita -- Sent via pgsql-hackers
Re: [HACKERS] Push down more full joins in postgres_fdw
On 2016/11/22 18:28, Ashutosh Bapat wrote: The comments should explain why is the assertion true. +/* Shouldn't be NIL */ +Assert(tlist != NIL); +/* Should be same length */ +Assert(list_length(tlist) == list_length(foreignrel->reltarget->exprs)); Will revise. OK, I'd like to propose referencing to foreignrel->reltarget->exprs directly in deparseRangeTblRef and get_subselect_alias_id, then, which is the same as what I proposed in the first version of the patch, but I'd also like to change deparseRangeTblRef to use add_to_flat_tlist, not make_tlist_from_pathtarget, to create a tlist of the subquery, as you proposed. There is a already a function to build targetlist for a given relation build_tlist_to_deparse(), which does the same thing as this code for a join or base relation and when there are no local conditions. Why don't we use that function instead of duplicating that logic? If tomorrow, the logic to build the targetlist changes, we will need to duplicate those changes. We should avoid that. +/* Build a tlist from the subquery. */ +tlist = add_to_flat_tlist(tlist, foreignrel->reltarget->exprs); Will modify the patch to use build_tlist_to_deparse. Actually, the early versions of the patch used that function, but it looks like I changed not to use that function, as I misunderstood about your comments on this part at some point. Sorry for that. The comment below says "get the aliases", but what the function really returns is the identifiers used for creating aliases. Please correct the comment. +/* + * Get the relation and column alias for a given Var node, which belongs to + * input foreignrel. They are returned in *tabno and *colno respectively. + */ Actually, this was rewritten into the above by *you*. The original comment I added was: + /* + * Get info about the subselect alias to given expression. + * + * The subselect table and column numbers are returned to *tabno and *colno, + * respectively. + */ I'd like to change the comment into something like the original one. We discussed that we have to deparse and search from the same targetlist. And that the targetlist should be saved in fpinfo, the first time it gets created. But the patch seems to be searching in foreignrel->reltarget->exprs and deparsing from the tlist returned by add_to_flat_tlist(tlist, foreignrel->reltarget->exprs). +foreach(lc, foreignrel->reltarget->exprs) +{ +if (equal(lfirst(lc), (Node *) node)) +{ +*colno = i; +return; +} +i++; +} I guess, the reason why you are doing it this way, is SELECT clause for the outermost query gets deparsed before FROM clause. For later we call deparseRangeTblRef(), which builds the tlist. So, while deparsing SELECT clause, we do not have tlist to build from. That's right. In that case, I guess, we have to build the tlist in get_subselect_alias_id() if it's not available and stick it in fpinfo. Subsequent calls to get_subselect_alias_id() should find tlist there. Then in deparseRangeTblRef() assert that there's a tlist in fpinfo and use it to build the SELECT clause of subquery. That way, we don't build tlist unless it's needed and also use the same tlist for all searches. Please use tlist_member() to search into the tlist. I don't think that's a good idea because that would make the code unnecessarily complicated and inefficient. I think the direct search into the foreignrel->reltarget->exprs shown above would be better because that's more simple and efficient than what you proposed. Note that since (1) the foreignrel->reltarget->exprs doesn't contain any PHVs and (2) the local_conds is empty, the tlist created for the subquery by build_tlist_to_deparse is guaranteed to have one-to-one correspondence with the foreignrel->reltarget->exprs, so the direct search works well. The name get_subselect_alias_id() seems to suggest that the function returns identifier for subselect alias, which isn't correct. It returns the alias identifiers for deparsing a Var node. But I guess, we have left this to the committer's judgement. Fine with me. Best regards, Etsuro Fujita -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Push down more UPDATEs/DELETEs in postgres_fdw
I started reviewing the patch and here are few initial review points and questions for you. 1) -static void deparseExplicitTargetList(List *tlist, List **retrieved_attrs, +static void deparseExplicitTargetList(bool is_returning, + List *tlist, + List **retrieved_attrs, deparse_expr_cxt *context); Any particular reason of inserting new argument as 1st argunment? In general we add new flags at the end or before the out param for the existing function. 2) -static void deparseFromExprForRel(StringInfo buf, PlannerInfo *root, -RelOptInfo *joinrel, bool use_alias, List **params_list); +static void deparseFromExprForRel(StringInfo buf, PlannerInfo *root, RelOptInfo *foreignrel, + bool use_alias, Index target_rel, List **params_list); Going beyond 80 character length 3) static void -deparseExplicitTargetList(List *tlist, List **retrieved_attrs, +deparseExplicitTargetList(bool is_returning, + List *tlist, + List **retrieved_attrs, deparse_expr_cxt *context) This looks bit wired to me - as comment for the deparseExplicitTargetList() function name and comments says different then what its trying to do when is_returning flag is passed. Either function comment need to be change or may be separate function fo deparse returning list for join relation? Is it possible to teach deparseReturningList() to deparse the returning list for the joinrel? 4) +if (foreignrel->reloptkind == RELOPT_JOINREL) +{ +List *rlist = NIL; + +/* Create a RETURNING list */ +rlist = make_explicit_returning_list(rtindex, rel, + returningList, + fdw_scan_tlist); + +deparseExplicitReturningList(rlist, retrieved_attrs, ); +} +else +deparseReturningList(buf, root, rtindex, rel, false, + returningList, retrieved_attrs); The code will be more centralized if we add the logic of extracting returninglist for JOINREL inside deparseReturningList() only, isn't it? 5) make_explicit_returning_list() pull the var list from the returningList and build the targetentry for the returning list and at the end rewrites the fdw_scan_tlist. AFAIK, in case of DML - which is going to pushdown to the remote server ideally fdw_scan_tlist should be same as returning list, as final output for the query is query will be RETUNING list only. isn't that true? If yes, then why can't we directly replace the fdw_scan_tlist with the returning list, rather then make_explicit_returning_list()? Also I think rewriting the fdw_scan_tlist should happen into postgres_fdw.c - rather then deparse.c. 6) Test coverage for the returning list is missing. On Fri, Nov 18, 2016 at 8:56 AM, Etsuro Fujitawrote: > On 2016/11/16 16:38, Etsuro Fujita wrote: > >> On 2016/11/16 13:10, Ashutosh Bapat wrote: >> >>> I don't see any reason why DML/UPDATE pushdown should depend upon >>> subquery deparsing or least PHV patch. Combined together they can help >>> in more cases, but without those patches, we will be able to push-down >>> more stuff. Probably, we should just restrict push-down only for the >>> cases when above patches are not needed. That makes reviews easy. Once >>> those patches get committed, we may add more functionality depending >>> upon the status of this patch. Does that make sense? >>> >> > OK, I'll extract from the patch the minimal part that wouldn't depend on >> the two patches. >> > > Here is a patch for that. Todo items are: (1) add more comments and (2) > add more regression tests. I'll do that in the next version. > > Best regards, > Etsuro Fujita > -- Rushabh Lathia
Re: [HACKERS] Push down more full joins in postgres_fdw
The comments should explain why is the assertion true. +/* Shouldn't be NIL */ +Assert(tlist != NIL); +/* Should be same length */ +Assert(list_length(tlist) == list_length(foreignrel->reltarget->exprs)); > > OK, I'd like to propose referencing to foreignrel->reltarget->exprs directly > in deparseRangeTblRef and get_subselect_alias_id, then, which is the same as > what I proposed in the first version of the patch, but I'd also like to > change deparseRangeTblRef to use add_to_flat_tlist, not > make_tlist_from_pathtarget, to create a tlist of the subquery, as you > proposed. There is a already a function to build targetlist for a given relation build_tlist_to_deparse(), which does the same thing as this code for a join or base relation and when there are no local conditions. Why don't we use that function instead of duplicating that logic? If tomorrow, the logic to build the targetlist changes, we will need to duplicate those changes. We should avoid that. +/* Build a tlist from the subquery. */ +tlist = add_to_flat_tlist(tlist, foreignrel->reltarget->exprs); The comment below says "get the aliases", but what the function really returns is the identifiers used for creating aliases. Please correct the comment. +/* + * Get the relation and column alias for a given Var node, which belongs to + * input foreignrel. They are returned in *tabno and *colno respectively. + */ We discussed that we have to deparse and search from the same targetlist. And that the targetlist should be saved in fpinfo, the first time it gets created. But the patch seems to be searching in foreignrel->reltarget->exprs and deparsing from the tlist returned by add_to_flat_tlist(tlist, foreignrel->reltarget->exprs). +foreach(lc, foreignrel->reltarget->exprs) +{ +if (equal(lfirst(lc), (Node *) node)) +{ +*colno = i; +return; +} +i++; +} I guess, the reason why you are doing it this way, is SELECT clause for the outermost query gets deparsed before FROM clause. For later we call deparseRangeTblRef(), which builds the tlist. So, while deparsing SELECT clause, we do not have tlist to build from. In that case, I guess, we have to build the tlist in get_subselect_alias_id() if it's not available and stick it in fpinfo. Subsequent calls to get_subselect_alias_id() should find tlist there. Then in deparseRangeTblRef() assert that there's a tlist in fpinfo and use it to build the SELECT clause of subquery. That way, we don't build tlist unless it's needed and also use the same tlist for all searches. Please use tlist_member() to search into the tlist. The name get_subselect_alias_id() seems to suggest that the function returns identifier for subselect alias, which isn't correct. It returns the alias identifiers for deparsing a Var node. But I guess, we have left this to the committer's judgement. -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Push down more full joins in postgres_fdw
On 2016/11/21 22:02, Ashutosh Bapat wrote: You wrote: Instead we should calculate tlist, the first time we need it and then add it to the fpinfo. I wrote: Having said that, I agree on that point. I'd like to propose (1) adding a new member to fpinfo, to store a list of output Vars from the subquery, and (2) creating a tlist from it in deparseRangeTblRef, then, which would allow us to calculate the tlist only when we need it. The member added to fpinfo would be also useful to address the comments on the DML/UPDATE pushdown patch. See the attached patch in [1]. I named the member a bit differently in that patch, though. Again the list of Vars will be wasted if we don't choose that path for final planning. So, I don't see the point of adding list of Vars there. If you think that the member is useful for DML/UDPATE pushdown, you may want to add it in the other patch. OK, I'd like to propose referencing to foreignrel->reltarget->exprs directly in deparseRangeTblRef and get_subselect_alias_id, then, which is the same as what I proposed in the first version of the patch, but I'd also like to change deparseRangeTblRef to use add_to_flat_tlist, not make_tlist_from_pathtarget, to create a tlist of the subquery, as you proposed. You modified the comments I added to deparseLockingClause into this: /* +* Ignore relation if it appears in a lower subquery. Locking clause +* for such a relation is included in the subquery. +*/ I don't think the second sentence is 100% correct because a locking clause isn't always required for such a relation, so I modified the sentence a bit. I guess, "if required" part was implicit in construct "such a relation". Your version seems to make it explicit. I am fine with it. OK, let's leave that for the committer's judge. Please find attached an updated version of the patch. Best regards, Etsuro Fujita *** a/contrib/postgres_fdw/deparse.c --- b/contrib/postgres_fdw/deparse.c *** *** 109,114 typedef struct deparse_expr_cxt --- 109,116 /* Handy macro to add relation name qualification */ #define ADD_REL_QUALIFIER(buf, varno) \ appendStringInfo((buf), "%s%d.", REL_ALIAS_PREFIX, (varno)) + #define SS_TAB_ALIAS_PREFIX "s" + #define SS_COL_ALIAS_PREFIX "c" /* * Functions to determine whether an expression can be evaluated safely on *** *** 167,172 static void appendConditions(List *exprs, deparse_expr_cxt *context); --- 169,182 static void deparseFromExprForRel(StringInfo buf, PlannerInfo *root, RelOptInfo *joinrel, bool use_alias, List **params_list); static void deparseFromExpr(List *quals, deparse_expr_cxt *context); + static void deparseRangeTblRef(StringInfo buf, PlannerInfo *root, + RelOptInfo *foreignrel, bool make_subquery, + List **params_list); + static void appendSubselectAlias(StringInfo buf, int tabno, int ncols); + static void get_subselect_alias_id(Var *node, RelOptInfo *foreignrel, + int *tabno, int *colno); + static bool is_subquery_var(Var *node, RelOptInfo *foreignrel, int *tabno, + int *colno); static void deparseAggref(Aggref *node, deparse_expr_cxt *context); static void appendGroupByClause(List *tlist, deparse_expr_cxt *context); static void appendAggOrderBy(List *orderList, List *targetList, *** *** 990,1001 deparseSelectSql(List *tlist, List **retrieved_attrs, deparse_expr_cxt *context) */ appendStringInfoString(buf, "SELECT "); if (foreignrel->reloptkind == RELOPT_JOINREL || ! foreignrel->reloptkind == RELOPT_UPPER_REL) ! { ! /* For a join relation use the input tlist */ deparseExplicitTargetList(tlist, retrieved_attrs, context); - } else { /* --- 1000,1015 */ appendStringInfoString(buf, "SELECT "); + /* + * For a join relation or an upper relation, use deparseExplicitTargetList. + * Likewise, for a base relation that is being deparsed as a subquery, in + * which case the caller would have passed non-NIL tlist, use that + * function. Otherwise, use deparseTargetList. + */ if (foreignrel->reloptkind == RELOPT_JOINREL || ! foreignrel->reloptkind == RELOPT_UPPER_REL || ! tlist != NIL) deparseExplicitTargetList(tlist, retrieved_attrs, context); else { /* *** *** 1155,1165 deparseLockingClause(deparse_expr_cxt *context) --- 1169,1187 StringInfo buf = context->buf; PlannerInfo *root = context->root; RelOptInfo *rel = context->scanrel; + PgFdwRelationInfo *fpinfo = (PgFdwRelationInfo *) rel->fdw_private; int relid = -1; while ((relid = bms_next_member(rel->relids, relid)) >= 0) { /* + * Ignore relation if it appears in a lower subquery. Locking clause + * for such a relation, if needed, is included in the subquery. + */ + if (bms_is_member(relid, fpinfo->subquery_rels)) + continue; + + /* * Add FOR UPDATE/SHARE if appropriate. We apply
Re: [HACKERS] Push down more full joins in postgres_fdw
> > Yeah, I modified the patch so, as I thought that would be consistent with > the aggregate pushdown patch. aggregate pushdown needs the tlist to judge the shippability of targetlist. For joins that's not required, so we should defer, if we can. > >>> Instead we should calculate tlist, the >>> first time we need it and then add it to the fpinfo. > > > Having said that, I agree on that point. I'd like to propose (1) adding a > new member to fpinfo, to store a list of output Vars from the subquery, and > (2) creating a tlist from it in deparseRangeTblRef, then, which would allow > us to calculate the tlist only when we need it. The member added to fpinfo > would be also useful to address the comments on the DML/UPDATE pushdown > patch. See the attached patch in [1]. I named the member a bit differently > in that patch, though. Again the list of Vars will be wasted if we don't choose that path for final planning. So, I don't see the point of adding list of Vars there. It looks like we are replacing one list with the other when none of those are useful, if the path doesn't get chosen for the final plan. If you think that the member is useful for DML/UDPATE pushdown, you may want to add it in the other patch. > > You modified the comments I added to deparseLockingClause into this: > > /* > +* Ignore relation if it appears in a lower subquery. Locking clause > +* for such a relation is included in the subquery. > +*/ > > I don't think the second sentence is 100% correct because a locking clause > isn't always required for such a relation, so I modified the sentence a bit. > I guess, "if required" part was implicit in construct "such a relation". Your version seems to make it explicit. I am fine with it. -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Push down more full joins in postgres_fdw
On 2016/11/19 0:16, Ashutosh Bapat wrote: Also another point I guess, this note doesn't add much value in the given context. Probably we should remove it. +* Note: the tlist would have one-to-one correspondence with the +* joining relation's reltarget->exprs because (1) the above test +* on PHVs guarantees that the reltarget->exprs doesn't contain +* any PHVs and (2) the joining relation's local_conds is NIL. +* This allows us to search the targetlist entry matching a given +* Var node from the tlist in get_subselect_alias_id. OK, I removed. On Fri, Nov 18, 2016 at 5:10 PM, Ashutosh Bapatwrote: I wrote: /* * For a join relation or an upper relation, use deparseExplicitTargetList. * Likewise, for a base relation that is being deparsed as a subquery, in * which case the caller would have passed tlist that is non-NIL, use that * function. Otherwise, use deparseTargetList. */ This looks correct. I have modified it to make it simple in the given patch. But, I think we shouldn't refer to a function e.g. deparseExplicitTargetlist() in the comment. Instead we should describe the intent e.g. "deparse SELECT clause from the given targetlist" or "deparse SELECT clause from attr_needed". My taste would be to refer to those functions, because ISTM that makes the explanation more simple and direct. So, I'd like to leave that for the committer's judge. I wrote: Done. I modified the patch as proposed; create the tlist by build_tlist_to_deparse in foreign_join_ok, if needed, and search the tlist by tlist_member. I also added a new member "tlist" to PgFdwRelationInfo to save the tlist created in foreign_join_ok. You wrote: Instead of adding a new member, you might want to reuse grouped_tlist by renaming it. Done. Right now, we are calculating tlist whether or not the ForeignPath emerges as the cheapest path. Yeah, I modified the patch so, as I thought that would be consistent with the aggregate pushdown patch. Instead we should calculate tlist, the first time we need it and then add it to the fpinfo. Having said that, I agree on that point. I'd like to propose (1) adding a new member to fpinfo, to store a list of output Vars from the subquery, and (2) creating a tlist from it in deparseRangeTblRef, then, which would allow us to calculate the tlist only when we need it. The member added to fpinfo would be also useful to address the comments on the DML/UPDATE pushdown patch. See the attached patch in [1]. I named the member a bit differently in that patch, though. You modified the comments I added to deparseLockingClause into this: /* +* Ignore relation if it appears in a lower subquery. Locking clause +* for such a relation is included in the subquery. +*/ I don't think the second sentence is 100% correct because a locking clause isn't always required for such a relation, so I modified the sentence a bit. Best regards, Etsuro Fujita [1] https://www.postgresql.org/message-id/38245b84-fabf-0899-1b24-8f94cdc5900c%40lab.ntt.co.jp *** a/contrib/postgres_fdw/deparse.c --- b/contrib/postgres_fdw/deparse.c *** *** 109,114 typedef struct deparse_expr_cxt --- 109,116 /* Handy macro to add relation name qualification */ #define ADD_REL_QUALIFIER(buf, varno) \ appendStringInfo((buf), "%s%d.", REL_ALIAS_PREFIX, (varno)) + #define SS_TAB_ALIAS_PREFIX "s" + #define SS_COL_ALIAS_PREFIX "c" /* * Functions to determine whether an expression can be evaluated safely on *** *** 167,172 static void appendConditions(List *exprs, deparse_expr_cxt *context); --- 169,182 static void deparseFromExprForRel(StringInfo buf, PlannerInfo *root, RelOptInfo *joinrel, bool use_alias, List **params_list); static void deparseFromExpr(List *quals, deparse_expr_cxt *context); + static void deparseRangeTblRef(StringInfo buf, PlannerInfo *root, + RelOptInfo *foreignrel, bool make_subquery, + List **params_list); + static void appendSubselectAlias(StringInfo buf, int tabno, int ncols); + static void get_subselect_alias_id(Var *node, RelOptInfo *foreignrel, + int *tabno, int *colno); + static bool is_subquery_var(Var *node, RelOptInfo *foreignrel, int *tabno, + int *colno); static void deparseAggref(Aggref *node, deparse_expr_cxt *context); static void appendGroupByClause(List *tlist, deparse_expr_cxt *context); static void appendAggOrderBy(List *orderList, List *targetList, *** *** 990,1001 deparseSelectSql(List *tlist, List **retrieved_attrs, deparse_expr_cxt *context) */ appendStringInfoString(buf, "SELECT "); if (foreignrel->reloptkind == RELOPT_JOINREL || ! foreignrel->reloptkind == RELOPT_UPPER_REL) ! { ! /* For a join relation use the input tlist */ deparseExplicitTargetList(tlist,
Re: [HACKERS] Push down more full joins in postgres_fdw
Also another point I guess, this note doesn't add much value in the given context. Probably we should remove it. +* Note: the tlist would have one-to-one correspondence with the +* joining relation's reltarget->exprs because (1) the above test +* on PHVs guarantees that the reltarget->exprs doesn't contain +* any PHVs and (2) the joining relation's local_conds is NIL. +* This allows us to search the targetlist entry matching a given +* Var node from the tlist in get_subselect_alias_id. On Fri, Nov 18, 2016 at 5:10 PM, Ashutosh Bapatwrote: >> >> /* >> * For a join relation or an upper relation, use >> deparseExplicitTargetList. >> * Likewise, for a base relation that is being deparsed as a subquery, >> in >> * which case the caller would have passed tlist that is non-NIL, use >> that >> * function. Otherwise, use deparseTargetList. >> */ > > This looks correct. I have modified it to make it simple in the given > patch. But, I think we shouldn't refer to a function e.g. > deparseExplicitTargetlist() in the comment. Instead we should describe > the intent e.g. "deparse SELECT clause from the given targetlist" or > "deparse SELECT clause from attr_needed". > >> (3) I don't think we need this in isSubqueryExpr, so I removed it from the patch: + /* Keep compiler happy. */ + return false; >> >> >>> Doesn't that cause compiler warning, saying "non-void function >>> returning nothing" or something like that. Actually, the "if >>> (bms_is_member(node->varno, outerrel->relids))" ends with a "return" >>> always. Hence we don't need to encapsulate the code in "else" block in >>> else { }. It could be taken outside. >> >> >> Yeah, I think so too, but I like the "if () { } else { }" coding. That >> coding can be found in other places in core, eg, >> operator_same_subexprs_lookup. > > OK. > > >> Done. I modified the patch as proposed; create the tlist by build_tlist_to_deparse in foreign_join_ok, if needed, and search the tlist by tlist_member. I also added a new member "tlist" to PgFdwRelationInfo to save the tlist created in foreign_join_ok. >> >> >>> Instead of adding a new member, you might want to reuse grouped_tlist >>> by renaming it. >> >> >> Done. > > Right now, we are calculating tlist whether or not the ForeignPath > emerges as the cheapest path. Instead we should calculate tlist, the > first time we need it and then add it to the fpinfo. > > -- > Best Wishes, > Ashutosh Bapat > EnterpriseDB Corporation > The Postgres Database Company -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Push down more full joins in postgres_fdw
> > /* > * For a join relation or an upper relation, use > deparseExplicitTargetList. > * Likewise, for a base relation that is being deparsed as a subquery, > in > * which case the caller would have passed tlist that is non-NIL, use > that > * function. Otherwise, use deparseTargetList. > */ This looks correct. I have modified it to make it simple in the given patch. But, I think we shouldn't refer to a function e.g. deparseExplicitTargetlist() in the comment. Instead we should describe the intent e.g. "deparse SELECT clause from the given targetlist" or "deparse SELECT clause from attr_needed". > >>> (3) I don't think we need this in isSubqueryExpr, so I removed it from >>> the >>> patch: >>> >>> + /* Keep compiler happy. */ >>> + return false; > > >> Doesn't that cause compiler warning, saying "non-void function >> returning nothing" or something like that. Actually, the "if >> (bms_is_member(node->varno, outerrel->relids))" ends with a "return" >> always. Hence we don't need to encapsulate the code in "else" block in >> else { }. It could be taken outside. > > > Yeah, I think so too, but I like the "if () { } else { }" coding. That > coding can be found in other places in core, eg, > operator_same_subexprs_lookup. OK. > >>> Done. I modified the patch as proposed; create the tlist by >>> build_tlist_to_deparse in foreign_join_ok, if needed, and search the >>> tlist >>> by tlist_member. I also added a new member "tlist" to PgFdwRelationInfo >>> to >>> save the tlist created in foreign_join_ok. > > >> Instead of adding a new member, you might want to reuse grouped_tlist >> by renaming it. > > > Done. Right now, we are calculating tlist whether or not the ForeignPath emerges as the cheapest path. Instead we should calculate tlist, the first time we need it and then add it to the fpinfo. -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company diff --git a/contrib/postgres_fdw/deparse.c b/contrib/postgres_fdw/deparse.c index 66b059a..c230009 100644 --- a/contrib/postgres_fdw/deparse.c +++ b/contrib/postgres_fdw/deparse.c @@ -102,20 +102,22 @@ typedef struct deparse_expr_cxt * foreignrel, when that represents a join or * a base relation. */ StringInfo buf; /* output buffer to append to */ List **params_list; /* exprs that will become remote Params */ } deparse_expr_cxt; #define REL_ALIAS_PREFIX "r" /* Handy macro to add relation name qualification */ #define ADD_REL_QUALIFIER(buf, varno) \ appendStringInfo((buf), "%s%d.", REL_ALIAS_PREFIX, (varno)) +#define SS_TAB_ALIAS_PREFIX "s" +#define SS_COL_ALIAS_PREFIX "c" /* * Functions to determine whether an expression can be evaluated safely on * remote server. */ static bool foreign_expr_walker(Node *node, foreign_glob_cxt *glob_cxt, foreign_loc_cxt *outer_cxt); static char *deparse_type_name(Oid type_oid, int32 typemod); @@ -160,20 +162,28 @@ static void printRemoteParam(int paramindex, Oid paramtype, int32 paramtypmod, static void printRemotePlaceholder(Oid paramtype, int32 paramtypmod, deparse_expr_cxt *context); static void deparseSelectSql(List *tlist, List **retrieved_attrs, deparse_expr_cxt *context); static void deparseLockingClause(deparse_expr_cxt *context); static void appendOrderByClause(List *pathkeys, deparse_expr_cxt *context); static void appendConditions(List *exprs, deparse_expr_cxt *context); static void deparseFromExprForRel(StringInfo buf, PlannerInfo *root, RelOptInfo *joinrel, bool use_alias, List **params_list); static void deparseFromExpr(List *quals, deparse_expr_cxt *context); +static void deparseRangeTblRef(StringInfo buf, PlannerInfo *root, + RelOptInfo *foreignrel, bool make_subquery, + List **params_list); +static void appendSubselectAlias(StringInfo buf, int tabno, int ncols); +static void get_subselect_alias_id(Var *node, RelOptInfo *foreignrel, + int *tabno, int *colno); +static bool is_subquery_var(Var *node, RelOptInfo *foreignrel, int *tabno, + int *colno); static void deparseAggref(Aggref *node, deparse_expr_cxt *context); static void appendGroupByClause(List *tlist, deparse_expr_cxt *context); static void appendAggOrderBy(List *orderList, List *targetList, deparse_expr_cxt *context); static void appendFunctionName(Oid funcid, deparse_expr_cxt *context); static Node *deparseSortGroupClause(Index ref, List *tlist, deparse_expr_cxt *context); /* @@ -854,21 +864,21 @@ List * build_tlist_to_deparse(RelOptInfo *foreignrel) { List *tlist = NIL; PgFdwRelationInfo *fpinfo = (PgFdwRelationInfo *) foreignrel->fdw_private; /* * For an upper relation, we have already built the target list while * checking shippability, so just return that. */ if (foreignrel->reloptkind == RELOPT_UPPER_REL) - return fpinfo->grouped_tlist; + return fpinfo->tlist; /* * We require columns specified in
Re: [HACKERS] Push down more UPDATEs/DELETEs in postgres_fdw
On 2016/11/16 16:38, Etsuro Fujita wrote: On 2016/11/16 13:10, Ashutosh Bapat wrote: I don't see any reason why DML/UPDATE pushdown should depend upon subquery deparsing or least PHV patch. Combined together they can help in more cases, but without those patches, we will be able to push-down more stuff. Probably, we should just restrict push-down only for the cases when above patches are not needed. That makes reviews easy. Once those patches get committed, we may add more functionality depending upon the status of this patch. Does that make sense? OK, I'll extract from the patch the minimal part that wouldn't depend on the two patches. Here is a patch for that. Todo items are: (1) add more comments and (2) add more regression tests. I'll do that in the next version. Best regards, Etsuro Fujita *** a/contrib/postgres_fdw/deparse.c --- b/contrib/postgres_fdw/deparse.c *** *** 130,142 static void deparseTargetList(StringInfo buf, Bitmapset *attrs_used, bool qualify_col, List **retrieved_attrs); ! static void deparseExplicitTargetList(List *tlist, List **retrieved_attrs, deparse_expr_cxt *context); static void deparseReturningList(StringInfo buf, PlannerInfo *root, Index rtindex, Relation rel, bool trig_after_row, List *returningList, List **retrieved_attrs); static void deparseColumnRef(StringInfo buf, int varno, int varattno, PlannerInfo *root, bool qualify_col); static void deparseRelation(StringInfo buf, Relation rel); --- 130,154 Bitmapset *attrs_used, bool qualify_col, List **retrieved_attrs); ! static void deparseExplicitTargetList(bool is_returning, ! List *tlist, ! List **retrieved_attrs, deparse_expr_cxt *context); static void deparseReturningList(StringInfo buf, PlannerInfo *root, Index rtindex, Relation rel, bool trig_after_row, List *returningList, List **retrieved_attrs); + static void deparseExplicitReturningList(List *rlist, + List **retrieved_attrs, + deparse_expr_cxt *context); + static List *make_explicit_returning_list(Index rtindex, Relation rel, + List *returningList, + List **fdw_scan_tlist); + static void pull_up_target_conditions(PlannerInfo *root, RelOptInfo *foreignrel, + Index target_rel, List **target_conds); + static void extract_target_conditions(List **joinclauses, Index target_rel, + List **target_conds); static void deparseColumnRef(StringInfo buf, int varno, int varattno, PlannerInfo *root, bool qualify_col); static void deparseRelation(StringInfo buf, Relation rel); *** *** 164,171 static void deparseSelectSql(List *tlist, List **retrieved_attrs, static void deparseLockingClause(deparse_expr_cxt *context); static void appendOrderByClause(List *pathkeys, deparse_expr_cxt *context); static void appendConditions(List *exprs, deparse_expr_cxt *context); ! static void deparseFromExprForRel(StringInfo buf, PlannerInfo *root, ! RelOptInfo *joinrel, bool use_alias, List **params_list); static void deparseFromExpr(List *quals, deparse_expr_cxt *context); static void deparseAggref(Aggref *node, deparse_expr_cxt *context); static void appendGroupByClause(List *tlist, deparse_expr_cxt *context); --- 176,183 static void deparseLockingClause(deparse_expr_cxt *context); static void appendOrderByClause(List *pathkeys, deparse_expr_cxt *context); static void appendConditions(List *exprs, deparse_expr_cxt *context); ! static void deparseFromExprForRel(StringInfo buf, PlannerInfo *root, RelOptInfo *foreignrel, ! bool use_alias, Index target_rel, List **params_list); static void deparseFromExpr(List *quals, deparse_expr_cxt *context); static void deparseAggref(Aggref *node, deparse_expr_cxt *context); static void appendGroupByClause(List *tlist, deparse_expr_cxt *context); *** *** 994,1000 deparseSelectSql(List *tlist, List **retrieved_attrs, deparse_expr_cxt *context) foreignrel->reloptkind == RELOPT_UPPER_REL) { /* For a join relation use the input tlist */ ! deparseExplicitTargetList(tlist, retrieved_attrs, context); } else { --- 1006,1012 foreignrel->reloptkind == RELOPT_UPPER_REL) { /* For a join relation use the input tlist */ ! deparseExplicitTargetList(false, tlist, retrieved_attrs, context); } else { *** *** 1037,1043 deparseFromExpr(List *quals, deparse_expr_cxt *context) appendStringInfoString(buf, " FROM "); deparseFromExprForRel(buf, context->root, scanrel, (bms_num_members(scanrel->relids) > 1), ! context->params_list); /* Construct WHERE clause */ if (quals != NIL) --- 1049,1055 appendStringInfoString(buf, " FROM "); deparseFromExprForRel(buf, context->root, scanrel, (bms_num_members(scanrel->relids) > 1), ! (Index) 0,
Re: [HACKERS] Push down more full joins in postgres_fdw
On 2016/11/16 18:14, Ashutosh Bapat wrote: (1) You added the following comments to deparseSelectSql: + /* +* For a non-base relation, use the input tlist. If a base relation is +* being deparsed as a subquery, use input tlist, if the caller has passed +* one. The column aliases of the subquery are crafted based on the input +* tlist. If tlist is NIL, there will not be any expression referring to +* any of the columns of the base relation being deparsed. Hence it doesn't +* matter whether the base relation is being deparsed as subquery or not. +*/ The last sentence seems confusing to me. My understanding is: if a base relation has tlist=NIL, then the base relation *must* be deparsed as ordinary, not as a subquery. Maybe I'm missing something, though. (I left that as-is, but I think we need to reword that to be more clear, at least.) Hmm, I agree. I think the comment should just say, "Use tlist to create the SELECT clause if one has been provided. For a base relation with tlist = NIL, check attrs_needed information.". Does that sound good? I don't think that is 100% correct, because (1) tlist can be NIL for a join relation, you pointed out upthread, but we need to use deparseExplicitTargetList, so the first sentence is not completely correct, and (2) we need to check attrs_needed information not only for a baserel but for an otherrel, so the second sentence is not completely correct, either. How about this, instead?: /* * For a join relation or an upper relation, use deparseExplicitTargetList. * Likewise, for a base relation that is being deparsed as a subquery, in * which case the caller would have passed tlist that is non-NIL, use that * function. Otherwise, use deparseTargetList. */ (3) I don't think we need this in isSubqueryExpr, so I removed it from the patch: + /* Keep compiler happy. */ + return false; Doesn't that cause compiler warning, saying "non-void function returning nothing" or something like that. Actually, the "if (bms_is_member(node->varno, outerrel->relids))" ends with a "return" always. Hence we don't need to encapsulate the code in "else" block in else { }. It could be taken outside. Yeah, I think so too, but I like the "if () { } else { }" coding. That coding can be found in other places in core, eg, operator_same_subexprs_lookup. OK, I changed isSubqueryExpr to is_subquery_expr; I kept to refer to expr because I think we would soon extend that function so that it can handle PHVs, as I said upthread. For getSubselectAliasInfo, I changed the name to get_subselect_alias_id, because (1) the word "alias" seems general and (2) the "for_var" part would make the name a bit long. is_subquery_expr(Var *node -- that looks odd. Either it should is_subquery_var(Var * ... OR it should be is_subquery_expr(Expr * ... . I would prefer the first one, since that's what current patch is doing. When we introduce PHVs, we may change it, if required. OK, I used is_subquery_var(). Done. I modified the patch as proposed; create the tlist by build_tlist_to_deparse in foreign_join_ok, if needed, and search the tlist by tlist_member. I also added a new member "tlist" to PgFdwRelationInfo to save the tlist created in foreign_join_ok. Instead of adding a new member, you might want to reuse grouped_tlist by renaming it. Done. Another idea on the "tlist" member would be to save a tlist created for EXPLAIN into that member in estimate_patch_cost_size, so that we don't need to generate the tlist again in postgresGetForeignPlan, when use_remote_estimate=true. But I'd like to leave that for another patch. I think that will happen automatically, while deparsing, whether for EXPLAIN or for actual execution. Really? Anyway, I'd like to leave that as-is. Please find attached a new version of the patch. Best regards, Etsuro Fujita *** a/contrib/postgres_fdw/deparse.c --- b/contrib/postgres_fdw/deparse.c *** *** 109,114 typedef struct deparse_expr_cxt --- 109,116 /* Handy macro to add relation name qualification */ #define ADD_REL_QUALIFIER(buf, varno) \ appendStringInfo((buf), "%s%d.", REL_ALIAS_PREFIX, (varno)) + #define SS_TAB_ALIAS_PREFIX "s" + #define SS_COL_ALIAS_PREFIX "c" /* * Functions to determine whether an expression can be evaluated safely on *** *** 167,172 static void appendConditions(List *exprs, deparse_expr_cxt *context); --- 169,180 static void deparseFromExprForRel(StringInfo buf, PlannerInfo *root, RelOptInfo *joinrel, bool use_alias, List **params_list); static void deparseFromExpr(List *quals, deparse_expr_cxt *context); + static void deparseRangeTblRef(StringInfo buf, PlannerInfo *root, RelOptInfo *foreignrel, + bool make_subquery, List **params_list); + static void appendSubselectAlias(StringInfo buf, int tabno, int ncols); + static void
Re: [HACKERS] Push down more full joins in postgres_fdw
Thanks. > except for few things; (1) You added the > following comments to deparseSelectSql: > > + /* > +* For a non-base relation, use the input tlist. If a base relation > is > +* being deparsed as a subquery, use input tlist, if the caller has > passed > +* one. The column aliases of the subquery are crafted based on the > input > +* tlist. If tlist is NIL, there will not be any expression > referring to > +* any of the columns of the base relation being deparsed. Hence it > doesn't > +* matter whether the base relation is being deparsed as subquery or > not. > +*/ > > The last sentence seems confusing to me. My understanding is: if a base > relation has tlist=NIL, then the base relation *must* be deparsed as > ordinary, not as a subquery. Maybe I'm missing something, though. (I left > that as-is, but I think we need to reword that to be more clear, at least.) > Hmm, I agree. I think the comment should just say, "Use tlist to create the SELECT clause if one has been provided. For a base relation with tlist = NIL, check attrs_needed information.". Does that sound good? > (2) You added the following comments to deparseRangeTblRef: > >> + * If make_subquery is true, deparse the relation as a subquery. >> Otherwise, >> + * deparse it as relation name with alias. > > The second sentence seems confusing to me, too, because it looks like the > relation being deparsed is assumed to be a base relation, but the relation > can be a join relation, which might join base relations, lower join > relations, and/or lower subqueries. So, I modified the sentence a bit. > OK. > (3) I don't think we need this in isSubqueryExpr, so I removed it from the > patch: > > + /* Keep compiler happy. */ > + return false; > Doesn't that cause compiler warning, saying "non-void function returning nothing" or something like that. Actually, the "if (bms_is_member(node->varno, outerrel->relids))" ends with a "return" always. Hence we don't need to encapsulate the code in "else" block in else { }. It could be taken outside. > > > OK, I changed isSubqueryExpr to is_subquery_expr; I kept to refer to expr > because I think we would soon extend that function so that it can handle > PHVs, as I said upthread. For getSubselectAliasInfo, I changed the name to > get_subselect_alias_id, because (1) the word "alias" seems general and (2) > the "for_var" part would make the name a bit long. is_subquery_expr(Var *node -- that looks odd. Either it should is_subquery_var(Var * ... OR it should be is_subquery_expr(Expr * ... . I would prefer the first one, since that's what current patch is doing. When we introduce PHVs, we may change it, if required. > > > Done. I modified the patch as proposed; create the tlist by > build_tlist_to_deparse in foreign_join_ok, if needed, and search the tlist > by tlist_member. I also added a new member "tlist" to PgFdwRelationInfo to > save the tlist created in foreign_join_ok. > Instead of adding a new member, you might want to reuse grouped_tlist by renaming it. > Another idea on the "tlist" member would be to save a tlist created for > EXPLAIN into that member in estimate_patch_cost_size, so that we don't need > to generate the tlist again in postgresGetForeignPlan, when > use_remote_estimate=true. But I'd like to leave that for another patch. I think that will happen automatically, while deparsing, whether for EXPLAIN or for actual execution. -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Push down more UPDATEs/DELETEs in postgres_fdw
On 2016/11/16 13:10, Ashutosh Bapat wrote: On Wed, Nov 16, 2016 at 8:25 AM, Etsuro Fujitawrote: On 2016/11/15 19:04, Rushabh Lathia wrote: Your latest patch doesn't not get apply cleanly apply on master branch. Did you apply the patch set in [1] (postgres-fdw-subquery-support-v4.patch and postgres-fdw-phv-pushdown-v4.patch in this order) before applying the latest patch? I don't see any reason why DML/UPDATE pushdown should depend upon subquery deparsing or least PHV patch. Combined together they can help in more cases, but without those patches, we will be able to push-down more stuff. Probably, we should just restrict push-down only for the cases when above patches are not needed. That makes reviews easy. Once those patches get committed, we may add more functionality depending upon the status of this patch. Does that make sense? OK, I'll extract from the patch the minimal part that wouldn't depend on the two patches. Best regards, Etsuro Fujita -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Push down more UPDATEs/DELETEs in postgres_fdw
On Wed, Nov 16, 2016 at 8:25 AM, Etsuro Fujitawrote: > On 2016/11/15 19:04, Rushabh Lathia wrote: >> >> Thanks Fujita-san for working on this. I've signed up to review this >> patch. > > > Thank you for reviewing the patch! > >> Your latest patch doesn't not get apply cleanly apply on master branch. > > > Did you apply the patch set in [1] (postgres-fdw-subquery-support-v4.patch > and postgres-fdw-phv-pushdown-v4.patch in this order) before applying the > latest patch? > I don't see any reason why DML/UPDATE pushdown should depend upon subquery deparsing or least PHV patch. Combined together they can help in more cases, but without those patches, we will be able to push-down more stuff. Probably, we should just restrict push-down only for the cases when above patches are not needed. That makes reviews easy. Once those patches get committed, we may add more functionality depending upon the status of this patch. Does that make sense? -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Push down more UPDATEs/DELETEs in postgres_fdw
On 2016/11/15 19:04, Rushabh Lathia wrote: Thanks Fujita-san for working on this. I've signed up to review this patch. Thank you for reviewing the patch! Your latest patch doesn't not get apply cleanly apply on master branch. Did you apply the patch set in [1] (postgres-fdw-subquery-support-v4.patch and postgres-fdw-phv-pushdown-v4.patch in this order) before applying the latest patch? Best regards, Etsuro Fujita [1] https://www.postgresql.org/message-id/11eafd10-d3f8-ac8a-b642-b0e65037c76b%40lab.ntt.co.jp -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Push down more full joins in postgres_fdw
On 2016/11/11 20:50, Etsuro Fujita wrote: On 2016/11/11 19:22, Ashutosh Bapat wrote: The patch looks in good shape now. Here are some comments. I have also made several changes to comments correcting grammar, typos, style and at few places logic. Let me know if the patch looks good. OK, will look into that. Done. +1 for the changes you made, except for few things; (1) You added the following comments to deparseSelectSql: + /* +* For a non-base relation, use the input tlist. If a base relation is +* being deparsed as a subquery, use input tlist, if the caller has passed +* one. The column aliases of the subquery are crafted based on the input +* tlist. If tlist is NIL, there will not be any expression referring to +* any of the columns of the base relation being deparsed. Hence it doesn't +* matter whether the base relation is being deparsed as subquery or not. +*/ The last sentence seems confusing to me. My understanding is: if a base relation has tlist=NIL, then the base relation *must* be deparsed as ordinary, not as a subquery. Maybe I'm missing something, though. (I left that as-is, but I think we need to reword that to be more clear, at least.) (2) You added the following comments to deparseRangeTblRef: > + * If make_subquery is true, deparse the relation as a subquery. Otherwise, > + * deparse it as relation name with alias. The second sentence seems confusing to me, too, because it looks like the relation being deparsed is assumed to be a base relation, but the relation can be a join relation, which might join base relations, lower join relations, and/or lower subqueries. So, I modified the sentence a bit. (3) I don't think we need this in isSubqueryExpr, so I removed it from the patch: + /* Keep compiler happy. */ + return false; Also, I revised comments you added a little bit. I guess, below code + if (!fpinfo->subquery_rels) + return false; can be changed to if (!bms_is_member(node->varno, fpinfo->subquery_rels)) return false; Also the return values from the recursive calls to isSubqueryExpr() can be returned as is. I have included this change in the patch. Will look into that too. Done. That's a good idea! deparse.c seems to be using capitalized names for function which actually deparse something and an non-capitalized form for helper functions. From that perspective attached patch renames isSubqueryExpr as is_subquery_var() and getSubselectAliasInfo() as get_alias_id_for_var(). Actually both these functions accept a Var node but somehow their names refer to expr. OK, I changed isSubqueryExpr to is_subquery_expr; I kept to refer to expr because I think we would soon extend that function so that it can handle PHVs, as I said upthread. For getSubselectAliasInfo, I changed the name to get_subselect_alias_id, because (1) the word "alias" seems general and (2) the "for_var" part would make the name a bit long. This patch is using make_tlist_from_pathtarget() to create tlist to be deparsed but search in RelOptInfo::reltarget::exprs for a given Var. As long as the relations deparsed do not carry expressions, this might work, but it will certainly break once we start deparsing relations with expressions since the upper most query's tlist contains only Vars. Instead, we should probably, create tlist and save it in fpinfo and use it later for searching (tlist_member()?). Possibly use using build_tlist_to_deparse(), to create the tlist similar so that targetlist list creation logic is same for all the relations being deparsed. I haven't included this change in the patch. Seems like a good idea. Will revise. Done. I modified the patch as proposed; create the tlist by build_tlist_to_deparse in foreign_join_ok, if needed, and search the tlist by tlist_member. I also added a new member "tlist" to PgFdwRelationInfo to save the tlist created in foreign_join_ok. Another idea on the "tlist" member would be to save a tlist created for EXPLAIN into that member in estimate_patch_cost_size, so that we don't need to generate the tlist again in postgresGetForeignPlan, when use_remote_estimate=true. But I'd like to leave that for another patch. Please find attached an updated version of the patch. Best regards, Etsuro Fujita *** a/contrib/postgres_fdw/deparse.c --- b/contrib/postgres_fdw/deparse.c *** *** 109,114 typedef struct deparse_expr_cxt --- 109,116 /* Handy macro to add relation name qualification */ #define ADD_REL_QUALIFIER(buf, varno) \ appendStringInfo((buf), "%s%d.", REL_ALIAS_PREFIX, (varno)) + #define SS_TAB_ALIAS_PREFIX "s" + #define SS_COL_ALIAS_PREFIX "c" /* * Functions to determine whether an expression can be evaluated safely on *** *** 167,172 static void appendConditions(List *exprs, deparse_expr_cxt *context); --- 169,180 static void
Re: [HACKERS] Push down more UPDATEs/DELETEs in postgres_fdw
Thanks Fujita-san for working on this. I've signed up to review this patch. Your latest patch doesn't not get apply cleanly apply on master branch. patching file contrib/postgres_fdw/deparse.c 6 out of 17 hunks FAILED -- saving rejects to file contrib/postgres_fdw/deparse.c.rej patching file contrib/postgres_fdw/expected/postgres_fdw.out 2 out of 2 hunks FAILED -- saving rejects to file contrib/postgres_fdw/expected/postgres_fdw.out.rej patching file contrib/postgres_fdw/postgres_fdw.c 2 out of 22 hunks FAILED -- saving rejects to file contrib/postgres_fdw/postgres_fdw.c.rej patching file contrib/postgres_fdw/postgres_fdw.h 1 out of 3 hunks FAILED -- saving rejects to file contrib/postgres_fdw/postgres_fdw.h.rej Please share the patch which get apply clean on master branch. On Fri, Nov 11, 2016 at 5:00 PM, Etsuro Fujitawrote: > On 2016/09/08 19:55, Etsuro Fujita wrote: > >> On 2016/09/07 13:21, Ashutosh Bapat wrote: >> >>> * with the patch: >>> postgres=# explain verbose delete from ft1 using ft2 where ft1.a = >>> ft2.a; >>> QUERY PLAN >>> >>> >>> - >>> >>> Delete on public.ft1 (cost=100.00..102.04 rows=1 width=38) >>>-> Foreign Delete (cost=100.00..102.04 rows=1 width=38) >>> Remote SQL: DELETE FROM public.t1 r1 USING (SELECT ROW(a, >>> b), a FROM public.t2) ss1(c1, c2) WHERE ((r1.a = ss1.c2)) >>> (3 rows) >>> >> >> The underlying scan on t2 requires ROW(a,b) for locking the row for >>> update/share. But clearly it's not required if the full query is being >>> pushed down. >>> >> > Is there a way we can detect that ROW(a,b) is useless >>> column (not used anywhere in the other parts of the query like >>> RETURNING, DELETE clause etc.) and eliminate it? >>> >> > I don't have a clear solution for that yet, but I'll try to remove that >> in the next version. >> > > Similarly for a, it's >>> part of the targetlist of the underlying scan so that the WHERE clause >>> can be applied on it. But it's not needed if we are pushing down the >>> query. If we eliminate the targetlist of the query, we could construct a >>> remote query without having subquery in it, making it more readable. >>> >> > Will try to do so also. >> > > I addressed this by improving the deparse logic so that a remote query for > performing an UPDATE/DELETE on a join directly on the remote can be created > as proposed if possible. Attached is an updated version of the patch, > which is created on top of the patch set [1]. The patch is still WIP (ie, > needs more comments and regression tests, at least), but any comments would > be gratefully appreciated. > > Best regards, > Etsuro Fujita > > [1] https://www.postgresql.org/message-id/11eafd10-d3f8-ac8a-b64 > 2-b0e65037c76b%40lab.ntt.co.jp > > > -- > Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-hackers > > -- Rushabh Lathia
Re: [HACKERS] Push down more full joins in postgres_fdw
On 2016/11/11 19:22, Ashutosh Bapat wrote: The patch looks in good shape now. Thanks for the review! The patch looks in good shape now. Here are some comments. I have also made several changes to comments correcting grammar, typos, style and at few places logic. Let me know if the patch looks good. OK, will look into that. I guess, below code + if (!fpinfo->subquery_rels) + return false; can be changed to if (!bms_is_member(node->varno, fpinfo->subquery_rels)) return false; Also the return values from the recursive calls to isSubqueryExpr() can be returned as is. I have included this change in the patch. Will look into that too. deparse.c seems to be using capitalized names for function which actually deparse something and an non-capitalized form for helper functions. That's not true. There is a function named classifyConditions(). The function naming in deparse.c is a bit arbitrary. From that perspective attached patch renames isSubqueryExpr as is_subquery_var() and getSubselectAliasInfo() as get_alias_id_for_var(). Actually both these functions accept a Var node but somehow their names refer to expr. The reason why I named that function isSubqueryExpr is that I think that function would be soon extended so as to handle PHVs. See another patch for evaluating PHVs remotely. This patch is using make_tlist_from_pathtarget() to create tlist to be deparsed but search in RelOptInfo::reltarget::exprs for a given Var. As long as the relations deparsed do not carry expressions, this might work, but it will certainly break once we start deparsing relations with expressions since the upper most query's tlist contains only Vars. Instead, we should probably, create tlist and save it in fpinfo and use it later for searching (tlist_member()?). Possibly use using build_tlist_to_deparse(), to create the tlist similar so that targetlist list creation logic is same for all the relations being deparsed. I haven't included this change in the patch. Seems like a good idea. Will revise. Best regards, Etsuro Fujita -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Push down more UPDATEs/DELETEs in postgres_fdw
On 2016/09/08 19:55, Etsuro Fujita wrote: On 2016/09/07 13:21, Ashutosh Bapat wrote: * with the patch: postgres=# explain verbose delete from ft1 using ft2 where ft1.a = ft2.a; QUERY PLAN - Delete on public.ft1 (cost=100.00..102.04 rows=1 width=38) -> Foreign Delete (cost=100.00..102.04 rows=1 width=38) Remote SQL: DELETE FROM public.t1 r1 USING (SELECT ROW(a, b), a FROM public.t2) ss1(c1, c2) WHERE ((r1.a = ss1.c2)) (3 rows) The underlying scan on t2 requires ROW(a,b) for locking the row for update/share. But clearly it's not required if the full query is being pushed down. Is there a way we can detect that ROW(a,b) is useless column (not used anywhere in the other parts of the query like RETURNING, DELETE clause etc.) and eliminate it? I don't have a clear solution for that yet, but I'll try to remove that in the next version. Similarly for a, it's part of the targetlist of the underlying scan so that the WHERE clause can be applied on it. But it's not needed if we are pushing down the query. If we eliminate the targetlist of the query, we could construct a remote query without having subquery in it, making it more readable. Will try to do so also. I addressed this by improving the deparse logic so that a remote query for performing an UPDATE/DELETE on a join directly on the remote can be created as proposed if possible. Attached is an updated version of the patch, which is created on top of the patch set [1]. The patch is still WIP (ie, needs more comments and regression tests, at least), but any comments would be gratefully appreciated. Best regards, Etsuro Fujita [1] https://www.postgresql.org/message-id/11eafd10-d3f8-ac8a-b642-b0e65037c76b%40lab.ntt.co.jp *** a/contrib/postgres_fdw/deparse.c --- b/contrib/postgres_fdw/deparse.c *** *** 133,145 static void deparseTargetList(StringInfo buf, Bitmapset *attrs_used, bool qualify_col, List **retrieved_attrs); ! static void deparseExplicitTargetList(List *tlist, List **retrieved_attrs, deparse_expr_cxt *context); static void deparseReturningList(StringInfo buf, PlannerInfo *root, Index rtindex, Relation rel, bool trig_after_row, List *returningList, List **retrieved_attrs); static void deparseColumnRef(StringInfo buf, int varno, int varattno, PlannerInfo *root, bool qualify_col); static void deparseRelation(StringInfo buf, Relation rel); --- 133,164 Bitmapset *attrs_used, bool qualify_col, List **retrieved_attrs); ! static void deparseExplicitTargetList(bool is_returning, ! List *tlist, ! List **retrieved_attrs, deparse_expr_cxt *context); static void deparseReturningList(StringInfo buf, PlannerInfo *root, Index rtindex, Relation rel, bool trig_after_row, List *returningList, List **retrieved_attrs); + static void deparseExplicitReturningList(List *rlist, + List **retrieved_attrs, + deparse_expr_cxt *context); + static List *makeExplicitReturningList(Index rtindex, Relation rel, + List *returningList, + List **fdw_scan_tlist, + Relids *wholerows); + static void rewriteFromExprForRel(PlannerInfo *root, RelOptInfo *foreignrel, + Index target_rel, List **target_conds, Relids wholerows); + static List *pullUpTargetConds(RelOptInfo *foreignrel, List *joinclauses, + Index target_rel, List **target_conds); + static bool pullUpSubquery(PlannerInfo *root, RelOptInfo *foreignrel, JoinType jointype, + Index target_rel, List **target_conds, Relids wholerows); + static bool isSimpleSubquery(PlannerInfo *root, RelOptInfo *baserel, JoinType jointype, + Relids wholerows); + static List *getCleanSubqueryExprs(PlannerInfo *root, RelOptInfo *foreignrel, + List *subquery_exprs, Relids wholerows); static void deparseColumnRef(StringInfo buf, int varno, int varattno, PlannerInfo *root, bool qualify_col); static void deparseRelation(StringInfo buf, Relation rel); *** *** 168,178 static void deparseSelectSql(List *tlist, List **retrieved_attrs, static void deparseLockingClause(deparse_expr_cxt *context); static void appendOrderByClause(List *pathkeys, deparse_expr_cxt *context); static void appendConditions(List *exprs, deparse_expr_cxt *context); ! static void deparseFromExprForRel(StringInfo buf, PlannerInfo *root, ! RelOptInfo *joinrel, bool use_alias, List **params_list); static void deparseFromExpr(List *quals, deparse_expr_cxt *context); static void deparseRangeTblRef(StringInfo buf, PlannerInfo *root, RelOptInfo *foreignrel, ! bool make_subquery, List **params_list); static void appendSubselectAlias(StringInfo
Re: [HACKERS] Push down more full joins in postgres_fdw
On Mon, Nov 7, 2016 at 5:50 PM, Etsuro Fujitawrote: > On 2016/11/07 11:24, Etsuro Fujita wrote: >> >> On 2016/11/04 19:55, Etsuro Fujita wrote: >>> >>> Attached is an updated version of the patch. > > >> I noticed that I have included an unrelated regression test in the >> patch. Attached is a patch with the test removed. > > > I noticed that I inadvertently removed some changes from that patch, so I > fixed that. Please find attached an updated version of the patch. I'm also > attaching an updated version of another patch for evaluating PHVs remotely, > which has been created on top of that patch. Changes to the latter: I > revised comments and added a bit more regression tests. The patch looks in good shape now. Here are some comments. I have also made several changes to comments correcting grammar, typos, style and at few places logic. Let me know if the patch looks good. I guess, below code + if (!fpinfo->subquery_rels) + return false; can be changed to if (!bms_is_member(node->varno, fpinfo->subquery_rels)) return false; Also the return values from the recursive calls to isSubqueryExpr() can be returned as is. I have included this change in the patch. deparse.c seems to be using capitalized names for function which actually deparse something and an non-capitalized form for helper functions. From that perspective attached patch renames isSubqueryExpr as is_subquery_var() and getSubselectAliasInfo() as get_alias_id_for_var(). Actually both these functions accept a Var node but somehow their names refer to expr. This patch is using make_tlist_from_pathtarget() to create tlist to be deparsed but search in RelOptInfo::reltarget::exprs for a given Var. As long as the relations deparsed do not carry expressions, this might work, but it will certainly break once we start deparsing relations with expressions since the upper most query's tlist contains only Vars. Instead, we should probably, create tlist and save it in fpinfo and use it later for searching (tlist_member()?). Possibly use using build_tlist_to_deparse(), to create the tlist similar so that targetlist list creation logic is same for all the relations being deparsed. I haven't included this change in the patch. -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company diff --git a/contrib/postgres_fdw/deparse.c b/contrib/postgres_fdw/deparse.c index 66b059a..9946f8b 100644 --- a/contrib/postgres_fdw/deparse.c +++ b/contrib/postgres_fdw/deparse.c @@ -102,20 +102,22 @@ typedef struct deparse_expr_cxt * foreignrel, when that represents a join or * a base relation. */ StringInfo buf; /* output buffer to append to */ List **params_list; /* exprs that will become remote Params */ } deparse_expr_cxt; #define REL_ALIAS_PREFIX "r" /* Handy macro to add relation name qualification */ #define ADD_REL_QUALIFIER(buf, varno) \ appendStringInfo((buf), "%s%d.", REL_ALIAS_PREFIX, (varno)) +#define SS_TAB_ALIAS_PREFIX "s" +#define SS_COL_ALIAS_PREFIX "c" /* * Functions to determine whether an expression can be evaluated safely on * remote server. */ static bool foreign_expr_walker(Node *node, foreign_glob_cxt *glob_cxt, foreign_loc_cxt *outer_cxt); static char *deparse_type_name(Oid type_oid, int32 typemod); @@ -160,20 +162,26 @@ static void printRemoteParam(int paramindex, Oid paramtype, int32 paramtypmod, static void printRemotePlaceholder(Oid paramtype, int32 paramtypmod, deparse_expr_cxt *context); static void deparseSelectSql(List *tlist, List **retrieved_attrs, deparse_expr_cxt *context); static void deparseLockingClause(deparse_expr_cxt *context); static void appendOrderByClause(List *pathkeys, deparse_expr_cxt *context); static void appendConditions(List *exprs, deparse_expr_cxt *context); static void deparseFromExprForRel(StringInfo buf, PlannerInfo *root, RelOptInfo *joinrel, bool use_alias, List **params_list); static void deparseFromExpr(List *quals, deparse_expr_cxt *context); +static void deparseRangeTblRef(StringInfo buf, PlannerInfo *root, RelOptInfo *foreignrel, + bool make_subquery, List **params_list); +static void appendSubselectAlias(StringInfo buf, int tabno, int ncols); +static void getSubselectAliasInfo(Var *node, RelOptInfo *foreignrel, + int *tabno, int *colno); +static bool isSubqueryExpr(Var *node, RelOptInfo *foreignrel, int *tabno, int *colno); static void deparseAggref(Aggref *node, deparse_expr_cxt *context); static void appendGroupByClause(List *tlist, deparse_expr_cxt *context); static void appendAggOrderBy(List *orderList, List *targetList, deparse_expr_cxt *context); static void appendFunctionName(Oid funcid, deparse_expr_cxt *context); static Node *deparseSortGroupClause(Index ref, List *tlist, deparse_expr_cxt *context); /* @@ -983,26 +991,32 @@ deparseSelectSql(List *tlist, List **retrieved_attrs, deparse_expr_cxt
Re: [HACKERS] Push down more full joins in postgres_fdw
On 2016/11/07 11:24, Etsuro Fujita wrote: On 2016/11/04 19:55, Etsuro Fujita wrote: Attached is an updated version of the patch. I noticed that I have included an unrelated regression test in the patch. Attached is a patch with the test removed. I noticed that I inadvertently removed some changes from that patch, so I fixed that. Please find attached an updated version of the patch. I'm also attaching an updated version of another patch for evaluating PHVs remotely, which has been created on top of that patch. Changes to the latter: I revised comments and added a bit more regression tests. Best regards, Etsuro Fujita *** a/contrib/postgres_fdw/deparse.c --- b/contrib/postgres_fdw/deparse.c *** *** 109,114 typedef struct deparse_expr_cxt --- 109,116 /* Handy macro to add relation name qualification */ #define ADD_REL_QUALIFIER(buf, varno) \ appendStringInfo((buf), "%s%d.", REL_ALIAS_PREFIX, (varno)) + #define SS_TAB_ALIAS_PREFIX "s" + #define SS_COL_ALIAS_PREFIX "c" /* * Functions to determine whether an expression can be evaluated safely on *** *** 167,172 static void appendConditions(List *exprs, deparse_expr_cxt *context); --- 169,180 static void deparseFromExprForRel(StringInfo buf, PlannerInfo *root, RelOptInfo *joinrel, bool use_alias, List **params_list); static void deparseFromExpr(List *quals, deparse_expr_cxt *context); + static void deparseRangeTblRef(StringInfo buf, PlannerInfo *root, RelOptInfo *foreignrel, + bool make_subquery, List **params_list); + static void appendSubselectAlias(StringInfo buf, int tabno, int ncols); + static void getSubselectAliasInfo(Var *node, RelOptInfo *foreignrel, + int *tabno, int *colno); + static bool isSubqueryExpr(Var *node, RelOptInfo *foreignrel, int *tabno, int *colno); static void deparseAggref(Aggref *node, deparse_expr_cxt *context); static void appendGroupByClause(List *tlist, deparse_expr_cxt *context); static void appendAggOrderBy(List *orderList, List *targetList, *** *** 990,999 deparseSelectSql(List *tlist, List **retrieved_attrs, deparse_expr_cxt *context) */ appendStringInfoString(buf, "SELECT "); if (foreignrel->reloptkind == RELOPT_JOINREL || ! foreignrel->reloptkind == RELOPT_UPPER_REL) { ! /* For a join relation use the input tlist */ deparseExplicitTargetList(tlist, retrieved_attrs, context); } else --- 998,1013 */ appendStringInfoString(buf, "SELECT "); + /* + * Note: non-NIL tlist might be supplied for a baserel; if the baserel is + * deparsed as a subquery, non-NIL tlist will be supplied for the baserel. + * (See deparseRangeTblRef.) In that case use deparseExplicitTargetList. + */ if (foreignrel->reloptkind == RELOPT_JOINREL || ! foreignrel->reloptkind == RELOPT_UPPER_REL || ! tlist != NIL) { ! /* Use the input tlist */ deparseExplicitTargetList(tlist, retrieved_attrs, context); } else *** *** 1155,1165 deparseLockingClause(deparse_expr_cxt *context) --- 1169,1188 StringInfo buf = context->buf; PlannerInfo *root = context->root; RelOptInfo *rel = context->scanrel; + PgFdwRelationInfo *fpinfo = (PgFdwRelationInfo *) rel->fdw_private; int relid = -1; while ((relid = bms_next_member(rel->relids, relid)) >= 0) { /* + * Ignore relation if it appears in a lower subquery, because in that + * case we would have already considered locking for the relation + * while deparsing the lower subquery. + */ + if (bms_is_member(relid, fpinfo->subquery_rels)) + continue; + + /* * Add FOR UPDATE/SHARE if appropriate. We apply locking during the * initial row fetch, rather than later on as is done for local * tables. The extra roundtrips involved in trying to duplicate the *** *** 1347,1364 deparseFromExprForRel(StringInfo buf, PlannerInfo *root, RelOptInfo *foreignrel, if (foreignrel->reloptkind == RELOPT_JOINREL) { - RelOptInfo *rel_o = fpinfo->outerrel; - RelOptInfo *rel_i = fpinfo->innerrel; StringInfoData join_sql_o; StringInfoData join_sql_i; /* Deparse outer relation */ initStringInfo(_sql_o); ! deparseFromExprForRel(_sql_o, root, rel_o, true, params_list); /* Deparse inner relation */ initStringInfo(_sql_i); ! deparseFromExprForRel(_sql_i, root, rel_i, true, params_list); /* * For a join relation FROM clause entry is deparsed as --- 1370,1391 if (foreignrel->reloptkind == RELOPT_JOINREL) { StringInfoData join_sql_o; StringInfoData join_sql_i; /* Deparse outer relation */ initStringInfo(_sql_o); ! deparseRangeTblRef(_sql_o, root, ! fpinfo->outerrel, ! fpinfo->make_outerrel_subquery, ! params_list); /* Deparse inner relation */ initStringInfo(_sql_i); ! deparseRangeTblRef(_sql_i, root, ! fpinfo->innerrel, !
Re: [HACKERS] Push down more full joins in postgres_fdw
On 2016/11/04 19:55, Etsuro Fujita wrote: Attached is an updated version of the patch. I noticed that I have included an unrelated regression test in the patch. Attached is a patch with the test removed. Best regards, Etsuro Fujita *** a/contrib/postgres_fdw/deparse.c --- b/contrib/postgres_fdw/deparse.c *** *** 109,114 typedef struct deparse_expr_cxt --- 109,116 /* Handy macro to add relation name qualification */ #define ADD_REL_QUALIFIER(buf, varno) \ appendStringInfo((buf), "%s%d.", REL_ALIAS_PREFIX, (varno)) + #define SS_TAB_ALIAS_PREFIX "s" + #define SS_COL_ALIAS_PREFIX "c" /* * Functions to determine whether an expression can be evaluated safely on *** *** 167,172 static void appendConditions(List *exprs, deparse_expr_cxt *context); --- 169,180 static void deparseFromExprForRel(StringInfo buf, PlannerInfo *root, RelOptInfo *joinrel, bool use_alias, List **params_list); static void deparseFromExpr(List *quals, deparse_expr_cxt *context); + static void deparseRangeTblRef(StringInfo buf, PlannerInfo *root, RelOptInfo *foreignrel, + bool make_subquery, List **params_list); + static void appendSubselectAlias(StringInfo buf, int tabno, int ncols); + static void getSubselectAliasInfo(Var *node, RelOptInfo *foreignrel, + int *tabno, int *colno); + static bool isSubqueryExpr(Var *node, RelOptInfo *foreignrel, int *tabno, int *colno); static void deparseAggref(Aggref *node, deparse_expr_cxt *context); static void appendGroupByClause(List *tlist, deparse_expr_cxt *context); static void appendAggOrderBy(List *orderList, List *targetList, *** *** 1155,1165 deparseLockingClause(deparse_expr_cxt *context) --- 1163,1182 StringInfo buf = context->buf; PlannerInfo *root = context->root; RelOptInfo *rel = context->scanrel; + PgFdwRelationInfo *fpinfo = (PgFdwRelationInfo *) rel->fdw_private; int relid = -1; while ((relid = bms_next_member(rel->relids, relid)) >= 0) { /* + * Ignore relation if it appears in a lower subquery, because in that + * case we would have already considered locking for the relation + * while deparsing the lower subquery. + */ + if (bms_is_member(relid, fpinfo->subquery_rels)) + continue; + + /* * Add FOR UPDATE/SHARE if appropriate. We apply locking during the * initial row fetch, rather than later on as is done for local * tables. The extra roundtrips involved in trying to duplicate the *** *** 1347,1364 deparseFromExprForRel(StringInfo buf, PlannerInfo *root, RelOptInfo *foreignrel, if (foreignrel->reloptkind == RELOPT_JOINREL) { - RelOptInfo *rel_o = fpinfo->outerrel; - RelOptInfo *rel_i = fpinfo->innerrel; StringInfoData join_sql_o; StringInfoData join_sql_i; /* Deparse outer relation */ initStringInfo(_sql_o); ! deparseFromExprForRel(_sql_o, root, rel_o, true, params_list); /* Deparse inner relation */ initStringInfo(_sql_i); ! deparseFromExprForRel(_sql_i, root, rel_i, true, params_list); /* * For a join relation FROM clause entry is deparsed as --- 1364,1385 if (foreignrel->reloptkind == RELOPT_JOINREL) { StringInfoData join_sql_o; StringInfoData join_sql_i; /* Deparse outer relation */ initStringInfo(_sql_o); ! deparseRangeTblRef(_sql_o, root, ! fpinfo->outerrel, ! fpinfo->make_outerrel_subquery, ! params_list); /* Deparse inner relation */ initStringInfo(_sql_i); ! deparseRangeTblRef(_sql_i, root, ! fpinfo->innerrel, ! fpinfo->make_innerrel_subquery, ! params_list); /* * For a join relation FROM clause entry is deparsed as *** *** 1414,1419 deparseFromExprForRel(StringInfo buf, PlannerInfo *root, RelOptInfo *foreignrel, --- 1435,1585 } /* + * Append operand relation of foreign join to buf. + */ + static void + deparseRangeTblRef(StringInfo buf, PlannerInfo *root, RelOptInfo *foreignrel, + bool make_subquery, List **params_list) + { + PgFdwRelationInfo *fpinfo = (PgFdwRelationInfo *) foreignrel->fdw_private; + + Assert(foreignrel->reloptkind == RELOPT_BASEREL || + foreignrel->reloptkind == RELOPT_JOINREL); + Assert(fpinfo->local_conds == NIL); + + if (make_subquery) + { + List *tlist; + List *retrieved_attrs; + + tlist = make_tlist_from_pathtarget(foreignrel->reltarget); + Assert(tlist != NIL); + appendStringInfoChar(buf, '('); + deparseSelectStmtForRel(buf, root, foreignrel, tlist, + fpinfo->remote_conds, NIL, + _attrs, params_list); + appendStringInfoChar(buf, ')'); + appendSubselectAlias(buf, fpinfo->relation_index, + list_length(foreignrel->reltarget->exprs)); + } + else + deparseFromExprForRel(buf, root, foreignrel, true, params_list); + } + + /* + * Add a subselect alias to a subquery-in-FROM. + * + * The
Re: [HACKERS] Push down more full joins in postgres_fdw
On 2016/11/04 13:08, Ashutosh Bapat wrote: On Fri, Nov 4, 2016 at 9:19 AM, Etsuro Fujitawrote: On 2016/11/03 18:52, Ashutosh Bapat wrote: I wrote: Here is the updated version, Other than the above issue and the alias issue we discussed, I addressed all your comments except one on testing; I tried to add test cases where the remote query is deparsed as nested subqueries, but I couldn't because IIUC, reduce_outer_joins reduced full joins to inner joins or left joins. No always. It will do so in some cases as explained in the prologue of reduce_outer_joins() * More generally, an outer join can be reduced in strength if there is a * strict qual above it in the qual tree that constrains a Var from the * nullable side of the join to be non-null. (For FULL joins this applies * to each side separately.) Right. Would it be necessary for this patch to include test cases for nested subqueries? A patch should have testcases testing the functionality added. This patch adds functionality to deparse nested subqueries, so there should be testcase to test it. OK, I added such a test case. This patch again has a lot of unrelated changes, esp. in deparseSelectStmtForRel(). What was earlier part of deparseSelectSql() and deparseFromExpr() is now flattened in deparseSelectStmtForRel(), which seems unnecessary. IIUC, I think this change was done to address your comment (see the comment #2 in [1]). Am I missing something? There is some misunderstanding here. That comment basically says, deparseRangeTblRef() duplicates code in deparseSelectStmtForRel(), so we should either remove deparseRangeTblRef() altogether or should keep it to minimal avoiding duplication of code. What might have confused you is the last sentence in that comment "This will also make the current changes to deparseSelectSql unnecessary." By current changes I meant changes to deparseSelectSql() in your patch, not the one that's in the repository. I don't think we should flatten deparseSelectStmtForRel() in this patch. I noticed that I misunderstood your words. Sorry about that. I agree with you, so I removed that change from the patch. Attached is an updated version of the patch. Best regards, Etsuro Fujita *** a/contrib/postgres_fdw/deparse.c --- b/contrib/postgres_fdw/deparse.c *** *** 109,114 typedef struct deparse_expr_cxt --- 109,116 /* Handy macro to add relation name qualification */ #define ADD_REL_QUALIFIER(buf, varno) \ appendStringInfo((buf), "%s%d.", REL_ALIAS_PREFIX, (varno)) + #define SS_TAB_ALIAS_PREFIX "s" + #define SS_COL_ALIAS_PREFIX "c" /* * Functions to determine whether an expression can be evaluated safely on *** *** 167,172 static void appendConditions(List *exprs, deparse_expr_cxt *context); --- 169,180 static void deparseFromExprForRel(StringInfo buf, PlannerInfo *root, RelOptInfo *joinrel, bool use_alias, List **params_list); static void deparseFromExpr(List *quals, deparse_expr_cxt *context); + static void deparseRangeTblRef(StringInfo buf, PlannerInfo *root, RelOptInfo *foreignrel, + bool make_subquery, List **params_list); + static void appendSubselectAlias(StringInfo buf, int tabno, int ncols); + static void getSubselectAliasInfo(Var *node, RelOptInfo *foreignrel, + int *tabno, int *colno); + static bool isSubqueryExpr(Var *node, RelOptInfo *foreignrel, int *tabno, int *colno); static void deparseAggref(Aggref *node, deparse_expr_cxt *context); static void appendGroupByClause(List *tlist, deparse_expr_cxt *context); static void appendAggOrderBy(List *orderList, List *targetList, *** *** 1155,1165 deparseLockingClause(deparse_expr_cxt *context) --- 1163,1182 StringInfo buf = context->buf; PlannerInfo *root = context->root; RelOptInfo *rel = context->scanrel; + PgFdwRelationInfo *fpinfo = (PgFdwRelationInfo *) rel->fdw_private; int relid = -1; while ((relid = bms_next_member(rel->relids, relid)) >= 0) { /* + * Ignore relation if it appears in a lower subquery, because in that + * case we would have already considered locking for the relation + * while deparsing the lower subquery. + */ + if (bms_is_member(relid, fpinfo->subquery_rels)) + continue; + + /* * Add FOR UPDATE/SHARE if appropriate. We apply locking during the * initial row fetch, rather than later on as is done for local * tables. The extra roundtrips involved in trying to duplicate the *** *** 1347,1364 deparseFromExprForRel(StringInfo buf, PlannerInfo *root, RelOptInfo *foreignrel, if (foreignrel->reloptkind == RELOPT_JOINREL) { - RelOptInfo *rel_o = fpinfo->outerrel; - RelOptInfo *rel_i = fpinfo->innerrel; StringInfoData join_sql_o; StringInfoData join_sql_i; /* Deparse outer relation */ initStringInfo(_sql_o); ! deparseFromExprForRel(_sql_o, root, rel_o, true,
Re: [HACKERS] Push down more full joins in postgres_fdw
On Fri, Nov 4, 2016 at 9:19 AM, Etsuro Fujitawrote: > On 2016/11/03 18:52, Ashutosh Bapat wrote: >>> >>> Here is the updated version, which includes the restructuring you >>> proposed. >>> Other than the above issue and the alias issue we discussed, I addressed >>> all >>> your comments except one on testing; I tried to add test cases where the >>> remote query is deparsed as nested subqueries, but I couldn't because >>> IIUC, >>> reduce_outer_joins reduced full joins to inner joins or left joins. > > >> No always. It will do so in some cases as explained in the prologue of >> reduce_outer_joins() >> >> * More generally, an outer join can be reduced in strength if there is a >> * strict qual above it in the qual tree that constrains a Var from the >> * nullable side of the join to be non-null. (For FULL joins this applies >> * to each side separately.) > > > Hmm. Would it be necessary for this patch to include test cases for nested > subqueries? As mentioned in a previous email, those test cases can be added > by the split patch that allow PHVs to be evaluated on the remote side. A patch should have testcases testing the functionality added. This patch adds functionality to deparse nested subqueries, so there should be testcase to test it. If we can not come up with a testcase, then it's very much possible that the code related to that functionality is not needed. PHV is a separate patch and we do not know whether it will be committed or when it will be committed after this patch is committed. It's better to have self-sufficient patch. > >> This patch again has a lot of unrelated changes, esp. in >> deparseSelectStmtForRel(). What was earlier part of deparseSelectSql() >> and deparseFromExpr() is now flattened in deparseSelectStmtForRel(), >> which seems unnecessary. > > > IIUC, I think this change was done to address your comment (see the comment > #2 in [1]). Am I missing something? There is some misunderstanding here. That comment basically says, deparseRangeTblRef() duplicates code in deparseSelectStmtForRel(), so we should either remove deparseRangeTblRef() altogether or should keep it to minimal avoiding duplication of code. What might have confused you is the last sentence in that comment "This will also make the current changes to deparseSelectSql unnecessary." By current changes I meant changes to deparseSelectSql() in your patch, not the one that's in the repository. I don't think we should flatten deparseSelectStmtForRel() in this patch. > >> Or there are changes like >> >> -deparseSelectStmtForRel(StringInfo buf, PlannerInfo *root, RelOptInfo >> *rel, >> - List *tlist, List *remote_conds, List *pathkeys, >> +deparseSelectStmtForRel(StringInfo buf, PlannerInfo *root, >> + RelOptInfo *foreignrel, List *tlist, >> + List *remote_conds, List *pathkeys, >> >> which arise because rel has been renamed as foreignrel. The patch will >> work even without such renaming. > > > I did that because we have a Relation named rel (the same name!) within that > function, to execute deparseTargetList in a base-relation case. That's because you have flattened deparseSelectStmtForRel(). If we don't flatten it out, the variable name conflict doesn't arise. > Another > reason for that is because (1) that would match with other function > definitions in deparse.c and (2) that would be consistent with the existing > function definition for that function in postgres_fdw.h. > That would be a separate patch, I guess. -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Push down more full joins in postgres_fdw
On 2016/11/03 18:52, Ashutosh Bapat wrote: Here is the updated version, which includes the restructuring you proposed. Other than the above issue and the alias issue we discussed, I addressed all your comments except one on testing; I tried to add test cases where the remote query is deparsed as nested subqueries, but I couldn't because IIUC, reduce_outer_joins reduced full joins to inner joins or left joins. No always. It will do so in some cases as explained in the prologue of reduce_outer_joins() * More generally, an outer join can be reduced in strength if there is a * strict qual above it in the qual tree that constrains a Var from the * nullable side of the join to be non-null. (For FULL joins this applies * to each side separately.) Hmm. Would it be necessary for this patch to include test cases for nested subqueries? As mentioned in a previous email, those test cases can be added by the split patch that allow PHVs to be evaluated on the remote side. This patch again has a lot of unrelated changes, esp. in deparseSelectStmtForRel(). What was earlier part of deparseSelectSql() and deparseFromExpr() is now flattened in deparseSelectStmtForRel(), which seems unnecessary. IIUC, I think this change was done to address your comment (see the comment #2 in [1]). Am I missing something? Or there are changes like -deparseSelectStmtForRel(StringInfo buf, PlannerInfo *root, RelOptInfo *rel, - List *tlist, List *remote_conds, List *pathkeys, +deparseSelectStmtForRel(StringInfo buf, PlannerInfo *root, + RelOptInfo *foreignrel, List *tlist, + List *remote_conds, List *pathkeys, which arise because rel has been renamed as foreignrel. The patch will work even without such renaming. I did that because we have a Relation named rel (the same name!) within that function, to execute deparseTargetList in a base-relation case. Another reason for that is because (1) that would match with other function definitions in deparse.c and (2) that would be consistent with the existing function definition for that function in postgres_fdw.h. Best regards, Etsuro Fujita [1] https://www.postgresql.org/message-id/CAFjFpRfwoSsJr9418b2jA7g0nwagjZSWhPQPUmK9M6z5XSOAqQ%40mail.gmail.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Push down more full joins in postgres_fdw
> Here is the updated version, which includes the restructuring you proposed. > Other than the above issue and the alias issue we discussed, I addressed all > your comments except one on testing; I tried to add test cases where the > remote query is deparsed as nested subqueries, but I couldn't because IIUC, > reduce_outer_joins reduced full joins to inner joins or left joins. No always. It will do so in some cases as explained in the prologue of reduce_outer_joins() * More generally, an outer join can be reduced in strength if there is a * strict qual above it in the qual tree that constrains a Var from the * nullable side of the join to be non-null. (For FULL joins this applies * to each side separately.) > So, I > added two test cases: (1) the joining relations are both base relations > (actually, we already have that) and (2) one of the joining relations is a > base relation and the other is a join relation. I rebased the patch to > HEAD, so I added a test case with aggregate pushdown also. > This patch again has a lot of unrelated changes, esp. in deparseSelectStmtForRel(). What was earlier part of deparseSelectSql() and deparseFromExpr() is now flattened in deparseSelectStmtForRel(), which seems unnecessary. Or there are changes like -deparseSelectStmtForRel(StringInfo buf, PlannerInfo *root, RelOptInfo *rel, - List *tlist, List *remote_conds, List *pathkeys, +deparseSelectStmtForRel(StringInfo buf, PlannerInfo *root, + RelOptInfo *foreignrel, List *tlist, + List *remote_conds, List *pathkeys, which arise because rel has been renamed as foreignrel. The patch will work even without such renaming. I think such refactoring, should be done in a separate patch. -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Push down more full joins in postgres_fdw
On 2016/10/27 20:41, Etsuro Fujita wrote: Here is the updated version, which includes the restructuring you proposed. Other than the above issue and the alias issue we discussed, I addressed all your comments except one on testing; I tried to add test cases where the remote query is deparsed as nested subqueries, but I couldn't because IIUC, reduce_outer_joins reduced full joins to inner joins or left joins. So, I added two test cases: (1) the joining relations are both base relations (actually, we already have that) and (2) one of the joining relations is a base relation and the other is a join relation. I rebased the patch to HEAD, so I added a test case with aggregate pushdown also. I've updated another patch for evaluating PHVs remotely. Attached is an updated version of the patch, which is created on top of the patch for supporting deparsing subqueries. You can find test cases where a remote join query is deparsed as nested subqueries. Best regards, Etsuro Fujita *** a/contrib/postgres_fdw/deparse.c --- b/contrib/postgres_fdw/deparse.c *** *** 49,54 --- 49,55 #include "nodes/nodeFuncs.h" #include "nodes/plannodes.h" #include "optimizer/clauses.h" + #include "optimizer/placeholder.h" #include "optimizer/prep.h" #include "optimizer/tlist.h" #include "optimizer/var.h" *** *** 157,162 static void deparseRelabelType(RelabelType *node, deparse_expr_cxt *context); --- 158,164 static void deparseBoolExpr(BoolExpr *node, deparse_expr_cxt *context); static void deparseNullTest(NullTest *node, deparse_expr_cxt *context); static void deparseArrayExpr(ArrayExpr *node, deparse_expr_cxt *context); + static void deparseExprInPlaceHolderVar(PlaceHolderVar *node, deparse_expr_cxt *context); static void printRemoteParam(int paramindex, Oid paramtype, int32 paramtypmod, deparse_expr_cxt *context); static void printRemotePlaceholder(Oid paramtype, int32 paramtypmod, *** *** 169,177 static void deparseFromExprForRel(StringInfo buf, PlannerInfo *root, static void deparseRangeTblRef(StringInfo buf, PlannerInfo *root, RelOptInfo *foreignrel, bool make_subquery, List **params_list); static void appendSubselectAlias(StringInfo buf, int tabno, int ncols); ! static void getSubselectAliasInfo(Var *node, RelOptInfo *foreignrel, int *tabno, int *colno); ! static bool isSubqueryExpr(Var *node, RelOptInfo *foreignrel, int *tabno, int *colno); static void deparseAggref(Aggref *node, deparse_expr_cxt *context); static void appendGroupByClause(List *tlist, deparse_expr_cxt *context); static void appendAggOrderBy(List *orderList, List *targetList, --- 171,180 static void deparseRangeTblRef(StringInfo buf, PlannerInfo *root, RelOptInfo *foreignrel, bool make_subquery, List **params_list); static void appendSubselectAlias(StringInfo buf, int tabno, int ncols); ! static void getSubselectAliasInfo(Expr *node, RelOptInfo *foreignrel, int *tabno, int *colno); ! static bool isSubqueryExpr(Expr *node, PlannerInfo *root, RelOptInfo *foreignrel, ! int *tabno, int *colno); static void deparseAggref(Aggref *node, deparse_expr_cxt *context); static void appendGroupByClause(List *tlist, deparse_expr_cxt *context); static void appendAggOrderBy(List *orderList, List *targetList, *** *** 762,767 foreign_expr_walker(Node *node, --- 765,789 state = FDW_COLLATE_UNSAFE; } break; + case T_PlaceHolderVar: + { + PlaceHolderVar *phv = (PlaceHolderVar *) node; + PlaceHolderInfo *phinfo = find_placeholder_info(glob_cxt->root, + phv, false); + + /* + * If the PHV's contained expression is computable on the + * remote server, we consider the PHV safe to send. + */ + if (phv->phlevelsup == 0 && + bms_is_subset(phinfo->ph_eval_at, + glob_cxt->foreignrel->relids)) + return foreign_expr_walker((Node *) phv->phexpr, + glob_cxt, outer_cxt); + else + return false; + } + break; default: /* *** *** 856,862 deparse_type_name(Oid type_oid, int32 typemod) * foreign server. */ List * ! build_tlist_to_deparse(RelOptInfo *foreignrel) { List *tlist = NIL; PgFdwRelationInfo *fpinfo = (PgFdwRelationInfo *) foreignrel->fdw_private; --- 878,884 * foreign server. */ List * ! build_tlist_to_deparse(PlannerInfo *root, RelOptInfo *foreignrel) { List *tlist = NIL; PgFdwRelationInfo *fpinfo = (PgFdwRelationInfo *) foreignrel->fdw_private; *** *** 869,883 build_tlist_to_deparse(RelOptInfo *foreignrel) return fpinfo->grouped_tlist; /* ! * We require columns specified in foreignrel->reltarget->exprs and those ! * required for evaluating the local conditions. */ ! tlist = add_to_flat_tlist(tlist, ! pull_var_clause((Node *) foreignrel->reltarget->exprs, !
Re: [HACKERS] Push down more full joins in postgres_fdw
On 2016/10/27 18:06, Ashutosh Bapat wrote: On Thu, Oct 27, 2016 at 12:46 PM, Etsuro Fujitawrote: On 2016/10/26 19:53, Ashutosh Bapat wrote: On Wed, Oct 26, 2016 at 3:35 PM, Etsuro Fujita My concern about your proposal is: it might not be worth complicating the code to solve a problem that is actually not a problem in practice. To me the current code looks complicated esp. because of the recursion involved and usage of out parameters to isSubqueryExpr(). I don't think so. isSubqueryExpr is prety small, written in less than 50 lines, and the code looks rather simple to me. My suggestion goes inline with the current method of deparsing a Var. Yeah, I think your approach makes it easy to search for the alias to a given Var from the array you proposed. I think the complexity of your approach would be in extra work for building and maintaining the array while deparsing the query. I think that would probably need more invasive and much larger changes to the existing code than what I proposed. I think this issue is optional, so I'd like to propose to leave this for the committer's judge. Fine with me. OK Here is the updated version, which includes the restructuring you proposed. Other than the above issue and the alias issue we discussed, I addressed all your comments except one on testing; I tried to add test cases where the remote query is deparsed as nested subqueries, but I couldn't because IIUC, reduce_outer_joins reduced full joins to inner joins or left joins. So, I added two test cases: (1) the joining relations are both base relations (actually, we already have that) and (2) one of the joining relations is a base relation and the other is a join relation. I rebased the patch to HEAD, so I added a test case with aggregate pushdown also. Best regards, Etsuro Fujita *** a/contrib/postgres_fdw/deparse.c --- b/contrib/postgres_fdw/deparse.c *** *** 109,114 typedef struct deparse_expr_cxt --- 109,116 /* Handy macro to add relation name qualification */ #define ADD_REL_QUALIFIER(buf, varno) \ appendStringInfo((buf), "%s%d.", REL_ALIAS_PREFIX, (varno)) + #define SS_TAB_ALIAS_PREFIX "s" + #define SS_COL_ALIAS_PREFIX "c" /* * Functions to determine whether an expression can be evaluated safely on *** *** 159,172 static void printRemoteParam(int paramindex, Oid paramtype, int32 paramtypmod, deparse_expr_cxt *context); static void printRemotePlaceholder(Oid paramtype, int32 paramtypmod, deparse_expr_cxt *context); - static void deparseSelectSql(List *tlist, List **retrieved_attrs, - deparse_expr_cxt *context); static void deparseLockingClause(deparse_expr_cxt *context); static void appendOrderByClause(List *pathkeys, deparse_expr_cxt *context); static void appendConditions(List *exprs, deparse_expr_cxt *context); static void deparseFromExprForRel(StringInfo buf, PlannerInfo *root, RelOptInfo *joinrel, bool use_alias, List **params_list); ! static void deparseFromExpr(List *quals, deparse_expr_cxt *context); static void deparseAggref(Aggref *node, deparse_expr_cxt *context); static void appendGroupByClause(List *tlist, deparse_expr_cxt *context); static void appendAggOrderBy(List *orderList, List *targetList, --- 161,177 deparse_expr_cxt *context); static void printRemotePlaceholder(Oid paramtype, int32 paramtypmod, deparse_expr_cxt *context); static void deparseLockingClause(deparse_expr_cxt *context); static void appendOrderByClause(List *pathkeys, deparse_expr_cxt *context); static void appendConditions(List *exprs, deparse_expr_cxt *context); static void deparseFromExprForRel(StringInfo buf, PlannerInfo *root, RelOptInfo *joinrel, bool use_alias, List **params_list); ! static void deparseRangeTblRef(StringInfo buf, PlannerInfo *root, RelOptInfo *foreignrel, ! bool make_subquery, List **params_list); ! static void appendSubselectAlias(StringInfo buf, int tabno, int ncols); ! static void getSubselectAliasInfo(Var *node, RelOptInfo *foreignrel, ! int *tabno, int *colno); ! static bool isSubqueryExpr(Var *node, RelOptInfo *foreignrel, int *tabno, int *colno); static void deparseAggref(Aggref *node, deparse_expr_cxt *context); static void appendGroupByClause(List *tlist, deparse_expr_cxt *context); static void appendAggOrderBy(List *orderList, List *targetList, *** *** 899,1000 build_tlist_to_deparse(RelOptInfo *foreignrel) * List of columns selected is returned in retrieved_attrs. */ extern void ! deparseSelectStmtForRel(StringInfo buf, PlannerInfo *root, RelOptInfo *rel, ! List *tlist, List *remote_conds, List *pathkeys, List **retrieved_attrs, List **params_list) { deparse_expr_cxt context; ! PgFdwRelationInfo *fpinfo = (PgFdwRelationInfo *) rel->fdw_private; List *quals; /* * We handle relations for foreign tables, joins
Re: [HACKERS] Push down more full joins in postgres_fdw
On Thu, Oct 27, 2016 at 12:46 PM, Etsuro Fujitawrote: > On 2016/10/26 19:53, Ashutosh Bapat wrote: >> >> On Wed, Oct 26, 2016 at 3:35 PM, Etsuro Fujita >> wrote: > > >>> In practice, the search cost would be negligible compared to the cost of >>> explaining/executing the query. >>> >>> My concern about your proposal is: it might not be worth complicating the >>> code to solve a problem that is actually not a problem in practice. > > >> To me the current code looks complicated esp. because of the recursion >> involved and usage of out parameters to isSubqueryExpr(). > > > I don't think so. isSubqueryExpr is prety small, written in less than 50 > lines, and the code looks rather simple to me. > >> My >> suggestion goes inline with the current method of deparsing a Var. > > > Yeah, I think your approach makes it easy to search for the alias to a given > Var from the array you proposed. I think the complexity of your approach > would be in extra work for building and maintaining the array while > deparsing the query. I think that would probably need more invasive and > much larger changes to the existing code than what I proposed. > > I think this issue is optional, so I'd like to propose to leave this for the > committer's judge. > Fine with me. -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Push down more full joins in postgres_fdw
On 2016/10/26 19:53, Ashutosh Bapat wrote: On Wed, Oct 26, 2016 at 3:35 PM, Etsuro Fujitawrote: In practice, the search cost would be negligible compared to the cost of explaining/executing the query. My concern about your proposal is: it might not be worth complicating the code to solve a problem that is actually not a problem in practice. To me the current code looks complicated esp. because of the recursion involved and usage of out parameters to isSubqueryExpr(). I don't think so. isSubqueryExpr is prety small, written in less than 50 lines, and the code looks rather simple to me. My suggestion goes inline with the current method of deparsing a Var. Yeah, I think your approach makes it easy to search for the alias to a given Var from the array you proposed. I think the complexity of your approach would be in extra work for building and maintaining the array while deparsing the query. I think that would probably need more invasive and much larger changes to the existing code than what I proposed. I think this issue is optional, so I'd like to propose to leave this for the committer's judge. Best regards, Etsuro Fujita -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Push down more full joins in postgres_fdw
On Wed, Oct 26, 2016 at 3:35 PM, Etsuro Fujitawrote: > On 2016/10/26 18:25, Ashutosh Bapat wrote: > >> Your patch calls isSubqueryExpr() recursively for every Var in the >> targetlist, which can be many for foreign tables with many columns. >> For every such Var it may need to reach upto the relation which is >> converted into subquery, which can as bad as reaching every base >> relation. So, it looks like the number of recursive calls to >> isSubqueryExpr() is bounded by V * N (i.e. worst case depth of the >> RelOptInfo tree), which can be quite costly. If use_remote_estimates >> is true, we do this for every intermediate relation and for every path >> created. That isn't very performance efficient. > > > In practice, the search cost would be negligible compared to the cost of > explaining/executing the query. > > My concern about your proposal is: it might not be worth complicating the > code to solve a problem that is actually not a problem in practice. > To me the current code looks complicated esp. because of the recursion involved and usage of out parameters to isSubqueryExpr(). My suggestion goes inline with the current method of deparsing a Var. -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Push down more full joins in postgres_fdw
On 2016/10/26 18:25, Ashutosh Bapat wrote: Your patch calls isSubqueryExpr() recursively for every Var in the targetlist, which can be many for foreign tables with many columns. For every such Var it may need to reach upto the relation which is converted into subquery, which can as bad as reaching every base relation. So, it looks like the number of recursive calls to isSubqueryExpr() is bounded by V * N (i.e. worst case depth of the RelOptInfo tree), which can be quite costly. If use_remote_estimates is true, we do this for every intermediate relation and for every path created. That isn't very performance efficient. In practice, the search cost would be negligible compared to the cost of explaining/executing the query. My concern about your proposal is: it might not be worth complicating the code to solve a problem that is actually not a problem in practice. Best regards, Etsuro Fujita -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Push down more full joins in postgres_fdw
> I guess, the arrays need to be computed only once for any relation when the query for that relation is deparsed the first time. > > >>> Does this algorithm extend to the case where we consider paths for every >>> join order? > > >> Yes, if we store the information about which of relations need >> subquery and which don't for every join order. > > > Hmm. Sorry, I'm not so excited about this proposal. I think (1) that is > solving a problem that hasn't been proven to be a problem, (2) that would > complicate the deparser logic, and (3) the cost of creating this array for > each relation by the bottom-up method while deparsing a remote query would > be not small (especially when the query is large), so that might need more > cycles for deparsing the query than what I proposed when > use_remote_estimate=false. So, I'd like to go with what I proposed, at > least as the first cut. The arrays will need to computed only when there is at least one relation required to be deparsed as subquery. Every relation above the relation which is converted into subquery requires the array. Each array will be N * size of pointer long where N is the largest relid covered by that relation. N will vary across the RelOptInfo hierarchy. All that array holds is targetlist pointer which are required to computed anyway. So, the amount of memory is bounded by N * size of pointer * (2N - 1), which is way lesser than what we use in other places. Your patch calls isSubqueryExpr() recursively for every Var in the targetlist, which can be many for foreign tables with many columns. For every such Var it may need to reach upto the relation which is converted into subquery, which can as bad as reaching every base relation. So, it looks like the number of recursive calls to isSubqueryExpr() is bounded by V * N (i.e. worst case depth of the RelOptInfo tree), which can be quite costly. If use_remote_estimates is true, we do this for every intermediate relation and for every path created. That isn't very performance efficient. -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Push down more full joins in postgres_fdw
On 2016/10/26 16:11, Ashutosh Bapat wrote: You wrote: For example, let's assume that a relation (1, 2, 3) is required to be deparsed as subquery for an immediate upper relation (1, 2, 3, 4, 5) (thus the other joining relation being (4,5)). While deparsing for relation (1,2,3,4,5), the context will contain a 5 element array, with positions 1, 2, 3 filled by same targetlist and alias whereas positions 4 and 5 will not be filled as those relations are not deparsed as subqueries. I wrote: Sorry, I don't understand this. In that case, the immediate upper relation (1, 2, 3, 4, 5) would need to fill the targetlist and aliases for *the join relation (1, 2, 3) somewhere*, not the targetlist and aliases for each of the component relations 1, 2, and 3, because the join relation is deparsed as a subquery. Maybe I'm missing something, though. The description above does not specify "targetlist and alias" for each of (1, 2, 3). The array in the context will have positions 1, 2, 3 filled with *same* alias and targetlist which is derived from relation (1, 2, 3). OK Let's assume in relation (1, 2, 3), (1, 3) in turn requires subquery but (2) does not. Thus the context created while deparsing (1, 2, 3) will have a 3 element array with positions 1 and 3 containing the same targetlist and alias, where as position 2 will be empty. When deparsing a Var node with varno = N and varattno = m, if the nth position in the array in the context is empty, that Var node will be deparsed as rN.. What happens when deparsing eg, a Var with varno = 2 at the topmost relation (1, 2, 3, 4, 5)? The second position of the array is empty, but the join relation (1, 2, 3) is deparsed as a subquery, so the Var should be deparsed as an alias of an output column of the subquery at the topmost relation, I think. position 2 will not be empty, it will be filled by the alias and targetlist derived from relation (1, 2, 3). OK But if that position is has alias sZ, then we search for that Var node in the targetlist and if it's found at kth position in the targetlist, we will deparse it as sZ.ck. The search in the targetlist can be performed using tlist_member, and then fetching the position by TargetEntry::resno. This does not require any recursion and thus saves stack space and some CPU cycles required for recursion. Is that true? Yes, unless you explain why is that false. OK, that would be true, I think. I guess, the arrays need to be computed only once for any relation when the query for that relation is deparsed the first time. Does this algorithm extend to the case where we consider paths for every join order? Yes, if we store the information about which of relations need subquery and which don't for every join order. Hmm. Sorry, I'm not so excited about this proposal. I think (1) that is solving a problem that hasn't been proven to be a problem, (2) that would complicate the deparser logic, and (3) the cost of creating this array for each relation by the bottom-up method while deparsing a remote query would be not small (especially when the query is large), so that might need more cycles for deparsing the query than what I proposed when use_remote_estimate=false. So, I'd like to go with what I proposed, at least as the first cut. Best regards, Etsuro Fujita -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Push down more full joins in postgres_fdw
> > >> For every relation that is deparsed as a subquery, we will create a >> separate context. Each such context will have an array described >> above. This array will contain the targetlist and aliases for all the >> relations, covered by that relation, that are required to be deparsed >> as subqueries. These arrays bubble up to topmost relation that is not >> required to be deparsed as subquery. When a relation is required to be >> deparsed as a subquery, its immediate upper relation sets the alias >> and targetlist chosen for that relation at all the indexes >> corresponding to all the base relations covered by that relation. > > > Thanks for the explanation! > >> For >> example, let's assume that a relation (1, 2, 3) is required to be >> deparsed as subquery for an immediate upper relation (1, 2, 3, 4, 5) >> (thus the other joining relation being (4,5)). While deparsing for >> relation (1,2,3,4,5), the context will contain a 5 element array, with >> positions 1, 2, 3 filled by same targetlist and alias whereas >> positions 4 and 5 will not be filled as those relations are not >> deparsed as subqueries. > > > Sorry, I don't understand this. In that case, the immediate upper relation > (1, 2, 3, 4, 5) would need to fill the targetlist and aliases for *the join > relation (1, 2, 3) somewhere*, not the targetlist and aliases for each of > the component relations 1, 2, and 3, because the join relation is deparsed > as a subquery. Maybe I'm missing something, though. The description above does not specify "targetlist and alias" for each of (1, 2, 3). The array in the context will have positions 1, 2, 3 filled with *same* alias and targetlist which is derived from relation (1, 2, 3). > >> Let's assume in relation (1, 2, 3), (1, 3) in >> turn requires subquery but (2) does not. Thus the context created >> while deparsing (1, 2, 3) will have a 3 element array with positions 1 >> and 3 containing the same targetlist and alias, where as position 2 >> will be empty. > > >> When deparsing a Var node with varno = N and varattno = >> m, if the nth position in the array in the context is empty, that Var >> node will be deparsed as rN.. > > > What happens when deparsing eg, a Var with varno = 2 at the topmost relation > (1, 2, 3, 4, 5)? The second position of the array is empty, but the join > relation (1, 2, 3) is deparsed as a subquery, so the Var should be deparsed > as an alias of an output column of the subquery at the topmost relation, I > think. position 2 will not be empty, it will be filled by the alias and targetlist derived from relation (1, 2, 3). > >> But if that position is has >> alias sZ, then we search for that Var node in the targetlist and if >> it's found at kth position in the targetlist, we will deparse it as >> sZ.ck. The search in the targetlist can be performed using >> tlist_member, and then fetching the position by TargetEntry::resno. > > >> This does not require any recursion and thus saves stack space and >> some CPU cycles required for recursion. > > > Is that true? Yes, unless you explain why is that false. > >> some CPU cycles required for recursion. I guess, the arrays need to be >> computed only once for any relation when the query for that relation >> is deparsed the first time. > > > Does this algorithm extend to the case where we consider paths for every > join order? Yes, if we store the information about which of relations need subquery and which don't for every join order. -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Push down more full joins in postgres_fdw
On 2016/10/25 18:58, Ashutosh Bapat wrote: You wrote: 13. The comment below is missing the main purpose i.e. creating a a unique alias, in case the relation gets converted into a subquery. Lowest or highest relid will create a unique alias at given level of join and that would be more future proof. If we decide to consider paths for every join order, following comment will no more be true. +/* + * Set the relation index. This is defined as the position of this + * joinrel in the join_rel_list list plus the length of the rtable list. + * Note that since this joinrel is at the end of the list when we are + * called, we can get the position by list_length. + */ +fpinfo->relation_index = +list_length(root->parse->rtable) + list_length(root->join_rel_list); I wrote: I don't agree on that point. As I said before, the approach using the lowest/highest relid would make a remote query having nested subqueries difficult to read. And even if we decided to consider paths for every join order, we could define the relation_index safely, by modifying postgresGetForeignJoinPaths so that it (1) sets the relation_index the proposed way at the first time it is called and (2) avoids setting the relation_index after the first call, by checking to see fpinfo->relation_index > 0. OK. Although, the alias which contains a number, which apparently doesn't show any relation with the relation (no pun intended :)) being deparsed, won't make much sense in the deparsed query. But I am fine leaving this to the committer's judgement. Fine with me. May be you want to add an assert here making sure that relation_index is set only once. Something like Assert(fpinfo->relation_index == 0) just before setting it. Will do. 15. While deparsing every Var, we are descending the RelOptInfo hierarchy. Right. I think that not only avoids creating redundant data to find the alias of a given Var node but also would be able to find it in optimal cost. Instead of that, we should probably gather all the alias information once and store it in context as an array indexed by relid. While deparsing a Var we can get the targetlist and alias for a given relation by using var->varno as index. And then search for given Var node in the targetlist to deparse the column name by its position in the targetlist. For the relations that are not converted into subqueries, this array will not hold any information and the Var will be deparsed as it's being done today. Sorry, I don't understand this part. How does that work when creating a remote query having nested subqueries? Is that able to search for a given Var node efficiently? For every relation that is deparsed as a subquery, we will create a separate context. Each such context will have an array described above. This array will contain the targetlist and aliases for all the relations, covered by that relation, that are required to be deparsed as subqueries. These arrays bubble up to topmost relation that is not required to be deparsed as subquery. When a relation is required to be deparsed as a subquery, its immediate upper relation sets the alias and targetlist chosen for that relation at all the indexes corresponding to all the base relations covered by that relation. Thanks for the explanation! For example, let's assume that a relation (1, 2, 3) is required to be deparsed as subquery for an immediate upper relation (1, 2, 3, 4, 5) (thus the other joining relation being (4,5)). While deparsing for relation (1,2,3,4,5), the context will contain a 5 element array, with positions 1, 2, 3 filled by same targetlist and alias whereas positions 4 and 5 will not be filled as those relations are not deparsed as subqueries. Sorry, I don't understand this. In that case, the immediate upper relation (1, 2, 3, 4, 5) would need to fill the targetlist and aliases for *the join relation (1, 2, 3) somewhere*, not the targetlist and aliases for each of the component relations 1, 2, and 3, because the join relation is deparsed as a subquery. Maybe I'm missing something, though. Let's assume in relation (1, 2, 3), (1, 3) in turn requires subquery but (2) does not. Thus the context created while deparsing (1, 2, 3) will have a 3 element array with positions 1 and 3 containing the same targetlist and alias, where as position 2 will be empty. When deparsing a Var node with varno = N and varattno = m, if the nth position in the array in the context is empty, that Var node will be deparsed as rN.. What happens when deparsing eg, a Var with varno = 2 at the topmost relation (1, 2, 3, 4, 5)? The second position of the array is empty, but the join relation (1, 2, 3) is deparsed as a subquery, so the Var should be deparsed as an alias of an output column of the subquery at the topmost relation, I think. But if that position is has alias sZ, then we search for that Var node in the targetlist and if it's found at kth position in the targetlist,
Re: [HACKERS] Push down more full joins in postgres_fdw
> >> 13. The comment below is missing the main purpose i.e. creating a a unique >> alias, in case the relation gets converted into a subquery. Lowest or >> highest >> relid will create a unique alias at given level of join and that would be >> more >> future proof. If we decide to consider paths for every join order, >> following >> comment will no more be true. >> +/* >> + * Set the relation index. This is defined as the position of this >> + * joinrel in the join_rel_list list plus the length of the rtable >> list. >> + * Note that since this joinrel is at the end of the list when we are >> + * called, we can get the position by list_length. >> + */ >> +fpinfo->relation_index = >> +list_length(root->parse->rtable) + >> list_length(root->join_rel_list); > > > I don't agree on that point. As I said before, the approach using the > lowest/highest relid would make a remote query having nested subqueries > difficult to read. And even if we decided to consider paths for every join > order, we could define the relation_index safely, by modifying > postgresGetForeignJoinPaths so that it (1) sets the relation_index the > proposed way at the first time it is called and (2) avoids setting the > relation_index after the first call, by checking to see > fpinfo->relation_index > 0. OK. Although, the alias which contains a number, which apparently doesn't show any relation with the relation (no pun intended :)) being deparsed, won't make much sense in the deparsed query. But I am fine leaving this to the committer's judgement. May be you want to add an assert here making sure that relation_index is set only once. Something like Assert(fpinfo->relation_index == 0) just before setting it. > >> 14. The function name talks about non-vars but the Assert few lines below >> asserts that every node there is a Var. Need better naming. >> +reltarget_has_non_vars(RelOptInfo *foreignrel) >> +{ >> +ListCell *lc; >> + >> +foreach(lc, foreignrel->reltarget->exprs) >> +{ >> +Var *var = (Var *) lfirst(lc); >> + >> +Assert(IsA(var, Var)); >> And also an explanation for the claim >> + * Note: currently deparseExplicitTargetList can't properly handle such >> Vars. > > > Will remove this function for the reason described in #12. > >> 15. While deparsing every Var, we are descending the RelOptInfo hierarchy. > > > Right. I think that not only avoids creating redundant data to find the > alias of a given Var node but also would be able to find it in optimal cost. > >> Instead of that, we should probably gather all the alias information once >> and >> store it in context as an array indexed by relid. While deparsing a Var we >> can >> get the targetlist and alias for a given relation by using var->varno as >> index. >> And then search for given Var node in the targetlist to deparse the column >> name >> by its position in the targetlist. For the relations that are not >> converted >> into subqueries, this array will not hold any information and the Var will >> be >> deparsed as it's being done today. > > > Sorry, I don't understand this part. How does that work when creating a > remote query having nested subqueries? Is that able to search for a given > Var node efficiently? For every relation that is deparsed as a subquery, we will create a separate context. Each such context will have an array described above. This array will contain the targetlist and aliases for all the relations, covered by that relation, that are required to be deparsed as subqueries. These arrays bubble up to topmost relation that is not required to be deparsed as subquery. When a relation is required to be deparsed as a subquery, its immediate upper relation sets the alias and targetlist chosen for that relation at all the indexes corresponding to all the base relations covered by that relation. For example, let's assume that a relation (1, 2, 3) is required to be deparsed as subquery for an immediate upper relation (1, 2, 3, 4, 5) (thus the other joining relation being (4,5)). While deparsing for relation (1,2,3,4,5), the context will contain a 5 element array, with positions 1, 2, 3 filled by same targetlist and alias whereas positions 4 and 5 will not be filled as those relations are not deparsed as subqueries. Let's assume in relation (1, 2, 3), (1, 3) in turn requires subquery but (2) does not. Thus the context created while deparsing (1, 2, 3) will have a 3 element array with positions 1 and 3 containing the same targetlist and alias, where as position 2 will be empty. When deparsing a Var node with varno = N and varattno = m, if the nth position in the array in the context is empty, that Var node will be deparsed as rN.. But if that position is has alias sZ, then we search for that Var node in the targetlist and if it's found at kth position in the targetlist, we will deparse it as sZ.ck. The search in the targetlist can be performed using tlist_member, and then fetching the
Re: [HACKERS] Push down more full joins in postgres_fdw
On 2016/10/22 17:19, Ashutosh Bapat wrote: Review for postgres-fdw-more-full-join-pushdown-v6 patch. The patch applies cleanly and regression is clean (make check in regress directory and that in postgres_fdw). Thanks for the review! Here are some comments. 1. Because of the following code change, for a joinrel we might end up using attrs_used, which will be garbage for a join rel. The assumption that tlist can not be NIL for a join relation is wrong. Even for a join relation, tlist can be NULL. Consider query 'explain verbose select count(*) from ft1, ft2' Since count(*) doesn't need any columns, the tlist is NIL. With the patch the server crashes for this query. -if (foreignrel->reloptkind == RELOPT_JOINREL) +/* + * Note: tlist for a base relation might be non-NIL. For example, if the + * base relation is an operand of a foreign join performing a full outer + * join and has non-NIL remote_conds, the base relation will be deparsed + * as a subquery, so the tlist for the base relation could be non-NIL. + */ +if (tlist != NIL) { -/* For a join relation use the input tlist */ We can not decide whether to use deparseExplicitTargetList or deparse it from attrs_used based on the tlist. SELECT lists, whether it's topmost SELECT or a subquery (even on a base relation), for deparsing a JOIN query need to use deparseExplicitTargetList. Will fix. 2. The code in deparseRangeTblRef() dealing with subqueries, is very similar to deparseSelectStmtForRel(). The only thing deparseRangeTblRef() does not require is the appending ORDER BY, which is determined by existence of pathkeys argument. Probably we should reuse deparseSelectStmtForRel(), instead of duplicating the code. This will also make the current changes to deparseSelectSql unnecessary. Will do. 3. Why do we need following change? The elog() just few lines above says that we expect only Var nodes. Why then we use deparseExpr() instead of deparseVar()? if (i > 0) appendStringInfoString(buf, ", "); -deparseVar(var, context); +deparseExpr((Expr *) var, context); *retrieved_attrs = lappend_int(*retrieved_attrs, i + 1); And I get the answer for that question, somewhere down in the patch +/* + * If the given expression is an output column of a subquery-in-FROM, + * deparse the alias to the expression instead. + */ +if (IsA(node, Var)) +{ +inttabno; +intcolno; + +if (isSubqueryExpr(node, context->foreignrel, , )) +{ +appendStringInfo(context->buf, "%s%d.%s%d", + SS_TAB_ALIAS_PREFIX, tabno, + SS_COL_ALIAS_PREFIX, colno); +return; +} +} + Functionally, this code belongs to deparseVar() and not deparseExpr(). Like all other functions which handle Expr nodes, deparseExpr() is expected to be a dispatcher to the node specific function. OK, will change that way. 4. This comment is good to have there, but is unrelated to this patch. May be a separate patch? +/* Don't generate bad syntax if no columns */ 5. Changed comment doesn't say anything different from the original comment. It may have been good to have it this way to start with, but changing it now doesn't bring anything new. You seem to have just merged prologue of the function with a comment in that function. I think, this change is unnecessary. - * The function constructs ... JOIN ... ON ... for join relation. For a base - * relation it just returns schema-qualified tablename, with the appropriate - * alias if so requested. + * For a join relation the clause of the following form is appended to buf: + * ((outer relation) (inner relation) ON (joinclauses)) + * For a base relation the function just adds the schema-qualified tablename, + * with the appropriate alias if so requested. +/* Don't generate bad syntax if no columns */ 6. Change may be good but unrelated to this patch. May be material for a separate patch. There are few such changes in this function. While these changes may be good by themselves, they distract reviewer from the goal of the patch. -appendStringInfo(buf, "("); +appendStringInfoChar(buf, '('); OK on #4, #5, and #6, Will remove all the changes. I will leave those for separate patches. 7. I don't understand why you need to change this function so much. Take for example the following change. -RelOptInfo *rel_o = fpinfo->outerrel; -RelOptInfo *rel_i = fpinfo->innerrel; -StringInfoData join_sql_o; -StringInfoData join_sql_i; +/* Begin the FROM clause entry */ +appendStringInfoChar(buf, '('); /* Deparse outer relation */ -initStringInfo(_sql_o); -deparseFromExprForRel(_sql_o, root, rel_o, true, params_list); +deparseRangeTblRef(buf, root, + fpinfo->outerrel, +
Re: [HACKERS] Push down more full joins in postgres_fdw
> 11. I have reworded following comment and restructured the code that follows > it > in the attached patch. > +/* > + * Set the subquery information. If the relation performs a full outer > + * join and if the input relations have non-NIL remote_conds, the input > + * relations need to be deparsed as a subquery. > + */ > The code esp. the if .. else .. block followed by another one, made it a bit > unreadable. Hence I restructured it so that it's readable and doesn't require > variable "relids" from earlier code, which was being reused. Let me know if > this change looks good. > Sorry, forgot to attach the patch. Here it is. -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company diff --git a/contrib/postgres_fdw/deparse.c b/contrib/postgres_fdw/deparse.c index 691658f..aa1f111 100644 --- a/contrib/postgres_fdw/deparse.c +++ b/contrib/postgres_fdw/deparse.c @@ -95,20 +95,22 @@ typedef struct deparse_expr_cxt PlannerInfo *root; /* global planner state */ RelOptInfo *foreignrel; /* the foreign relation we are planning for */ StringInfo buf; /* output buffer to append to */ List **params_list; /* exprs that will become remote Params */ } deparse_expr_cxt; #define REL_ALIAS_PREFIX "r" /* Handy macro to add relation name qualification */ #define ADD_REL_QUALIFIER(buf, varno) \ appendStringInfo((buf), "%s%d.", REL_ALIAS_PREFIX, (varno)) +#define SS_TAB_ALIAS_PREFIX "s" +#define SS_COL_ALIAS_PREFIX "c" /* * Functions to determine whether an expression can be evaluated safely on * remote server. */ static bool foreign_expr_walker(Node *node, foreign_glob_cxt *glob_cxt, foreign_loc_cxt *outer_cxt); static char *deparse_type_name(Oid type_oid, int32 typemod); @@ -145,27 +147,36 @@ static void deparseDistinctExpr(DistinctExpr *node, deparse_expr_cxt *context); static void deparseScalarArrayOpExpr(ScalarArrayOpExpr *node, deparse_expr_cxt *context); static void deparseRelabelType(RelabelType *node, deparse_expr_cxt *context); static void deparseBoolExpr(BoolExpr *node, deparse_expr_cxt *context); static void deparseNullTest(NullTest *node, deparse_expr_cxt *context); static void deparseArrayExpr(ArrayExpr *node, deparse_expr_cxt *context); static void printRemoteParam(int paramindex, Oid paramtype, int32 paramtypmod, deparse_expr_cxt *context); static void printRemotePlaceholder(Oid paramtype, int32 paramtypmod, deparse_expr_cxt *context); -static void deparseSelectSql(List *tlist, List **retrieved_attrs, +static void deparseSelectSql(List *tlist, + List *remote_conds, + List **retrieved_attrs, deparse_expr_cxt *context); static void deparseLockingClause(deparse_expr_cxt *context); static void appendOrderByClause(List *pathkeys, deparse_expr_cxt *context); static void appendConditions(List *exprs, deparse_expr_cxt *context); static void deparseFromExprForRel(StringInfo buf, PlannerInfo *root, - RelOptInfo *joinrel, bool use_alias, List **params_list); + RelOptInfo *foreignrel, bool add_rel_alias, + List **params_list); +static void deparseRangeTblRef(StringInfo buf, PlannerInfo *root, RelOptInfo *foreignrel, + bool make_subquery, List **params_list); +static void appendSubselectAlias(deparse_expr_cxt *context); +static void getSubselectAliasInfo(Expr *node, RelOptInfo *foreignrel, + int *tabno, int *colno); +static bool isSubqueryExpr(Expr *node, RelOptInfo *foreignrel, int *tabno, int *colno); /* * Examine each qual clause in input_conds, and classify them into two groups, * which are returned as two lists: * - remote_conds contains expressions that can be evaluated remotely * - local_conds contains expressions that can't be evaluated remotely */ void classifyConditions(PlannerInfo *root, @@ -773,95 +784,105 @@ deparseSelectStmtForRel(StringInfo buf, PlannerInfo *root, RelOptInfo *rel, Assert(rel->reloptkind == RELOPT_JOINREL || rel->reloptkind == RELOPT_BASEREL || rel->reloptkind == RELOPT_OTHER_MEMBER_REL); /* Fill portions of context common to join and base relation */ context.buf = buf; context.root = root; context.foreignrel = rel; context.params_list = params_list; - /* Construct SELECT clause and FROM clause */ - deparseSelectSql(tlist, retrieved_attrs, ); - - /* - * Construct WHERE clause - */ - if (remote_conds) - { - appendStringInfo(buf, " WHERE "); - appendConditions(remote_conds, ); - } + /* Construct SELECT clause, FROM clause, and WHERE clause */ + deparseSelectSql(tlist, remote_conds, retrieved_attrs, ); /* Add ORDER BY clause if we found any useful pathkeys */ if (pathkeys) appendOrderByClause(pathkeys, ); /* Add any necessary FOR UPDATE/SHARE. */ deparseLockingClause(); } /* * Construct a simple SELECT statement that retrieves desired columns * of the specified foreign table, and append it to "buf". The output - * contains just "SELECT ... FROM
Re: [HACKERS] Push down more full joins in postgres_fdw
Review for postgres-fdw-more-full-join-pushdown-v6 patch. The patch applies cleanly and regression is clean (make check in regress directory and that in postgres_fdw). Here are some comments. 1. Because of the following code change, for a joinrel we might end up using attrs_used, which will be garbage for a join rel. The assumption that tlist can not be NIL for a join relation is wrong. Even for a join relation, tlist can be NULL. Consider query 'explain verbose select count(*) from ft1, ft2' Since count(*) doesn't need any columns, the tlist is NIL. With the patch the server crashes for this query. -if (foreignrel->reloptkind == RELOPT_JOINREL) +/* + * Note: tlist for a base relation might be non-NIL. For example, if the + * base relation is an operand of a foreign join performing a full outer + * join and has non-NIL remote_conds, the base relation will be deparsed + * as a subquery, so the tlist for the base relation could be non-NIL. + */ +if (tlist != NIL) { -/* For a join relation use the input tlist */ We can not decide whether to use deparseExplicitTargetList or deparse it from attrs_used based on the tlist. SELECT lists, whether it's topmost SELECT or a subquery (even on a base relation), for deparsing a JOIN query need to use deparseExplicitTargetList. 2. The code in deparseRangeTblRef() dealing with subqueries, is very similar to deparseSelectStmtForRel(). The only thing deparseRangeTblRef() does not require is the appending ORDER BY, which is determined by existence of pathkeys argument. Probably we should reuse deparseSelectStmtForRel(), instead of duplicating the code. This will also make the current changes to deparseSelectSql unnecessary. 3. Why do we need following change? The elog() just few lines above says that we expect only Var nodes. Why then we use deparseExpr() instead of deparseVar()? if (i > 0) appendStringInfoString(buf, ", "); -deparseVar(var, context); +deparseExpr((Expr *) var, context); *retrieved_attrs = lappend_int(*retrieved_attrs, i + 1); And I get the answer for that question, somewhere down in the patch +/* + * If the given expression is an output column of a subquery-in-FROM, + * deparse the alias to the expression instead. + */ +if (IsA(node, Var)) +{ +inttabno; +intcolno; + +if (isSubqueryExpr(node, context->foreignrel, , )) +{ +appendStringInfo(context->buf, "%s%d.%s%d", + SS_TAB_ALIAS_PREFIX, tabno, + SS_COL_ALIAS_PREFIX, colno); +return; +} +} + Functionally, this code belongs to deparseVar() and not deparseExpr(). Like all other functions which handle Expr nodes, deparseExpr() is expected to be a dispatcher to the node specific function. 4. This comment is good to have there, but is unrelated to this patch. May be a separate patch? +/* Don't generate bad syntax if no columns */ 5. Changed comment doesn't say anything different from the original comment. It may have been good to have it this way to start with, but changing it now doesn't bring anything new. You seem to have just merged prologue of the function with a comment in that function. I think, this change is unnecessary. - * The function constructs ... JOIN ... ON ... for join relation. For a base - * relation it just returns schema-qualified tablename, with the appropriate - * alias if so requested. + * For a join relation the clause of the following form is appended to buf: + * ((outer relation) (inner relation) ON (joinclauses)) + * For a base relation the function just adds the schema-qualified tablename, + * with the appropriate alias if so requested. +/* Don't generate bad syntax if no columns */ 6. Change may be good but unrelated to this patch. May be material for a separate patch. There are few such changes in this function. While these changes may be good by themselves, they distract reviewer from the goal of the patch. -appendStringInfo(buf, "("); +appendStringInfoChar(buf, '('); 7. I don't understand why you need to change this function so much. Take for example the following change. -RelOptInfo *rel_o = fpinfo->outerrel; -RelOptInfo *rel_i = fpinfo->innerrel; -StringInfoData join_sql_o; -StringInfoData join_sql_i; +/* Begin the FROM clause entry */ +appendStringInfoChar(buf, '('); /* Deparse outer relation */ -initStringInfo(_sql_o); -deparseFromExprForRel(_sql_o, root, rel_o, true, params_list); +deparseRangeTblRef(buf, root, + fpinfo->outerrel, + fpinfo->make_outerrel_subquery, + params_list); /* Deparse outer relation */ -initStringInfo(_sql_o); -deparseFromExprForRel(_sql_o, root, rel_o, true,
Re: [HACKERS] Push down more full joins in postgres_fdw
On 2016/09/28 18:35, Etsuro Fujita wrote: Attached is an updated version of the patch. I found a minor bug in that patch; the relation_index added to PgFdwRelationInfo was defined as Index, but I used the modifier %d to print that. So, I changed the data type to int. Also, I added a bit more comments. Please find attached an updated version of the patch. Best regards, Etsuro Fujita postgres-fdw-more-full-join-pushdown-v5.patch Description: binary/octet-stream -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Push down more full joins in postgres_fdw
On 2016/09/26 16:30, Etsuro Fujita wrote: On 2016/09/13 14:17, Ashutosh Bapat wrote: It won't remain minimal as the number of paths created increases, increasing the number of times a query is deparsed. We deparse query every time time we cost a path for a relation with use_remote_estimates true. As we try to push down more and more stuff, we will create more paths and deparse the query more time. Also, that makes the interface better. Right now, in your patch, you have changed the order of deparsing in the existing code, so that the aliases are registered while deparsing FROM clause and before any Var nodes are deparsed. If we create aliases at the time of path creation, only once in GetForeignJoinPaths or GetForeignPaths as appropriate, that would require less code churn and would save some CPU cycles as well. Agreed. Will fix. Done. Attached is an updated version of the patch. I didn't create aliases at anytime. Instead, I added a logic to get info about the alias to a given expression from reltarget->exprs for relations in a given join tree. See isSubqueryExpr and getSubselectAliasInfo. As proposed by you, the patch differentiates between a base relation alias and a subquery alias by using different prefixes "r" and "s", respectively. Also, subquery aliases are indexed by RTI for baserels and the position in join_rel_list + the length of rtable for joinrels, as proposed upthread. Best regards, Etsuro Fujita *** a/contrib/postgres_fdw/deparse.c --- b/contrib/postgres_fdw/deparse.c *** *** 102,107 typedef struct deparse_expr_cxt --- 102,109 /* Handy macro to add relation name qualification */ #define ADD_REL_QUALIFIER(buf, varno) \ appendStringInfo((buf), "%s%d.", REL_ALIAS_PREFIX, (varno)) + #define SS_TAB_ALIAS_PREFIX "s" + #define SS_COL_ALIAS_PREFIX "c" /* * Functions to determine whether an expression can be evaluated safely on *** *** 152,164 static void printRemoteParam(int paramindex, Oid paramtype, int32 paramtypmod, deparse_expr_cxt *context); static void printRemotePlaceholder(Oid paramtype, int32 paramtypmod, deparse_expr_cxt *context); ! static void deparseSelectSql(List *tlist, List **retrieved_attrs, deparse_expr_cxt *context); static void deparseLockingClause(deparse_expr_cxt *context); static void appendOrderByClause(List *pathkeys, deparse_expr_cxt *context); static void appendConditions(List *exprs, deparse_expr_cxt *context); static void deparseFromExprForRel(StringInfo buf, PlannerInfo *root, ! RelOptInfo *joinrel, bool use_alias, List **params_list); /* --- 154,175 deparse_expr_cxt *context); static void printRemotePlaceholder(Oid paramtype, int32 paramtypmod, deparse_expr_cxt *context); ! static void deparseSelectSql(List *tlist, ! List *remote_conds, ! List **retrieved_attrs, deparse_expr_cxt *context); static void deparseLockingClause(deparse_expr_cxt *context); static void appendOrderByClause(List *pathkeys, deparse_expr_cxt *context); static void appendConditions(List *exprs, deparse_expr_cxt *context); static void deparseFromExprForRel(StringInfo buf, PlannerInfo *root, ! RelOptInfo *foreignrel, bool add_rel_alias, ! List **params_list); ! static void deparseRangeTblRef(StringInfo buf, PlannerInfo *root, RelOptInfo *foreignrel, ! bool is_subquery, List **params_list); ! static void appendSubselectAlias(deparse_expr_cxt *context); ! static void getSubselectAliasInfo(Expr *node, int *tabno, int *colno, ! RelOptInfo *foreignrel); ! static bool isSubqueryExpr(Expr *node, int *tabno, int *colno, RelOptInfo *foreignrel); /* *** *** 780,796 deparseSelectStmtForRel(StringInfo buf, PlannerInfo *root, RelOptInfo *rel, context.foreignrel = rel; context.params_list = params_list; ! /* Construct SELECT clause and FROM clause */ ! deparseSelectSql(tlist, retrieved_attrs, ); ! ! /* ! * Construct WHERE clause ! */ ! if (remote_conds) ! { ! appendStringInfo(buf, " WHERE "); ! appendConditions(remote_conds, ); ! } /* Add ORDER BY clause if we found any useful pathkeys */ if (pathkeys) --- 791,798 context.foreignrel = rel; context.params_list = params_list; ! /* Construct SELECT clause, FROM clause, and WHERE clause */ ! deparseSelectSql(tlist, remote_conds, retrieved_attrs, ); /* Add ORDER BY clause if we found any useful pathkeys */ if (pathkeys) *** *** 803,809 deparseSelectStmtForRel(StringInfo buf, PlannerInfo *root, RelOptInfo *rel, /* * Construct a simple SELECT statement that retrieves desired columns * of the specified foreign table, and append it to "buf". The output ! * contains just "SELECT ... FROM ". * * We also create an integer List of the columns being retrieved, which is * returned to *retrieved_attrs. --- 805,811 /* * Construct a simple
Re: [HACKERS] Push down more full joins in postgres_fdw
On 2016/09/27 13:33, Ashutosh Bapat wrote: I wrote: ISTM that the use of the same RTI for subqueries in multi-levels in a remote SQL makes the SQL a bit difficult to read. How about using the position of the join rel in join_rel_list, (more precisely, the position plus list_length(root->parse->rtable)), instead? I wrote: The join rel is appended to the end of the list, so I was thinking to get the position info by list_length during postgresGetForeignJoinPaths. That's true only when the paths are being added to a newly created joinrel. But that's not true always. We may add paths with different joining order to an existing joinrel, in which case list_length would not give its position. Am I missing something? I think you are right, but postgresGetForeignJoinPaths only allows us to add a foreign join path to a newly created joinrel. The reason is because that routine skips all its work after the first call for that joinrel, by checking to see if joinrel->fdw_private is not NULL. So, I think it's reasonable to get the position by list_length in that routine. Best regards, Etsuro Fujita -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Push down more full joins in postgres_fdw
On Tue, Sep 27, 2016 at 8:48 AM, Etsuro Fujitawrote: > On 2016/09/26 20:20, Ashutosh Bapat wrote: >> >> On Mon, Sep 26, 2016 at 4:06 PM, Etsuro Fujita >> wrote: >>> >>> On 2016/09/26 18:06, Ashutosh Bapat wrote: On Mon, Sep 26, 2016 at 1:05 PM, Etsuro Fujita wrote: > > > ISTM that the use of the same RTI for subqueries in multi-levels in a > remote > SQL makes the SQL a bit difficult to read. How about using the > position > of > the join rel in join_rel_list, (more precisely, the position plus > list_length(root->parse->rtable)), instead? > > We switch to hash table to maintain the join RelOptInfos when the number of joins grows larger, where the position won't make much sense. > > >>> That's right, but we still store the joinrel into join_rel_list after >>> creating that hash table. > > >> As the list grows, it will take >> longer to locate the RelOptInfo of the given join. Doing that for >> creating an alias seems an overkill. > > > The join rel is appended to the end of the list, so I was thinking to get > the position info by list_length during postgresGetForeignJoinPaths. That's true only when the paths are being added to a newly created joinrel. But that's not true always. We may add paths with different joining order to an existing joinrel, in which case list_length would not give its position. Am I missing something? -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Push down more full joins in postgres_fdw
On 2016/09/26 20:20, Ashutosh Bapat wrote: On Mon, Sep 26, 2016 at 4:06 PM, Etsuro Fujitawrote: On 2016/09/26 18:06, Ashutosh Bapat wrote: On Mon, Sep 26, 2016 at 1:05 PM, Etsuro Fujita wrote: ISTM that the use of the same RTI for subqueries in multi-levels in a remote SQL makes the SQL a bit difficult to read. How about using the position of the join rel in join_rel_list, (more precisely, the position plus list_length(root->parse->rtable)), instead? We switch to hash table to maintain the join RelOptInfos when the number of joins grows larger, where the position won't make much sense. That's right, but we still store the joinrel into join_rel_list after creating that hash table. As the list grows, it will take longer to locate the RelOptInfo of the given join. Doing that for creating an alias seems an overkill. The join rel is appended to the end of the list, so I was thinking to get the position info by list_length during postgresGetForeignJoinPaths. Best regards, Etsuro Fujita -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Push down more full joins in postgres_fdw
On Mon, Sep 26, 2016 at 4:06 PM, Etsuro Fujitawrote: > On 2016/09/26 18:06, Ashutosh Bapat wrote: >> >> On Mon, Sep 26, 2016 at 1:05 PM, Etsuro Fujita >> wrote: > > >>> ISTM that the use of the same RTI for subqueries in multi-levels in a >>> remote >>> SQL makes the SQL a bit difficult to read. How about using the position >>> of >>> the join rel in join_rel_list, (more precisely, the position plus >>> list_length(root->parse->rtable)), instead? > > >> We switch to hash table to maintain the join RelOptInfos when the >> number of joins grows larger, where the position won't make much >> sense. > > > That's right, but we still store the joinrel into join_rel_list after > creating that hash table. That hash table is just for fast lookup. See > build_join_rel. Sorry, I wasn't clear in my reply. As the list grows, it will take longer to locate the RelOptInfo of the given join. Doing that for creating an alias seems an overkill. -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Push down more full joins in postgres_fdw
On 2016/09/26 18:06, Ashutosh Bapat wrote: On Mon, Sep 26, 2016 at 1:05 PM, Etsuro Fujitawrote: ISTM that the use of the same RTI for subqueries in multi-levels in a remote SQL makes the SQL a bit difficult to read. How about using the position of the join rel in join_rel_list, (more precisely, the position plus list_length(root->parse->rtable)), instead? We switch to hash table to maintain the join RelOptInfos when the number of joins grows larger, where the position won't make much sense. That's right, but we still store the joinrel into join_rel_list after creating that hash table. That hash table is just for fast lookup. See build_join_rel. We might differentiate between a base relation alias and subquery alias by using different prefixes like "r" and "s" resp. Hmm. My concern about that would the recursive use of "s" with the same RTI as subquery aliases for different level subqueries in a single remote SQL. For example, if those subqueries include a base rel with RTI=1, then "s1" would be used again and again within that SQL. That would be logically correct, but seems confusing to me. Best regards, Etsuro Fujita -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Push down more full joins in postgres_fdw
On Mon, Sep 26, 2016 at 1:05 PM, Etsuro Fujitawrote: > On 2016/09/15 15:29, Ashutosh Bapat wrote: >> >> On Wed, Sep 14, 2016 at 8:52 PM, Robert Haas >> wrote: > > >>> I'm not sure why it wouldn't work >>> to just use the lowest RTI involved in the join, though; the others >>> won't appear anywhere else at that query level. > > >> So +1 for >> using the smallest RTI to indicate a subquery. > > > +1 for the general idea. > > ISTM that the use of the same RTI for subqueries in multi-levels in a remote > SQL makes the SQL a bit difficult to read. How about using the position of > the join rel in join_rel_list, (more precisely, the position plus > list_length(root->parse->rtable)), instead? > We switch to hash table to maintain the join RelOptInfos when the number of joins grows larger, where the position won't make much sense. We might differentiate between a base relation alias and subquery alias by using different prefixes like "r" and "s" resp. -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers