Re: Problem while updating a foreign table pointing to a partitioned table on foreign server
On Tue, Nov 26, 2019 at 12:37 PM Etsuro Fujita wrote: > I was planning to work on this in this commitfest, but sorry, I didn't > have time due to other priorities. Probably, I won't have time for > this in the development cycle for v13. So I'll mark this as RWF, > unless anyone wants to work on it. Done. Best regards, Etsuro Fujita
Re: Problem while updating a foreign table pointing to a partitioned table on foreign server
Hi Michael-san, On Mon, Nov 25, 2019 at 4:13 PM Michael Paquier wrote: > On Tue, Sep 03, 2019 at 12:37:52AM +0900, Etsuro Fujita wrote: > > On Wed, Aug 14, 2019 at 11:51 AM Etsuro Fujita > > wrote: > >> This is my TODO item for PG13, but I'll give priority to other things > >> in the next commitfest. If anyone wants to work on it, feel free; > >> else I'll move this to the November commitfest when it opens. > > > > Moved. > > This has been waiting on author for two commit fests now, and the > situation has not changed. Fujita-san, Hiriguchi-san, do you have an > update to provide? There is no meaning to keep the current stale > situation for more CFs. I was planning to work on this in this commitfest, but sorry, I didn't have time due to other priorities. Probably, I won't have time for this in the development cycle for v13. So I'll mark this as RWF, unless anyone wants to work on it. Best regards, Etsuro Fujita
Re: Problem while updating a foreign table pointing to a partitioned table on foreign server
On Tue, Sep 03, 2019 at 12:37:52AM +0900, Etsuro Fujita wrote: > On Wed, Aug 14, 2019 at 11:51 AM Etsuro Fujita > wrote: >> This is my TODO item for PG13, but I'll give priority to other things >> in the next commitfest. If anyone wants to work on it, feel free; >> else I'll move this to the November commitfest when it opens. > > Moved. This has been waiting on author for two commit fests now, and the situation has not changed. Fujita-san, Hiriguchi-san, do you have an update to provide? There is no meaning to keep the current stale situation for more CFs. -- Michael signature.asc Description: PGP signature
Re: Problem while updating a foreign table pointing to a partitioned table on foreign server
On Wed, Aug 14, 2019 at 11:51 AM Etsuro Fujita wrote: > This is my TODO item for PG13, but I'll give priority to other things > in the next commitfest. If anyone wants to work on it, feel free; > else I'll move this to the November commitfest when it opens. Moved. Best regards, Etsuro Fujita
Re: Problem while updating a foreign table pointing to a partitioned table on foreign server
Hi Alvaro and Michael, On Tue, Aug 13, 2019 at 11:04 PM Alvaro Herrera wrote: > On 2019-Aug-13, Michael Paquier wrote: > > On Mon, Aug 12, 2019 at 05:32:08PM -0400, Alvaro Herrera wrote: > > > So do we have an updated patch for this? It's been a while since this > > > patch saw any movement ... Thanks for reminding me about this, Alvaro! > > Please note that this involves a couple of people in Japan, and this > > week is the Obon vacation season for a lot of people. So there could > > be delays in replies. Yeah, I was on that vacation. Thanks, Michael! > Understood, thanks for the info. We still have two weeks to the start > of commitfest anyway. And since it's been sleeping since November 2018, > I guess we can wait a little bit yet. This is my TODO item for PG13, but I'll give priority to other things in the next commitfest. If anyone wants to work on it, feel free; else I'll move this to the November commitfest when it opens. Best regards, Etsuro Fujita
Re: Problem while updating a foreign table pointing to a partitioned table on foreign server
On 2019-Aug-13, Michael Paquier wrote: > On Mon, Aug 12, 2019 at 05:32:08PM -0400, Alvaro Herrera wrote: > > So do we have an updated patch for this? It's been a while since this > > patch saw any movement ... > > Please note that this involves a couple of people in Japan, and this > week is the Obon vacation season for a lot of people. So there could > be delays in replies. Understood, thanks for the info. We still have two weeks to the start of commitfest anyway. And since it's been sleeping since November 2018, I guess we can wait a little bit yet. -- Álvaro Herrerahttps://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: Problem while updating a foreign table pointing to a partitioned table on foreign server
On Mon, Aug 12, 2019 at 05:32:08PM -0400, Alvaro Herrera wrote: > So do we have an updated patch for this? It's been a while since this > patch saw any movement ... Please note that this involves a couple of people in Japan, and this week is the Obon vacation season for a lot of people. So there could be delays in replies. -- Michael signature.asc Description: PGP signature
Re: Problem while updating a foreign table pointing to a partitioned table on foreign server
On 2018-Nov-16, Tom Lane wrote: > Etsuro Fujita writes: > > [ fix-foreign-modify-efujita-2.patch ] > > Um ... wow, I do not like anything about this. Adding a "tableoid = X" > constraint to every remote update query seems awfully expensive, > considering that (a) it's useless for non-partitioned tables, and > (b) the remote planner will have exactly no intelligence about handling > it. We could improve (b) probably, but that'd be another big chunk of > work, and it wouldn't help when talking to older servers. So do we have an updated patch for this? It's been a while since this patch saw any movement ... -- Álvaro Herrerahttps://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: Problem while updating a foreign table pointing to a partitioned table on foreign server
(2019/02/08 10:09), Michael Paquier wrote: On Thu, Feb 07, 2019 at 09:55:18PM +0900, Etsuro Fujita wrote: I 100% agree with Tom, and actually, I tried to address his comments, but I haven't come up with a clear solution for that yet. I really want to address this, but I won't have much time to work on that at least until after this development cycle, so what I'm thinking is to mark this as Returned with feedback, or if possible, to move this to the 2019-07 CF. Simply marking it as returned with feedback does not seem adapted to me as we may lose track of it. Moving it to the future CF would make more sense in my opinion. OK, I have moved this to the 2019-07 CF, keeping Waiting on Author. Best regards, Etsuro Fujita
Re: Problem while updating a foreign table pointing to a partitioned table on foreign server
On Thu, Feb 07, 2019 at 09:55:18PM +0900, Etsuro Fujita wrote: > I 100% agree with Tom, and actually, I tried to address his comments, but I > haven't come up with a clear solution for that yet. I really want to > address this, but I won't have much time to work on that at least until > after this development cycle, so what I'm thinking is to mark this as > Returned with feedback, or if possible, to move this to the 2019-07 CF. Simply marking it as returned with feedback does not seem adapted to me as we may lose track of it. Moving it to the future CF would make more sense in my opinion. -- Michael signature.asc Description: PGP signature
Re: Problem while updating a foreign table pointing to a partitioned table on foreign server
(2019/02/02 10:21), Michael Paquier wrote: On Fri, Nov 16, 2018 at 01:35:15PM -0500, Tom Lane wrote: I wonder whether we'd be better off thinking of a way to let FDWs invent additional system column IDs for their tables, so that something like a remote table OID could be represented in the natural way as a Var with negative varattno. This'd potentially also be a win for FDWs whose underlying storage has a row identifier, but it's not of type "tid". Instead of trying to shoehorn their row ID into SelfItemPointerAttributeNumber, they could define a new system column that has a more appropriate data type. Admittedly there'd be some infrastructure work to do to make this happen, maybe a lot of it; but it's a bullet we really need to bite at some point. This patch got zero input for the last couple of months. As it is classified as bug fix, I have moved it to next CF, waiting on author. Fujita-san, are you planning to look at it? I 100% agree with Tom, and actually, I tried to address his comments, but I haven't come up with a clear solution for that yet. I really want to address this, but I won't have much time to work on that at least until after this development cycle, so what I'm thinking is to mark this as Returned with feedback, or if possible, to move this to the 2019-07 CF. My apologies for the late reply. Best regards, Etsuro Fujita
Re: Problem while updating a foreign table pointing to a partitioned table on foreign server
On Fri, Nov 16, 2018 at 01:35:15PM -0500, Tom Lane wrote: > I wonder whether we'd be better off thinking of a way to let FDWs > invent additional system column IDs for their tables, so that > something like a remote table OID could be represented in the > natural way as a Var with negative varattno. This'd potentially > also be a win for FDWs whose underlying storage has a row identifier, > but it's not of type "tid". Instead of trying to shoehorn their > row ID into SelfItemPointerAttributeNumber, they could define a > new system column that has a more appropriate data type. Admittedly > there'd be some infrastructure work to do to make this happen, maybe > a lot of it; but it's a bullet we really need to bite at some point. This patch got zero input for the last couple of months. As it is classified as bug fix, I have moved it to next CF, waiting on author. Fujita-san, are you planning to look at it? -- Michael signature.asc Description: PGP signature
Re: Problem while updating a foreign table pointing to a partitioned table on foreign server
Etsuro Fujita writes: > [ fix-foreign-modify-efujita-2.patch ] Um ... wow, I do not like anything about this. Adding a "tableoid = X" constraint to every remote update query seems awfully expensive, considering that (a) it's useless for non-partitioned tables, and (b) the remote planner will have exactly no intelligence about handling it. We could improve (b) probably, but that'd be another big chunk of work, and it wouldn't help when talking to older servers. (Admittedly, I'm not sure I have a better idea. If we knew which remote tables were partitioned, we could avoid sending unnecessary tableoid constraints; but we don't have any good way to track that.) I think the proposed hacks on the planner's Param handling are a mess as well. You can't go and change the contents of a Param node sometime after creating it --- that will for example break equal() comparisons that might be done in between. (No, I don't buy that you know exactly what will be done in between.) The cost of what you've added to join tlist creation and setrefs processing seems unduly high, too. I wonder whether we'd be better off thinking of a way to let FDWs invent additional system column IDs for their tables, so that something like a remote table OID could be represented in the natural way as a Var with negative varattno. This'd potentially also be a win for FDWs whose underlying storage has a row identifier, but it's not of type "tid". Instead of trying to shoehorn their row ID into SelfItemPointerAttributeNumber, they could define a new system column that has a more appropriate data type. Admittedly there'd be some infrastructure work to do to make this happen, maybe a lot of it; but it's a bullet we really need to bite at some point. regards, tom lane
Re: Problem while updating a foreign table pointing to a partitioned table on foreign server
(2018/10/02 21:16), Etsuro Fujita wrote: Attached is an updated version of the patch. Changes: That patch conflicts the recent executor changes, so I'm attaching a rebased patch, in which I also added a fast path to add_params_to_result_rel and did some comment editing for consistency. I'll add this to the next CF so that it does not get lost. Best regards, Etsuro Fujita *** a/contrib/postgres_fdw/deparse.c --- b/contrib/postgres_fdw/deparse.c *** *** 130,135 static void deparseTargetList(StringInfo buf, --- 130,136 Relation rel, bool is_returning, Bitmapset *attrs_used, + bool tableoid_needed, bool qualify_col, List **retrieved_attrs); static void deparseExplicitTargetList(List *tlist, *** *** 901,906 build_tlist_to_deparse(RelOptInfo *foreignrel) --- 902,926 PVC_RECURSE_PLACEHOLDERS)); } + /* Also, add the Param representing the remote table OID, if it exists. */ + if (fpinfo->tableoid_param) + { + TargetEntry *tle; + + /* + * Core code should have contained the Param in the given relation's + * reltarget. + */ + Assert(list_member(foreignrel->reltarget->exprs, + fpinfo->tableoid_param)); + + tle = makeTargetEntry((Expr *) copyObject(fpinfo->tableoid_param), + list_length(tlist) + 1, + NULL, + false); + tlist = lappend(tlist, tle); + } + return tlist; } *** *** 1052,1058 deparseSelectSql(List *tlist, bool is_subquery, List **retrieved_attrs, Relation rel = heap_open(rte->relid, NoLock); deparseTargetList(buf, rte, foreignrel->relid, rel, false, ! fpinfo->attrs_used, false, retrieved_attrs); heap_close(rel, NoLock); } } --- 1072,1080 Relation rel = heap_open(rte->relid, NoLock); deparseTargetList(buf, rte, foreignrel->relid, rel, false, ! fpinfo->attrs_used, ! fpinfo->tableoid_param ? true : false, ! false, retrieved_attrs); heap_close(rel, NoLock); } } *** *** 1093,1098 deparseFromExpr(List *quals, deparse_expr_cxt *context) --- 1115,1122 * This is used for both SELECT and RETURNING targetlists; the is_returning * parameter is true only for a RETURNING targetlist. * + * For SELECT, the target list contains remote tableoid if tableoid_needed. + * * The tlist text is appended to buf, and we also create an integer List * of the columns being retrieved, which is returned to *retrieved_attrs. * *** *** 1105,1110 deparseTargetList(StringInfo buf, --- 1129,1135 Relation rel, bool is_returning, Bitmapset *attrs_used, + bool tableoid_needed, bool qualify_col, List **retrieved_attrs) { *** *** 1146,1152 deparseTargetList(StringInfo buf, /* * Add ctid and oid if needed. We currently don't support retrieving any ! * other system columns. */ if (bms_is_member(SelfItemPointerAttributeNumber - FirstLowInvalidHeapAttributeNumber, attrs_used)) --- 1171,1177 /* * Add ctid and oid if needed. We currently don't support retrieving any ! * other system columns, except tableoid, which is retrieved if required. */ if (bms_is_member(SelfItemPointerAttributeNumber - FirstLowInvalidHeapAttributeNumber, attrs_used)) *** *** 1180,1185 deparseTargetList(StringInfo buf, --- 1205,1224 *retrieved_attrs = lappend_int(*retrieved_attrs, ObjectIdAttributeNumber); } + if (tableoid_needed) + { + Assert(bms_is_member(SelfItemPointerAttributeNumber - FirstLowInvalidHeapAttributeNumber, + attrs_used)); + Assert(!first); + Assert(!is_returning); + Assert(!qualify_col); + + appendStringInfoString(buf, ", "); + appendStringInfoString(buf, "tableoid"); + + *retrieved_attrs = lappend_int(*retrieved_attrs, + TableOidAttributeNumber); + } /* Don't generate bad syntax if no undropped columns */ if (first && !is_returning) *** *** 1728,1734 deparseUpdateSql(StringInfo buf, RangeTblEntry *rte, deparseRelation(buf, rel); appendStringInfoString(buf, " SET "); ! pindex = 2; /* ctid is always the first param */ first = true; foreach(lc, targetAttrs) { --- 1767,1774 deparseRelation(buf, rel); appendStringInfoString(buf, " SET "); ! pindex = 3; /* ctid and tableoid are always the two ! * leading params */ first = true; foreach(lc, targetAttrs) { *** *** 1742,1748 deparseUpdateSql(StringInfo buf, RangeTblEntry *rte, appendStringInfo(buf, " = $%d", pindex); pindex++; } ! appendStringInfoString(buf, " WHERE ctid = $1"); deparseReturningList(buf, rte, rtindex, rel, rel->trigdesc && rel->trigdesc->trig_update_after_row, --- 1782,1788 appendStringInfo(buf, " = $%d", pindex);
Re: Problem while updating a foreign table pointing to a partitioned table on foreign server
(2018/09/21 20:03), Etsuro Fujita wrote: (2018/09/18 21:14), Kyotaro HORIGUCHI wrote: At Fri, 14 Sep 2018 22:01:39 +0900, Etsuro Fujita wrote in<5b9bb133.1060...@lab.ntt.co.jp> I wrote a patch using the Param-based approach, and compared the two approaches. I don't think there would be any warts as discussed above in the Param-based approach for now. (That approach modifies the planner so that the targetrel's tlist would contain Params as well as Vars/PHVs, so actually, it breaks the planner assumption that a rel's tlist would only include Vars/PHVs, but I don't find any issues on that at least for now. Will look into that in more detail.) I spent quite a bit of time looking into that, but I couldn't find any issues, including ones discussed in [1]: * In contrib/postgres_fdw, the patch does the special handling of the Param representing the remote table OID in deparsing a remote SELECT query and building fdw_scan_tlist, but it wouldn't need the pull_var_clause change as proposed in [1]. And ISTM that that handling would be sufficient to avoid errors like 'variable not found in subplan target lists' as in [1]. * Params as extra target expressions can never be used as Pathkeys or something like that, so it seems unlikely that that approach would cause 'could not find pathkey item to sort' errors in prepare_sort_from_pathkeys() as in [1]. * I checked other parts of the planner such as subselect.c and setrefs.c, but I couldn't find any issues. What do you think about that? I confirmed that a FOREIGN_PARAM_EXEC is evaluated and stored into the parent node. For the mentioned Merge/Sort/ForeignScan case, Sort node takes the parameter value via projection. I didn't know PARAM_EXEC works that way. I consulted nodeNestLoop but not fully understood. So I think it works. I still don't think expanded tupledesc is not wart but this is smarter than that. Addition to that, it seems back-patchable. I must admit that yours is better. I also think that approach would be back-patchable to PG9.3, where contrib/postgres_fdw landed with the writable functionality, so I'm inclined to vote for the Param-based approach. Attached is an updated version of the patch. Changes: * Added this to use_physical_tlist(): One thing I noticed is: in any approach, I think use_physical_tlist needs to be modified so that it disables doing build_physical_tlist for a foreign scan in the case where the FDW added resjunk columns for UPDATE/DELETE that are different from user/system columns of the foreign table; else such columns would not be emitted from the foreign scan. * Fixed a bug in conversion_error_callback() in contrib/postgres_fdw.c * Simplified your contrib/postgres_fdw.c tests as discussed * Revise code/comments a bit * Added docs to fdwhandler.sgml * Rebased the patch against the latest HEAD Best regards, Etsuro Fujita [1] https://www.postgresql.org/message-id/flat/cakcux6ktu-8teflwtquuzbyfaza83vuzurd7c1yhc-yewyy...@mail.gmail.com *** a/contrib/postgres_fdw/deparse.c --- b/contrib/postgres_fdw/deparse.c *** *** 130,135 static void deparseTargetList(StringInfo buf, --- 130,136 Relation rel, bool is_returning, Bitmapset *attrs_used, + bool tableoid_needed, bool qualify_col, List **retrieved_attrs); static void deparseExplicitTargetList(List *tlist, *** *** 901,906 build_tlist_to_deparse(RelOptInfo *foreignrel) --- 902,926 PVC_RECURSE_PLACEHOLDERS)); } + /* Also, add the Param representing the remote table OID, if it exists. */ + if (fpinfo->tableoid_param) + { + TargetEntry *tle; + + /* + * Core code should have contained the Param in the given relation's + * reltarget. + */ + Assert(list_member(foreignrel->reltarget->exprs, + fpinfo->tableoid_param)); + + tle = makeTargetEntry((Expr *) copyObject(fpinfo->tableoid_param), + list_length(tlist) + 1, + NULL, + false); + tlist = lappend(tlist, tle); + } + return tlist; } *** *** 1052,1058 deparseSelectSql(List *tlist, bool is_subquery, List **retrieved_attrs, Relation rel = heap_open(rte->relid, NoLock); deparseTargetList(buf, rte, foreignrel->relid, rel, false, ! fpinfo->attrs_used, false, retrieved_attrs); heap_close(rel, NoLock); } } --- 1072,1080 Relation rel = heap_open(rte->relid, NoLock); deparseTargetList(buf, rte, foreignrel->relid, rel, false, ! fpinfo->attrs_used, ! fpinfo->tableoid_param ? true : false, ! false, retrieved_attrs); heap_close(rel, NoLock); } } *** *** 1093,1098 deparseFromExpr(List *quals, deparse_expr_cxt *context) --- 1115,1122 * This is used for both SELECT and RETURNING targetlists; the is_returning * parameter is true only for a RETURNING targetlist. * + * For SELECT, the target list contains remote
Re: Problem while updating a foreign table pointing to a partitioned table on foreign server
(2018/09/18 21:14), Kyotaro HORIGUCHI wrote: At Fri, 14 Sep 2018 22:01:39 +0900, Etsuro Fujita wrote in<5b9bb133.1060...@lab.ntt.co.jp> @@ -126,8 +173,18 @@ get_relation_info(PlannerInfo *root, Oid relationObjectId,\ bool inhparent, (errcode(ERRCODE_FEATURE_NOT_SUPPORTED), errmsg("cannot access temporary or unlogged relations during r\ ecovery"))); + max_attrnum = RelationGetNumberOfAttributes(relation); + + /* Foreign table may have exanded this relation with junk columns */ + if (root->simple_rte_array[varno]->relkind == RELKIND_FOREIGN_TABLE) + { + AttrNumber maxattno = max_varattno(root->parse->targetList, varno); + if (max_attrnum< maxattno) + max_attrnum = maxattno; + } + rel->min_attr = FirstLowInvalidHeapAttributeNumber + 1; - rel->max_attr = RelationGetNumberOfAttributes(relation); + rel->max_attr = max_attrnum; rel->reltablespace = RelationGetForm(relation)->reltablespace; This breaks the fundamental assumption that rel->max_attr is equal to RelationGetNumberOfAttributes of that table. My concern is: this change would probably be a wart, so it would be bug-prone in future versions. Hmm. I believe that once RelOptInfo is created all attributes defined in it is safely accessed. Is it a wrong assumption? The patch you proposed seems to fix the issue well for the current version of PG, but I'm a bit scared to have such an assumption (ie, to include columns in a rel's tlist that are not defined anywhere in the system catalogs). In future we might add eg, a lsyscache.c routine for some planning use that are given the attr number of a column as an argument, like get_attavgwidth, and if so, it would be easily conceivable that that routine would error out for such an undefined column. (get_attavgwidth would return 0, not erroring out, though.) Actually RelationGetNumberOfAttributes is used in few distinct places while planning. build_physical_tlist is not used for foreign relations. For UPDATE/DELETE, that function would not be called for a foreign target in the posetgres_fdw case, as CTID is requested (see use_physical_tlist), but otherwise that function may be called if possible. No? If we don't accept the expanded tupdesc for base relations, the another way I can find is transforming the foreign relation into something another like a subquery, or allowing expansion of attribute list of a base relation... Sorry, I don't understand this fully, but there seems to be the same concern as mentioned above. Another thing on the new version: @@ -1575,6 +1632,19 @@ build_physical_tlist(PlannerInfo *root, RelOptInfo *rel) relation = heap_open(rte->relid, NoLock); numattrs = RelationGetNumberOfAttributes(relation); + + /* + * Foreign tables may have expanded with some junk columns. Punt +* in the case. ... I think this would disable the optimization on projection in foreign scans, causing performance regression. Well, in update/delete cases, create_plan_recurse on foreign scan is called with CP_EXACT_TLIST in create_modifytable_plan That's not necessarily true; consider UPDATE/DELETE on a local join; in that case the topmost plan node for a subplan of a ModifyTable would be a join, and if that's a NestLoop, create_plan_recurse would call create_nestloop_plan, which would recursively call create_plan_recurse for its inner/outer subplans with flag=0, not CP_EXACT_TLIST. so the code path is not actually used. I think this is true for the postgres_fdw case; because use_physical_tlist would decide not to do build_physical_tlist for the reason mentioned above. BUT my question here is: why do we need the change to build_physical_tlist? Since this uses fdw_scan_tlist so it is theoretically back-patchable back to 9.6. IIRC, the fdw_scan_tlist stuff was introduced in PG9.5 as part of join pushdown infrastructure, so I think your patch can be back-patched to PG9.5, but I don't think that's enough; IIRC, this issue was introduced in PG9.3, so a solution for this should be back-patch-able to PG9.3, I think. In the previous version, fdw_scan_tlist is used to hold only additional (junk) columns. I think that we can get rid of the variable by scanning the full tlist for junk columns. Apparently it's differnt patch for such versions. I'm not sure how much it is invasive for now but will consider. Sorry, I don't fully understand this. Could you elaborate a bit more on this? 0001-Add-test-for-postgres_fdw-foreign-parition-update.patch This should fail for unpatched postgres_fdw. (Just for demonstration) +CREATE TABLE p1 (a int, b int); +CREATE TABLE c1 (LIKE p1) INHERITS (p1); +CREATE TABLE c2 (LIKE p1) INHERITS (p1); +CREATE FOREIGN TABLE fp1 (a int, b int) + SERVER loopback OPTIONS (table_name 'p1'); +INSERT INTO c1 VALUES (0, 1); +INSERT INTO c2 VALUES (1, 1); +SELECT tableoid::int - (SELECT min(tableoid) FROM
Re: Problem while updating a foreign table pointing to a partitioned table on foreign server
Hello. At Fri, 14 Sep 2018 22:01:39 +0900, Etsuro Fujita wrote in <5b9bb133.1060...@lab.ntt.co.jp> > (2018/08/24 16:58), Kyotaro HORIGUCHI wrote: > > At Tue, 21 Aug 2018 11:01:32 +0900 (Tokyo Standard Time), Kyotaro > > HORIGUCHI wrote > > in<20180821.110132.261184472.horiguchi.kyot...@lab.ntt.co.jp> > >>> You wrote: > Several places seems to be assuming that fdw_scan_tlist may be > used foreign scan on simple relation but I didn't find that > actually happens. > >>> > >>> Yeah, currently, postgres_fdw and file_fdw don't use that list for > >>> simple foreign table scans, but it could be used to improve the > >>> efficiency for those scans, as explained in fdwhandler.sgml: > > ... > >> I'll put more consideration on using fdw_scan_tlist in the > >> documented way. > > > > Done. postgres_fdw now generates full fdw_scan_tlist (as > > documented) for foreign relations with junk columns having a > > small change in core side. However it is far less invasive than > > the previous version and I believe that it dones't harm > > maybe-existing use of fdw_scan_tlist on non-join rels (that is, > > in the case of a subset of relation columns). > > Yeah, changes to the core by the new version is really small, which is > great, but I'm not sure it's a good idea to modify the catalog info on > the target table on the fly: > > @@ -126,8 +173,18 @@ get_relation_info(PlannerInfo *root, Oid > relationObjectId,\ > bool inhparent, > (errcode(ERRCODE_FEATURE_NOT_SUPPORTED), > errmsg("cannot access temporary or unlogged relations during > r\ > ecovery"))); > > + max_attrnum = RelationGetNumberOfAttributes(relation); > + > + /* Foreign table may have exanded this relation with junk columns */ > + if (root->simple_rte_array[varno]->relkind == RELKIND_FOREIGN_TABLE) > + { > + AttrNumber maxattno = max_varattno(root->parse->targetList, varno); > + if (max_attrnum < maxattno) > + max_attrnum = maxattno; > + } > + > rel->min_attr = FirstLowInvalidHeapAttributeNumber + 1; > - rel->max_attr = RelationGetNumberOfAttributes(relation); > + rel->max_attr = max_attrnum; > rel->reltablespace = RelationGetForm(relation)->reltablespace; > > This breaks the fundamental assumption that rel->max_attr is equal to > RelationGetNumberOfAttributes of that table. My concern is: this > change would probably be a wart, so it would be bug-prone in future > versions. Hmm. I believe that once RelOptInfo is created all attributes defined in it is safely accessed. Is it a wrong assumption? Actually RelationGetNumberOfAttributes is used in few distinct places while planning. expand_targetlist uses it to scan the source relation's nonjunk attributes. get_rel_data_width uses it to scan width of attributes in statistics. It fails to add junk's width but it dones't harm so much.. build_physical_tlist is not used for foreign relations. build_path_tlist creates a tlist without proper resjunk flags but create_modifytable_plan immediately fixes that. If we don't accept the expanded tupdesc for base relations, the another way I can find is transforming the foreign relation into something another like a subquery, or allowing expansion of attribute list of a base relation... > Another thing on the new version: > > @@ -1575,6 +1632,19 @@ build_physical_tlist(PlannerInfo *root, > RelOptInfo *rel) > relation = heap_open(rte->relid, NoLock); > > numattrs = RelationGetNumberOfAttributes(relation); > + > + /* > + * Foreign tables may have expanded with some junk columns. Punt > +* in the case. ... > I think this would disable the optimization on projection in foreign > scans, causing performance regression. Well, in update/delete cases, create_plan_recurse on foreign scan is called with CP_EXACT_TLIST in create_modifytable_plan so the code path is not actually used. Just replacing the if clause with Assert seems to change nothing. I'm not sure we will add junks in other cases but it's not likely.. > > One arguable behavior change is about wholrow vars. Currently it > > refferes local tuple with all columns but it is explicitly > > fetched as ROW() after this patch applied. This could be fixed > > but not just now. > > > > Part of 0004-: > > - Output: f1, ''::text, ctid, rem1.* > > - Remote SQL: SELECT f1, f2, ctid FROM public.loc1 FOR UPDATE > > + Output: f1, ''::text, tableoid, ctid, rem1.* > > + Remote SQL: SELECT f1, tableoid, ctid, ROW(f1, f2) FROM public.loc1 > > FOR UPDATE > > That would be also performance regression. If we go in this > direction, that should be fixed. Agreed. Will consider sooner.. > > Since this uses fdw_scan_tlist so it is theoretically > > back-patchable back to 9.6. > > IIRC, the fdw_scan_tlist stuff was introduced in PG9.5 as part of join > pushdown infrastructure, so I think your patch can be back-patched to > PG9.5, but I don't think that's enough; IIRC, this issue
Re: Problem while updating a foreign table pointing to a partitioned table on foreign server
(2018/08/24 16:58), Kyotaro HORIGUCHI wrote: At Tue, 21 Aug 2018 11:01:32 +0900 (Tokyo Standard Time), Kyotaro HORIGUCHI wrote in<20180821.110132.261184472.horiguchi.kyot...@lab.ntt.co.jp> You wrote: Several places seems to be assuming that fdw_scan_tlist may be used foreign scan on simple relation but I didn't find that actually happens. Yeah, currently, postgres_fdw and file_fdw don't use that list for simple foreign table scans, but it could be used to improve the efficiency for those scans, as explained in fdwhandler.sgml: ... I'll put more consideration on using fdw_scan_tlist in the documented way. Done. postgres_fdw now generates full fdw_scan_tlist (as documented) for foreign relations with junk columns having a small change in core side. However it is far less invasive than the previous version and I believe that it dones't harm maybe-existing use of fdw_scan_tlist on non-join rels (that is, in the case of a subset of relation columns). Yeah, changes to the core by the new version is really small, which is great, but I'm not sure it's a good idea to modify the catalog info on the target table on the fly: @@ -126,8 +173,18 @@ get_relation_info(PlannerInfo *root, Oid relationObjectId,\ bool inhparent, (errcode(ERRCODE_FEATURE_NOT_SUPPORTED), errmsg("cannot access temporary or unlogged relations during r\ ecovery"))); + max_attrnum = RelationGetNumberOfAttributes(relation); + + /* Foreign table may have exanded this relation with junk columns */ + if (root->simple_rte_array[varno]->relkind == RELKIND_FOREIGN_TABLE) + { + AttrNumber maxattno = max_varattno(root->parse->targetList, varno); + if (max_attrnum < maxattno) + max_attrnum = maxattno; + } + rel->min_attr = FirstLowInvalidHeapAttributeNumber + 1; - rel->max_attr = RelationGetNumberOfAttributes(relation); + rel->max_attr = max_attrnum; rel->reltablespace = RelationGetForm(relation)->reltablespace; This breaks the fundamental assumption that rel->max_attr is equal to RelationGetNumberOfAttributes of that table. My concern is: this change would probably be a wart, so it would be bug-prone in future versions. Another thing on the new version: @@ -1575,6 +1632,19 @@ build_physical_tlist(PlannerInfo *root, RelOptInfo *rel) relation = heap_open(rte->relid, NoLock); numattrs = RelationGetNumberOfAttributes(relation); + + /* +* Foreign tables may have expanded with some junk columns. Punt +* in the case. +*/ + if (numattrs < rel->max_attr) + { + Assert(root->simple_rte_array[rel->relid]->relkind == + RELKIND_FOREIGN_TABLE); + heap_close(relation, NoLock); + break; + } I think this would disable the optimization on projection in foreign scans, causing performance regression. One arguable behavior change is about wholrow vars. Currently it refferes local tuple with all columns but it is explicitly fetched as ROW() after this patch applied. This could be fixed but not just now. Part of 0004-: - Output: f1, ''::text, ctid, rem1.* - Remote SQL: SELECT f1, f2, ctid FROM public.loc1 FOR UPDATE + Output: f1, ''::text, tableoid, ctid, rem1.* + Remote SQL: SELECT f1, tableoid, ctid, ROW(f1, f2) FROM public.loc1 FOR UPDATE That would be also performance regression. If we go in this direction, that should be fixed. Since this uses fdw_scan_tlist so it is theoretically back-patchable back to 9.6. IIRC, the fdw_scan_tlist stuff was introduced in PG9.5 as part of join pushdown infrastructure, so I think your patch can be back-patched to PG9.5, but I don't think that's enough; IIRC, this issue was introduced in PG9.3, so a solution for this should be back-patch-able to PG9.3, I think. Please find the attached three files. Thanks for the patches! 0001-Add-test-for-postgres_fdw-foreign-parition-update.patch This should fail for unpatched postgres_fdw. (Just for demonstration) +CREATE TABLE p1 (a int, b int); +CREATE TABLE c1 (LIKE p1) INHERITS (p1); +CREATE TABLE c2 (LIKE p1) INHERITS (p1); +CREATE FOREIGN TABLE fp1 (a int, b int) + SERVER loopback OPTIONS (table_name 'p1'); +INSERT INTO c1 VALUES (0, 1); +INSERT INTO c2 VALUES (1, 1); +SELECT tableoid::int - (SELECT min(tableoid) FROM fp1)::int AS toiddiff, ctid, * FROM fp1; Does it make sense to evaluate toiddiff? I think that should always be 0. 0003-Fix-of-foreign-update-bug-of-PgFDW.patch Fix of postgres_fdw for this problem. Sorry, I have not looked at it closely yet, but before that I'd like to discuss the direction we go in. I'm not convinced that your approach is the right direction, so as promised, I wrote a patch using the Param-based approach, and compared the two approaches. Attached is a WIP patch for that, which includes the 0003 patch. I don't think there would be any
Re: Problem while updating a foreign table pointing to a partitioned table on foreign server
Hello. At Wed, 05 Sep 2018 20:02:04 +0900, Etsuro Fujita wrote in <5b8fb7ac.5020...@lab.ntt.co.jp> > (2018/08/30 21:58), Etsuro Fujita wrote: > > (2018/08/30 20:37), Kyotaro HORIGUCHI wrote: > >> At Fri, 24 Aug 2018 21:45:35 +0900, Etsuro > >> Fujita wrote > >> in<5b7ffdef.6020...@lab.ntt.co.jp> > >>> (2018/08/21 11:01), Kyotaro HORIGUCHI wrote: > At Tue, 14 Aug 2018 20:49:02 +0900, Etsuro > Fujita wrote > in<5b72c1ae.8010...@lab.ntt.co.jp> > > (2018/08/09 22:04), Etsuro Fujita wrote: > >> (2018/08/08 17:30), Kyotaro HORIGUCHI wrote: > >>> > > I spent more time looking at the patch. ISTM that the patch well > > suppresses the effect of the tuple-descriptor expansion by making > > changes to code in the planner and executor (and ruleutils.c), but I'm > > still not sure that the patch is the right direction to go in, because > > ISTM that expanding the tuple descriptor on the fly might be a wart. > >>> > The exapansion should be safe if the expanded descriptor has the > same defitions for base columns and all the extended coulumns are > junks. The junk columns should be ignored by unrelated nodes and > they are passed safely as far as ForeignModify passes tuples as > is from underlying ForeignScan to ForeignUpdate/Delete. > >>> > >>> I'm not sure that would be really safe. Does that work well when > >>> EvalPlanQual, for example? > > I was wrong here; I assumed here that we supported late locking for an > UPDATE or DELETE on a foreign table, and I was a bit concerned that > the approach you proposed might not work well with EvalPlanQual, but > as described in fdwhandler.sgml, the core doesn't support for that: > > For an UPDATE or DELETE on a > foreign table, it > is recommended that the ForeignScan operation on > the target > table perform early locking on the rows that it fetches, perhaps via > the > equivalent of SELECT FOR UPDATE. An FDW can detect > whether > a table is an UPDATE/DELETE > target at plan time > by comparing its relid to > root-parse-resultRelation, > or at execution time by using > ExecRelationIsTargetRelation(). > An alternative possibility is to perform late locking within the > ExecForeignUpdate or > ExecForeignDelete > callback, but no special support is provided for this. > > So, there would be no need to consider about EvalPlanQual. Sorry for > the noise. I don't think it is a noise at all. Thank you for the pointer. regards. -- Kyotaro Horiguchi NTT Open Source Software Center
Re: Problem while updating a foreign table pointing to a partitioned table on foreign server
(2018/08/30 21:58), Etsuro Fujita wrote: (2018/08/30 20:37), Kyotaro HORIGUCHI wrote: At Fri, 24 Aug 2018 21:45:35 +0900, Etsuro Fujita wrote in<5b7ffdef.6020...@lab.ntt.co.jp> (2018/08/21 11:01), Kyotaro HORIGUCHI wrote: At Tue, 14 Aug 2018 20:49:02 +0900, Etsuro Fujita wrote in<5b72c1ae.8010...@lab.ntt.co.jp> (2018/08/09 22:04), Etsuro Fujita wrote: (2018/08/08 17:30), Kyotaro HORIGUCHI wrote: I spent more time looking at the patch. ISTM that the patch well suppresses the effect of the tuple-descriptor expansion by making changes to code in the planner and executor (and ruleutils.c), but I'm still not sure that the patch is the right direction to go in, because ISTM that expanding the tuple descriptor on the fly might be a wart. The exapansion should be safe if the expanded descriptor has the same defitions for base columns and all the extended coulumns are junks. The junk columns should be ignored by unrelated nodes and they are passed safely as far as ForeignModify passes tuples as is from underlying ForeignScan to ForeignUpdate/Delete. I'm not sure that would be really safe. Does that work well when EvalPlanQual, for example? I was wrong here; I assumed here that we supported late locking for an UPDATE or DELETE on a foreign table, and I was a bit concerned that the approach you proposed might not work well with EvalPlanQual, but as described in fdwhandler.sgml, the core doesn't support for that: For an UPDATE or DELETE on a foreign table, it is recommended that the ForeignScan operation on the target table perform early locking on the rows that it fetches, perhaps via the equivalent of SELECT FOR UPDATE. An FDW can detect whether a table is an UPDATE/DELETE target at plan time by comparing its relid to root-parse-resultRelation, or at execution time by using ExecRelationIsTargetRelation(). An alternative possibility is to perform late locking within the ExecForeignUpdate or ExecForeignDelete callback, but no special support is provided for this. So, there would be no need to consider about EvalPlanQual. Sorry for the noise. Best regards, Etsuro Fujita
Re: Problem while updating a foreign table pointing to a partitioned table on foreign server
(2018/08/30 20:37), Kyotaro HORIGUCHI wrote: At Fri, 24 Aug 2018 21:45:35 +0900, Etsuro Fujita wrote in<5b7ffdef.6020...@lab.ntt.co.jp> (2018/08/21 11:01), Kyotaro HORIGUCHI wrote: At Tue, 14 Aug 2018 20:49:02 +0900, Etsuro Fujita wrote in<5b72c1ae.8010...@lab.ntt.co.jp> (2018/08/09 22:04), Etsuro Fujita wrote: (2018/08/08 17:30), Kyotaro HORIGUCHI wrote: I spent more time looking at the patch. ISTM that the patch well suppresses the effect of the tuple-descriptor expansion by making changes to code in the planner and executor (and ruleutils.c), but I'm still not sure that the patch is the right direction to go in, because ISTM that expanding the tuple descriptor on the fly might be a wart. The exapansion should be safe if the expanded descriptor has the same defitions for base columns and all the extended coulumns are junks. The junk columns should be ignored by unrelated nodes and they are passed safely as far as ForeignModify passes tuples as is from underlying ForeignScan to ForeignUpdate/Delete. I'm not sure that would be really safe. Does that work well when EvalPlanQual, for example? Nothing. The reason was that core just doesn't know about the extended portion. So only problematic case was ExprEvalWholeRowVar, where explicit sanity check is perfomed. But, I think it is a ugly wart as you said. So the latest patch generates full fdw_scan_tlist. Will review. If we take the Param-based approach suggested by Tom, I suspect there would be no need to worry about at least those things, so I'll try to update your patch as such, if there are no objections from you (or anyone else). PARAM_EXEC is single storage side channel that can work as far as it is set and read while each tuple is handled. In this case postgresExecForeignUpdate/Delete must be called before postgresIterateForeignScan returns the next tuple. An apparent failure case for this usage is the join-update case below. https://www.postgresql.org/message-id/20180605.191032.256535589.horiguchi.kyot...@lab.ntt.co.jp What I have in mind would be to 1) create a tlist that contains not only Vars/PHVs but Params, for each join rel involving the target rel so we ensure that the Params will propagate up through all join plan steps, and 2) convert a join rel's tlist Params into Vars referencing the same Params in the tlists for the outer/inner rels, by setrefs.c. I think that would probably work well even for the case you mentioned above. Maybe I'm missing something, though. As I wrote above, the problem was not param id propagation but the per-query storage for a parameter holded in econtext. PARAM_EXEC is assumed to be used between outer and inner relations of a nestloop or retrieval from sub-query retrieval as commented in primnodes.h. PARAM_EXEC: The parameter is an internal executor parameter, used for passing values into and out of sub-queries or from nestloop joins to their inner scans. For historical reasons, such parameters are numbered from 0. These numbers are independent of PARAM_EXTERN numbers. Yeah, but IIUC, I think that #2 would allow us to propagate up the param values, not the param ids. I'm waiting for the patch. OK, but I will review your patch first. Best regards, Etsuro Fujita
Re: Problem while updating a foreign table pointing to a partitioned table on foreign server
Hello. At Fri, 24 Aug 2018 21:45:35 +0900, Etsuro Fujita wrote in <5b7ffdef.6020...@lab.ntt.co.jp> > (2018/08/21 11:01), Kyotaro HORIGUCHI wrote: > > At Tue, 14 Aug 2018 20:49:02 +0900, Etsuro > > Fujita wrote > > in<5b72c1ae.8010...@lab.ntt.co.jp> > >> (2018/08/09 22:04), Etsuro Fujita wrote: > >>> (2018/08/08 17:30), Kyotaro HORIGUCHI wrote: > > >> I spent more time looking at the patch. ISTM that the patch well > >> suppresses the effect of the tuple-descriptor expansion by making > >> changes to code in the planner and executor (and ruleutils.c), but I'm > >> still not sure that the patch is the right direction to go in, because > >> ISTM that expanding the tuple descriptor on the fly might be a wart. > > > The exapansion should be safe if the expanded descriptor has the > > same defitions for base columns and all the extended coulumns are > > junks. The junk columns should be ignored by unrelated nodes and > > they are passed safely as far as ForeignModify passes tuples as > > is from underlying ForeignScan to ForeignUpdate/Delete. > > I'm not sure that would be really safe. Does that work well when > EvalPlanQual, for example? Nothing. The reason was that core just doesn't know about the extended portion. So only problematic case was ExprEvalWholeRowVar, where explicit sanity check is perfomed. But, I think it is a ugly wart as you said. So the latest patch generates full fdw_scan_tlist. > > https://www.postgresql.org/docs/devel/static/fdw-planning.html > > > > Hmm. Thanks for the pointer, it seems to need rewrite. However, > > it doesn't seem to work for non-join foreign scans, since the > > core igonres it and uses local table definition. > > Really? No, I was wrong here. The core doesn't consider the case where fdw_scan_tlist has attributes that is not a part of base relation but it doesn't affect the description. > >> You wrote: > >>> I'm not sure whether the following ponits are valid. > >>> > >>> - If fdw_scan_tlist is used for simple relation scans, this would > >>> break the case. (ExecInitForeignScan, set_foreignscan_references) > >> > >> Some FDWs might already use that list for the improved efficiency for > >> simple foreign table scans as explained above, so we should avoid > >> breaking that. > > > > I considered to use fdw_scan_tlist in that way but the core is > > assuming that foreign scans with scanrelid> 0 uses the relation > > descriptor. > > Could you elaborate a bit more on this? After all I found that core uses fdw_scan_tlist if any and the attached patch doen't modify the "affected" part. Sorry, it's still hot here:p > > Do you have any example for that? > > I don't know such an example, but in my understanding, the core allows > the FDW to do that. As above, I agreed. Sorry for the bogosity. > >> If we take the Param-based approach suggested by Tom, I suspect there > >> would be no need to worry about at least those things, so I'll try to > >> update your patch as such, if there are no objections from you (or > >> anyone else). > > > PARAM_EXEC is single storage side channel that can work as far as > > it is set and read while each tuple is handled. In this case > > postgresExecForeignUpdate/Delete must be called before > > postgresIterateForeignScan returns the next tuple. An apparent > > failure case for this usage is the join-update case below. > > > > https://www.postgresql.org/message-id/20180605.191032.256535589.horiguchi.kyot...@lab.ntt.co.jp > > What I have in mind would be to 1) create a tlist that contains not > only Vars/PHVs but Params, for each join rel involving the target rel > so we ensure that the Params will propagate up through all join plan > steps, and 2) convert a join rel's tlist Params into Vars referencing > the same Params in the tlists for the outer/inner rels, by setrefs.c. > I think that would probably work well even for the case you mentioned > above. Maybe I'm missing something, though. As I wrote above, the problem was not param id propagation but the per-query storage for a parameter holded in econtext. PARAM_EXEC is assumed to be used between outer and inner relations of a nestloop or retrieval from sub-query retrieval as commented in primnodes.h. >PARAM_EXEC: The parameter is an internal executor parameter, used >for passing values into and out of sub-queries or from >nestloop joins to their inner scans. >For historical reasons, such parameters are numbered from 0. >These numbers are independent of PARAM_EXTERN numbers. Anyway the odds are high that I'm missing far more than you. > Sorry for the delay. Nope. Thank you for the comment and I'm waiting for the patch. regards. -- Kyotaro Horiguchi NTT Open Source Software Center
Re: Problem while updating a foreign table pointing to a partitioned table on foreign server
(2018/08/21 11:01), Kyotaro HORIGUCHI wrote: At Tue, 14 Aug 2018 20:49:02 +0900, Etsuro Fujita wrote in<5b72c1ae.8010...@lab.ntt.co.jp> (2018/08/09 22:04), Etsuro Fujita wrote: (2018/08/08 17:30), Kyotaro HORIGUCHI wrote: I spent more time looking at the patch. ISTM that the patch well suppresses the effect of the tuple-descriptor expansion by making changes to code in the planner and executor (and ruleutils.c), but I'm still not sure that the patch is the right direction to go in, because ISTM that expanding the tuple descriptor on the fly might be a wart. The exapansion should be safe if the expanded descriptor has the same defitions for base columns and all the extended coulumns are junks. The junk columns should be ignored by unrelated nodes and they are passed safely as far as ForeignModify passes tuples as is from underlying ForeignScan to ForeignUpdate/Delete. I'm not sure that would be really safe. Does that work well when EvalPlanQual, for example? You wrote: Several places seems to be assuming that fdw_scan_tlist may be used foreign scan on simple relation but I didn't find that actually happens. Yeah, currently, postgres_fdw and file_fdw don't use that list for simple foreign table scans, but it could be used to improve the efficiency for those scans, as explained in fdwhandler.sgml: AnotherForeignScan field that can be filled by FDWs isfdw_scan_tlist, which describes the tuples returned by the FDW for this plan node. For simple foreign table scans this can be set toNIL, implying that the returned tuples have the row type declared for the foreign table. A non-NIL value must be a target list (list ofTargetEntrys) containing Vars and/or expressions representing the returned columns. This might be used, for example, to show that the FDW has omitted some columns that it noticed won't be needed for the query. Also, if the FDW can compute expressions used by the query more cheaply than can be done locally, it could add those expressions tofdw_scan_tlist. Note that join plans (created from paths made by GetForeignJoinPaths) must always supplyfdw_scan_tlist to describe the set of columns they will return. https://www.postgresql.org/docs/devel/static/fdw-planning.html Hmm. Thanks for the pointer, it seems to need rewrite. However, it doesn't seem to work for non-join foreign scans, since the core igonres it and uses local table definition. Really? You wrote: I'm not sure whether the following ponits are valid. - If fdw_scan_tlist is used for simple relation scans, this would break the case. (ExecInitForeignScan, set_foreignscan_references) Some FDWs might already use that list for the improved efficiency for simple foreign table scans as explained above, so we should avoid breaking that. I considered to use fdw_scan_tlist in that way but the core is assuming that foreign scans with scanrelid> 0 uses the relation descriptor. Could you elaborate a bit more on this? Do you have any example for that? I don't know such an example, but in my understanding, the core allows the FDW to do that. If we take the Param-based approach suggested by Tom, I suspect there would be no need to worry about at least those things, so I'll try to update your patch as such, if there are no objections from you (or anyone else). PARAM_EXEC is single storage side channel that can work as far as it is set and read while each tuple is handled. In this case postgresExecForeignUpdate/Delete must be called before postgresIterateForeignScan returns the next tuple. An apparent failure case for this usage is the join-update case below. https://www.postgresql.org/message-id/20180605.191032.256535589.horiguchi.kyot...@lab.ntt.co.jp What I have in mind would be to 1) create a tlist that contains not only Vars/PHVs but Params, for each join rel involving the target rel so we ensure that the Params will propagate up through all join plan steps, and 2) convert a join rel's tlist Params into Vars referencing the same Params in the tlists for the outer/inner rels, by setrefs.c. I think that would probably work well even for the case you mentioned above. Maybe I'm missing something, though. Sorry for the delay. Best regards, Etsuro Fujita
Re: Problem while updating a foreign table pointing to a partitioned table on foreign server
Sorry, I sent older version, which is logically same but contains some whitespace problems. I resend only 0003 by this mail. At Fri, 24 Aug 2018 16:51:31 +0900 (Tokyo Standard Time), Kyotaro HORIGUCHI wrote in <20180824.165131.45788857.horiguchi.kyot...@lab.ntt.co.jp> > Hello. > > At Tue, 21 Aug 2018 11:01:32 +0900 (Tokyo Standard Time), Kyotaro HORIGUCHI > wrote in > <20180821.110132.261184472.horiguchi.kyot...@lab.ntt.co.jp> > > > You wrote: > > > >Several places seems to be assuming that fdw_scan_tlist may be > > > >used foreign scan on simple relation but I didn't find that > > > >actually happens. > > > > > > Yeah, currently, postgres_fdw and file_fdw don't use that list for > > > simple foreign table scans, but it could be used to improve the > > > efficiency for those scans, as explained in fdwhandler.sgml: > ... > > I'll put more consideration on using fdw_scan_tlist in the > > documented way. > > Done. postgres_fdw now generates full fdw_scan_tlist (as > documented) for foreign relations with junk columns, but a small > change in core was needed. However it is far less invasive than > the previous version and I believe that it dones't harm > maybe-existing use of fdw_scan_tlist on non-join rels. > > The previous patch didn't show "tableoid" in the Output list (as > "") of explain output but this does correctly by > referring to rte->eref->colnames. I believe no other FDW has > expanded foreign relation even if it uses fdw_scan_tlist for > ForeignScan on a base relation so it won't harm them. > > Since this uses fdw_scan_tlist so it is theoretically > back-patchable back to 9.6. This patch applies on top of the > current master. > > Please find the attached three files. > > 0001-Add-test-for-postgres_fdw-foreign-parition-update.patch > > This should fail for unpatched postgres_fdw. (Just for demonstration) > > 0002-Core-side-modification-for-PgFDW-foreign-update-fix.patch > > Core side change which allows fdw_scan_tlist to have extra > columns that is not defined in the base relation. > > 0003-Fix-of-foreign-update-bug-of-PgFDW.patch > > Fix of postgres_fdw for this problem. > > 0004-Regtest-change-for-PgFDW-foreign-update-fix.patch > > Regression test change separated for readability. regards. -- Kyotaro Horiguchi NTT Open Source Software Center >From b95571ac7cf15101bfa045354a82befe074ecc55 Mon Sep 17 00:00:00 2001 From: Kyotaro Horiguchi Date: Fri, 24 Aug 2018 16:17:24 +0900 Subject: [PATCH 3/4] Fix of foreign update bug of PgFDW Postgres_fdw wrongly behavoes in updating foreign tables on a remote partitioned table when direct modify is not used. This is because postgres_fdw is forgetting that two different tuples with the same ctid may come in the case. With this patch it uses remote tableoid in addition to ctid to distinguish a remote tuple. --- contrib/postgres_fdw/deparse.c | 149 +++--- contrib/postgres_fdw/postgres_fdw.c | 291 +++- 2 files changed, 344 insertions(+), 96 deletions(-) diff --git a/contrib/postgres_fdw/deparse.c b/contrib/postgres_fdw/deparse.c index 6001f4d25e..c4cd6a7249 100644 --- a/contrib/postgres_fdw/deparse.c +++ b/contrib/postgres_fdw/deparse.c @@ -1037,6 +1037,15 @@ deparseSelectSql(List *tlist, bool is_subquery, List **retrieved_attrs, */ deparseExplicitTargetList(tlist, false, retrieved_attrs, context); } + else if (tlist != NIL) + { + /* + * The given tlist is that of base relation's expanded with junk + * columns. + */ + context->params_list = NULL; + deparseExplicitTargetList(tlist, false, retrieved_attrs, context); + } else { /* @@ -1088,6 +1097,42 @@ deparseFromExpr(List *quals, deparse_expr_cxt *context) } } +/* + * Adds one element in target/returning list if it is in attrs_used. + * + * If deparsestr is given, just use it. Otherwise resolves the name using rte. + */ +static inline void +deparseAddTargetListItem(StringInfo buf, + List **retrieved_attrs, Bitmapset *attrs_used, + Index rtindex, AttrNumber attnum, + char *deparsestr, RangeTblEntry *rte, + bool is_returning, bool qualify_col, + bool have_wholerow, bool *first) +{ + if (!have_wholerow && + !bms_is_member(attnum - FirstLowInvalidHeapAttributeNumber, attrs_used)) + return; + + if (!*first) + appendStringInfoString(buf, ", "); + else if (is_returning) + appendStringInfoString(buf, " RETURNING "); + *first = false; + + if (deparsestr) + { + if (qualify_col) + ADD_REL_QUALIFIER(buf, rtindex); + + appendStringInfoString(buf, deparsestr); + } + else + deparseColumnRef(buf, rtindex, attnum, rte, qualify_col); + + *retrieved_attrs = lappend_int(*retrieved_attrs, attnum); +} + /* * Emit a target list that retrieves the columns specified in attrs_used. * This is used for both SELECT and RETURNING targetlists; the is_returning @@ -1128,58 +1173,28 @@ deparseTargetList(StringInfo buf, if (attr->attisdropped) continue; - if
Re: Problem while updating a foreign table pointing to a partitioned table on foreign server
Hello. At Tue, 21 Aug 2018 11:01:32 +0900 (Tokyo Standard Time), Kyotaro HORIGUCHI wrote in <20180821.110132.261184472.horiguchi.kyot...@lab.ntt.co.jp> > > You wrote: > > >Several places seems to be assuming that fdw_scan_tlist may be > > >used foreign scan on simple relation but I didn't find that > > >actually happens. > > > > Yeah, currently, postgres_fdw and file_fdw don't use that list for > > simple foreign table scans, but it could be used to improve the > > efficiency for those scans, as explained in fdwhandler.sgml: ... > I'll put more consideration on using fdw_scan_tlist in the > documented way. Done. postgres_fdw now generates full fdw_scan_tlist (as documented) for foreign relations with junk columns having a small change in core side. However it is far less invasive than the previous version and I believe that it dones't harm maybe-existing use of fdw_scan_tlist on non-join rels (that is, in the case of a subset of relation columns). The previous patch didn't show "tableoid" in the Output list (as "") of explain output but this does correctly by referring to rte->eref->colnames. I believe no other FDW has expanded foreign relation even if it uses fdw_scan_tlist for ForeignScan on a base relation so it won't harm them. One arguable behavior change is about wholrow vars. Currently it refferes local tuple with all columns but it is explicitly fetched as ROW() after this patch applied. This could be fixed but not just now. Part of 0004-: - Output: f1, ''::text, ctid, rem1.* - Remote SQL: SELECT f1, f2, ctid FROM public.loc1 FOR UPDATE + Output: f1, ''::text, tableoid, ctid, rem1.* + Remote SQL: SELECT f1, tableoid, ctid, ROW(f1, f2) FROM public.loc1 FOR UPDATE Since this uses fdw_scan_tlist so it is theoretically back-patchable back to 9.6. This patch applies on top of the current master. Please find the attached three files. 0001-Add-test-for-postgres_fdw-foreign-parition-update.patch This should fail for unpatched postgres_fdw. (Just for demonstration) 0002-Core-side-modification-for-PgFDW-foreign-update-fix.patch Core side change which allows fdw_scan_tlist to have extra columns that is not defined in the base relation. 0003-Fix-of-foreign-update-bug-of-PgFDW.patch Fix of postgres_fdw for this problem. 0004-Regtest-change-for-PgFDW-foreign-update-fix.patch Regression test change separated for readability. regards. -- Kyotaro Horiguchi NTT Open Source Software Center >From fba71c319d1008f6dc198b8585c41f7ff0a708f1 Mon Sep 17 00:00:00 2001 From: Kyotaro Horiguchi Date: Fri, 24 Aug 2018 15:39:14 +0900 Subject: [PATCH 1/4] Add test for postgres_fdw foreign parition update This add a test for the failure of updating foreign partitioned table due to lack of distinction of remote child tables. This should fail. --- contrib/postgres_fdw/expected/postgres_fdw.out | 62 ++ contrib/postgres_fdw/sql/postgres_fdw.sql | 30 + 2 files changed, 92 insertions(+) diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out index d912bd9d54..dd4864f006 100644 --- a/contrib/postgres_fdw/expected/postgres_fdw.out +++ b/contrib/postgres_fdw/expected/postgres_fdw.out @@ -7568,6 +7568,68 @@ drop table loct1; drop table loct2; drop table parent; -- === +-- test update foreign partiton table +-- === +CREATE TABLE p1 (a int, b int); +CREATE TABLE c1 (LIKE p1) INHERITS (p1); +NOTICE: merging column "a" with inherited definition +NOTICE: merging column "b" with inherited definition +CREATE TABLE c2 (LIKE p1) INHERITS (p1); +NOTICE: merging column "a" with inherited definition +NOTICE: merging column "b" with inherited definition +CREATE FOREIGN TABLE fp1 (a int, b int) + SERVER loopback OPTIONS (table_name 'p1'); +INSERT INTO c1 VALUES (0, 1); +INSERT INTO c2 VALUES (1, 1); +SELECT tableoid::int - (SELECT min(tableoid) FROM fp1)::int AS toiddiff, ctid, * FROM fp1; + toiddiff | ctid | a | b +--+---+---+--- +0 | (0,1) | 0 | 1 +0 | (0,1) | 1 | 1 +(2 rows) + +-- random() causes non-direct foreign update +EXPLAIN (VERBOSE, COSTS OFF) + UPDATE fp1 SET b = b + 1 WHERE a = 0 and random() <= 1; + QUERY PLAN +- + Update on public.fp1 + Remote SQL: UPDATE public.p1 SET b = $2 WHERE ctid = $1 + -> Foreign Scan on public.fp1 + Output: a, (b + 1), ctid + Filter: (random() <= '1'::double precision) + Remote SQL: SELECT a, b, ctid FROM public.p1 WHERE ((a = 0)) FOR UPDATE +(6 rows) + +UPDATE fp1 SET b = b + 1 WHERE a = 0 and random() <= 1; +-- Only one tuple should be updated +SELECT tableoid::int - (SELECT min(tableoid) FROM fp1)::int AS toiddiff, ctid, * FROM
Re: Problem while updating a foreign table pointing to a partitioned table on foreign server
Fujita-san thank you for the comment. At Tue, 14 Aug 2018 20:49:02 +0900, Etsuro Fujita wrote in <5b72c1ae.8010...@lab.ntt.co.jp> > (2018/08/09 22:04), Etsuro Fujita wrote: > > (2018/08/08 17:30), Kyotaro HORIGUCHI wrote: > >> I choosed to expand tuple descriptor for junk column added to > >> foreign relaions. We might be better to have new member in > >> ForeignScan but I didn't so that we can backpatch it. > > > > I've not looked at the patch closely yet, but I'm not sure that it's a > > good idea to expand the tuple descriptor of the target relation on the > > fly so that it contains the remotetableoid as a non-system attribute > > of > > the target table. My concern is: is there not any risk in affecting > > some > > other part of the planner and/or the executor? (I was a bit surprised > > that the patch passes the regression tests successfully.) Yeah, me too. > I spent more time looking at the patch. ISTM that the patch well > suppresses the effect of the tuple-descriptor expansion by making > changes to code in the planner and executor (and ruleutils.c), but I'm > still not sure that the patch is the right direction to go in, because > ISTM that expanding the tuple descriptor on the fly might be a wart. The non-Var nodes seems to me the same as PARAM_EXEC, which works imperfectly for this purpose since tableoid must be in one-to-one correspondence with a tuple but differently from joins the correspondence is stired up by intermedate executor nodes in some cases. The exapansion should be safe if the expanded descriptor has the same defitions for base columns and all the extended coulumns are junks. The junk columns should be ignored by unrelated nodes and they are passed safely as far as ForeignModify passes tuples as is from underlying ForeignScan to ForeignUpdate/Delete. > >> What the patch does are: > >> > >> - This abuses ForeignScan->fdw_scan_tlist to store the additional > >> junk columns when foreign simple relation scan (that is, not a > >> join). > > > > I think that this issue was introduced in 9.3, which added > > postgres_fdw > > in combination with support for writable foreign tables, but > > fdw_scan_tlist was added to 9.5 as part of join-pushdown > > infrastructure, > > so my concern is that we might not be able to backpatch your patch to > > 9.3 and 9.4. Right. So I'm thinking that the older versions just get error for the failure case instead of get it work anyhow. Or we might be able to use tableoid in tuple header without emitting the local oid to users but I haven't find the way to do that. > Another concern about this: > > You wrote: > >Several places seems to be assuming that fdw_scan_tlist may be > >used foreign scan on simple relation but I didn't find that > >actually happens. > > Yeah, currently, postgres_fdw and file_fdw don't use that list for > simple foreign table scans, but it could be used to improve the > efficiency for those scans, as explained in fdwhandler.sgml: > > Another ForeignScan field that can be filled > by FDWs > is fdw_scan_tlist, which describes the > tuples returned by > the FDW for this plan node. For simple foreign table scans this can > be > set to NIL, implying that the returned tuples have > the > row type declared for the foreign table. A non-NIL > value must be a > target list (list of TargetEntrys) containing > Vars and/or > expressions representing the returned columns. This might be used, > for > example, to show that the FDW has omitted some columns that it noticed > won't be needed for the query. Also, if the FDW can compute > expressions > used by the query more cheaply than can be done locally, it could add > those expressions to fdw_scan_tlist. Note > that join > plans (created from paths made by > GetForeignJoinPaths) must > always supply fdw_scan_tlist to describe > the set of > columns they will return. https://www.postgresql.org/docs/devel/static/fdw-planning.html Hmm. Thanks for the pointer, it seems to need rewrite. However, it doesn't seem to work for non-join foreign scans, since the core igonres it and uses local table definition. This "tweak" won't be needed if it worked. > You wrote: > > I'm not sure whether the following ponits are valid. > > > > - If fdw_scan_tlist is used for simple relation scans, this would > >break the case. (ExecInitForeignScan, set_foreignscan_references) > > Some FDWs might already use that list for the improved efficiency for > simple foreign table scans as explained above, so we should avoid > breaking that. I considered to use fdw_scan_tlist in that way but the core is assuming that foreign scans with scanrelid > 0 uses the relation descriptor. Do you have any example for that? > If we take the Param-based approach suggested by Tom, I suspect there > would be no need to worry about at least those things, so I'll try to > update your
Re: Problem while updating a foreign table pointing to a partitioned table on foreign server
(2018/08/09 22:04), Etsuro Fujita wrote: (2018/08/08 17:30), Kyotaro HORIGUCHI wrote: I choosed to expand tuple descriptor for junk column added to foreign relaions. We might be better to have new member in ForeignScan but I didn't so that we can backpatch it. I've not looked at the patch closely yet, but I'm not sure that it's a good idea to expand the tuple descriptor of the target relation on the fly so that it contains the remotetableoid as a non-system attribute of the target table. My concern is: is there not any risk in affecting some other part of the planner and/or the executor? (I was a bit surprised that the patch passes the regression tests successfully.) I spent more time looking at the patch. ISTM that the patch well suppresses the effect of the tuple-descriptor expansion by making changes to code in the planner and executor (and ruleutils.c), but I'm still not sure that the patch is the right direction to go in, because ISTM that expanding the tuple descriptor on the fly might be a wart. What the patch does are: - This abuses ForeignScan->fdw_scan_tlist to store the additional junk columns when foreign simple relation scan (that is, not a join). I think that this issue was introduced in 9.3, which added postgres_fdw in combination with support for writable foreign tables, but fdw_scan_tlist was added to 9.5 as part of join-pushdown infrastructure, so my concern is that we might not be able to backpatch your patch to 9.3 and 9.4. Another concern about this: You wrote: >Several places seems to be assuming that fdw_scan_tlist may be >used foreign scan on simple relation but I didn't find that >actually happens. Yeah, currently, postgres_fdw and file_fdw don't use that list for simple foreign table scans, but it could be used to improve the efficiency for those scans, as explained in fdwhandler.sgml: Another ForeignScan field that can be filled by FDWs is fdw_scan_tlist, which describes the tuples returned by the FDW for this plan node. For simple foreign table scans this can be set to NIL, implying that the returned tuples have the row type declared for the foreign table. A non-NIL value must be a target list (list of TargetEntrys) containing Vars and/or expressions representing the returned columns. This might be used, for example, to show that the FDW has omitted some columns that it noticed won't be needed for the query. Also, if the FDW can compute expressions used by the query more cheaply than can be done locally, it could add those expressions to fdw_scan_tlist. Note that join plans (created from paths made by GetForeignJoinPaths) must always supply fdw_scan_tlist to describe the set of columns they will return. You wrote: > I'm not sure whether the following ponits are valid. > > - If fdw_scan_tlist is used for simple relation scans, this would >break the case. (ExecInitForeignScan, set_foreignscan_references) Some FDWs might already use that list for the improved efficiency for simple foreign table scans as explained above, so we should avoid breaking that. If we take the Param-based approach suggested by Tom, I suspect there would be no need to worry about at least those things, so I'll try to update your patch as such, if there are no objections from you (or anyone else). Best regards, Etsuro Fujita
Re: Problem while updating a foreign table pointing to a partitioned table on foreign server
(2018/08/08 17:30), Kyotaro HORIGUCHI wrote: Please find the attached. Thanks for the patch, Horiguchi-san! At Fri, 3 Aug 2018 11:48:38 +0530, Ashutosh Bapat wrote in The purpose of non-Var node is to avoid adding the attribute to relation descriptor. Idea is to create a new node, which will act as a place holder for table oid or row id (whatever) to be fetched from the foreign server. I think so too. I don't understand why do you think we need it to be added to the relation descriptor. I don't understand that either. I choosed to expand tuple descriptor for junk column added to foreign relaions. We might be better to have new member in ForeignScan but I didn't so that we can backpatch it. I've not looked at the patch closely yet, but I'm not sure that it's a good idea to expand the tuple descriptor of the target relation on the fly so that it contains the remotetableoid as a non-system attribute of the target table. My concern is: is there not any risk in affecting some other part of the planner and/or the executor? (I was a bit surprised that the patch passes the regression tests successfully.) To avoid expanding the tuple descriptor, I'm wondering whether we could add a Param representing remotetableoid, not a Var undefined anywhere in the system catalogs, as mentioned above? What the patch does are: - This abuses ForeignScan->fdw_scan_tlist to store the additional junk columns when foreign simple relation scan (that is, not a join). I think that this issue was introduced in 9.3, which added postgres_fdw in combination with support for writable foreign tables, but fdw_scan_tlist was added to 9.5 as part of join-pushdown infrastructure, so my concern is that we might not be able to backpatch your patch to 9.3 and 9.4. That's it for now. Best regards, Etsuro Fujita
Re: Problem while updating a foreign table pointing to a partitioned table on foreign server
Hello. Please find the attached. At Fri, 3 Aug 2018 11:48:38 +0530, Ashutosh Bapat wrote in > The purpose of non-Var node is to avoid adding the attribute to > relation descriptor. Idea is to create a new node, which will act as a > place holder for table oid or row id (whatever) to be fetched from the > foreign server. I don't understand why do you think we need it to be > added to the relation descriptor. I choosed to expand tuple descriptor for junk column added to foreign relaions. We might be better to have new member in ForeignScan but I didn't so that we can backpatch it. What the patch does are: - This abuses ForeignScan->fdw_scan_tlist to store the additional junk columns when foreign simple relation scan (that is, not a join). Several places seems to be assuming that fdw_scan_tlist may be used foreign scan on simple relation but I didn't find that actually happens. This let us avoid adding new data members to core data structure. Separate member would be preferable for new version. - The remote OID request is added to targetlist as a non-system junk column. get_relation_info exands per-column storage in creating RelOptInfo so that the additional junk columns can be handled. - ExecInitForeignScan is changed so that it expands created tuple descriptor if it finds the junk columns stored in fdw_scan_tlist so that make_tuple_from_result_row can store them. ( ExecEvalWholeRowVar needed to modify so that it ignores the expanded portion of tuple descriptor.) I'm not sure whether the following ponits are valid. - If fdw_scan_tlist is used for simple relation scans, this would break the case. (ExecInitForeignScan, set_foreignscan_references) - I'm using the name "tableoid" for the junk column but it also can be used in user query. The two points to different targets so it doesn't matter at all, except that it makes a bit confusing explain output. - Explain stuff doesn't have a crue for the name of the added junk. It is shown as in EXPLAIN output. | Update on public.fp | Remote SQL: UPDATE public.p SET b = $3 WHERE tableoid = $1 AND ctid = $2 | -> Foreign Scan on public.fp | Output: a, (b + 1), "", ctid | Filter: (random() <= '1'::double precision) | Remote SQL: SELECT a, b, tableoid AS __remote_tableoid, ctid | FROM public.p WHERE ((a = 0)) FOR UPDATE regards. -- Kyotaro Horiguchi NTT Open Source Software Center >From fe660a5ab953d68a671861479fce0b3e60a57cd8 Mon Sep 17 00:00:00 2001 From: Kyotaro Horiguchi Date: Wed, 8 Aug 2018 12:14:58 +0900 Subject: [PATCH 2/2] Regression test for update/delete on foreign partitioned table Add test for foreign update on remote partitioned tables. --- contrib/postgres_fdw/expected/postgres_fdw.out | 221 - contrib/postgres_fdw/sql/postgres_fdw.sql | 27 +++ 2 files changed, 167 insertions(+), 81 deletions(-) diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out index f5498c62bd..9ae329ab4f 100644 --- a/contrib/postgres_fdw/expected/postgres_fdw.out +++ b/contrib/postgres_fdw/expected/postgres_fdw.out @@ -5497,15 +5497,15 @@ INSERT INTO ft2 (c1,c2,c3) SELECT id, id % 10, to_char(id, 'FM0') FROM generate_series(2001, 2010) id; EXPLAIN (verbose, costs off) UPDATE ft2 SET c3 = 'bar' WHERE postgres_fdw_abs(c1) > 2000 RETURNING *;-- can't be pushed down -QUERY PLAN --- + QUERY PLAN + Update on public.ft2 Output: c1, c2, c3, c4, c5, c6, c7, c8 - Remote SQL: UPDATE "S 1"."T 1" SET c3 = $2 WHERE ctid = $1 RETURNING "C 1", c2, c3, c4, c5, c6, c7, c8 + Remote SQL: UPDATE "S 1"."T 1" SET c3 = $3 WHERE tableoid = $1 AND ctid = $2 RETURNING "C 1", c2, c3, c4, c5, c6, c7, c8 -> Foreign Scan on public.ft2 - Output: c1, c2, NULL::integer, 'bar'::text, c4, c5, c6, c7, c8, ctid + Output: c1, c2, NULL::integer, 'bar'::text, c4, c5, c6, c7, c8, "", ctid Filter: (postgres_fdw_abs(ft2.c1) > 2000) - Remote SQL: SELECT "C 1", c2, c4, c5, c6, c7, c8, ctid FROM "S 1"."T 1" FOR UPDATE + Remote SQL: SELECT "C 1", c2, c4, c5, c6, c7, c8, tableoid, ctid FROM "S 1"."T 1" FOR UPDATE (7 rows) UPDATE ft2 SET c3 = 'bar' WHERE postgres_fdw_abs(c1) > 2000 RETURNING *; @@ -5532,13 +5532,13 @@ UPDATE ft2 SET c3 = 'baz'
Re: Problem while updating a foreign table pointing to a partitioned table on foreign server
On Fri, Aug 3, 2018 at 9:43 AM, Kyotaro HORIGUCHI wrote: > > Something-like-but-other-hanVar node? I'm not sure it is needed, > because whatever node we add to the relation-tlist, we must add > the correspondence to the relation descriptor. And if we do that, > a Var works to point it. (Am I correctly understanding?) > The purpose of non-Var node is to avoid adding the attribute to relation descriptor. Idea is to create a new node, which will act as a place holder for table oid or row id (whatever) to be fetched from the foreign server. I don't understand why do you think we need it to be added to the relation descriptor. -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company
Re: Problem while updating a foreign table pointing to a partitioned table on foreign server
Hello. At Tue, 26 Jun 2018 10:19:45 +0530, Ashutosh Bapat wrote in > On Tue, Jun 26, 2018 at 9:59 AM, Kyotaro HORIGUCHI > wrote: > > > >> But good thing is because of join and aggregate > >> push-down we already have ability to push arbitrary kinds of nodes > >> down to the foreign server through the targetlist. We should be able > >> to leverage that capability. It looks like a lot of change, which > >> again doesn't seem to be back-portable. > > > > After some struggles as you know, I agree to the opinion. As my > > first impression, giving (physical) base relations (*1) an > > ability to have junk attribute is rather straightforward. > > By giving base relations an ability to have junk attribute you mean to > add junk attribute in the targetlist of DML, something like > postgresAddForeignUpdateTargets(). You seem to be fine with the new Maybe. > node approach described above. Just confirm. Something-like-but-other-hanVar node? I'm not sure it is needed, because whatever node we add to the relation-tlist, we must add the correspondence to the relation descriptor. And if we do that, a Var works to point it. (Am I correctly understanding?) > > > > Well, is our conclusion here like this? > > > > - For existing versions, check the errorneous situation and ERROR out. > > (documentaion will be needed.) > > > > - For 12, we try the above thing. > > I think we have to see how invasive the fix is, and whether it's > back-portable. If it's back-portable, we back-port it and the problem > is fixed in previous versions as well. If not, we fix previous > versions to ERROR out instead of corrupting the database. Mmm. Ok, I try to make a patch. Please wait for a while. regards. -- Kyotaro Horiguchi NTT Open Source Software Center
Re: Problem while updating a foreign table pointing to a partitioned table on foreign server
Hello, thank you for the comment. At Wed, 01 Aug 2018 21:21:57 +0900, Etsuro Fujita wrote in <5b61a5e5.6010...@lab.ntt.co.jp> > (2018/06/12 12:19), Kyotaro HORIGUCHI wrote: > > I have demonstrated and actually shown a problem of the > > PARAM_EXEC case. > > > A. Just detecting and reporting/erroring the problematic case. > > > > B. Giving to Sort-like nodes an ability to convert PARAMS into > > junk columns. > > > > C. Adding a space for 64bit tuple identifier in a tuple header. > > > > D. Somehow inhibiting tuple-storing node like Sort between. (This > >should break something working.) > > > > > > B seems to have possibility to fix this but I haven't have a > > concrete design of it. > > I'm just wondering whether we could modify the planner (or executor) > so that Params can propagate up to the ModifyTable node through all > joins like Vars/PHVs. Yeah, it's mentioned somewhere upthread. The most large obstacle in my view is the fact that the tuple descriptor for an RTE_RELATION baserel is tied with the relation definition. So we need to separate the two to use "(junk) Vars/PHVs" to do that purpose. The four above is based on the premise of EXEC_PARAMS. regards. -- Kyotaro Horiguchi NTT Open Source Software Center
Re: Problem while updating a foreign table pointing to a partitioned table on foreign server
(2018/06/12 12:19), Kyotaro HORIGUCHI wrote: I have demonstrated and actually shown a problem of the PARAM_EXEC case. A. Just detecting and reporting/erroring the problematic case. B. Giving to Sort-like nodes an ability to convert PARAMS into junk columns. C. Adding a space for 64bit tuple identifier in a tuple header. D. Somehow inhibiting tuple-storing node like Sort between. (This should break something working.) B seems to have possibility to fix this but I haven't have a concrete design of it. I'm just wondering whether we could modify the planner (or executor) so that Params can propagate up to the ModifyTable node through all joins like Vars/PHVs. Best regards, Etsuro Fujita
Re: Problem while updating a foreign table pointing to a partitioned table on foreign server
On Tue, Jun 26, 2018 at 9:59 AM, Kyotaro HORIGUCHI wrote: > >> But good thing is because of join and aggregate >> push-down we already have ability to push arbitrary kinds of nodes >> down to the foreign server through the targetlist. We should be able >> to leverage that capability. It looks like a lot of change, which >> again doesn't seem to be back-portable. > > After some struggles as you know, I agree to the opinion. As my > first impression, giving (physical) base relations (*1) an > ability to have junk attribute is rather straightforward. By giving base relations an ability to have junk attribute you mean to add junk attribute in the targetlist of DML, something like postgresAddForeignUpdateTargets(). You seem to be fine with the new node approach described above. Just confirm. > > Well, is our conclusion here like this? > > - For existing versions, check the errorneous situation and ERROR out. > (documentaion will be needed.) > > - For 12, we try the above thing. I think we have to see how invasive the fix is, and whether it's back-portable. If it's back-portable, we back-port it and the problem is fixed in previous versions as well. If not, we fix previous versions to ERROR out instead of corrupting the database. -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company
Re: Problem while updating a foreign table pointing to a partitioned table on foreign server
Hello. At Fri, 15 Jun 2018 11:19:21 +0530, Ashutosh Bapat wrote in > On Tue, Jun 12, 2018 at 8:49 AM, Kyotaro HORIGUCHI > wrote: > > Thanks for the discussion. > > > > At Thu, 7 Jun 2018 19:16:57 +0530, Ashutosh Bapat > > wrote in > > > >> What's the problem that you faced? > > > > The required condtion for PARAM_EXEC to work is that executor > > ensures the correspondence between the setter the reader of a > > param like ExecNestLoop is doing. The Sort node breaks the > > correspondence between the tuple obtained from the Foreign Scan > > and that ForeignUpdate is updating. Specifically Foreign Update > > upadtes the first tuple using the tableoid for the last tuple > > from the Foreign Scan. > > Ok. Thanks for the explanation. > > > > >> > Even if this worked fine, it cannot be back-patched. We need an > >> > extra storage moves together with tuples or prevent sorts or > >> > something like from being inserted there. > >> > >> I think your approach still has the same problem that it's abusing the > >> tableOid field in the heap tuple to store tableoid from the remote as > >> well as local table. That's what Robert and Tom objected to [1], [2] > > > > It's wrong understanding. PARAM_EXEC conveys remote tableoids > > outside tuples and each tuple is storing correct (= local) > > tableoid. > > In the patch I saw that we were setting tableoid field of HeapTuple to > the remote table oid somewhere. Hence the comment. I might be wrong. You should have seen make_tuple_from_result_row. The patch sets real tableOid to returning tuples since I didn't find an usable storage for the per-tuple value. Afterwards the parameters are set from tup->t_tableOid in postgresIterateForeignScan. ForeignNext overwrites t_tableOid of returned tuples with the foreign table's OID if system column is requested. > >> > At Fri, 1 Jun 2018 10:21:39 -0400, Ashutosh Bapat > >> > wrote in > >> > > >> >> I am not suggesting to commit 0003 in my patch set, but just 0001 and > >> >> 0002 which just raise an error when multiple rows get updated when > >> >> only one row is expected to be updated. > >> > > >> > So I agree to commit the two at least in order to prevent doing > >> > wrong silently. > >> > >> I haven't heard any committer's opinion on this one yet. > >> > >> [1] > >> https://www.postgresql.org/message-id/CA+TgmobUHHZiDR=hcu4n30yi9_pe175ittbfk6t8jxzwkra...@mail.gmail.com > >> [2] https://www.postgresql.org/message-id/7936.1526590932%40sss.pgh.pa.us > > > > Agreed. We need any comment to proceed. > > > > I have demonstrated and actually shown a problem of the > > PARAM_EXEC case. (It seems a bit silly that I actually found the > > problem after it became almost workable, though..) > > I think the general idea behind Tom's suggestion is that we have to > use some node other than Var node when we update the targetlist with > junk columns. He suggested Param since that gives us some place to > store remote tableoid. But if that's not working, another idea (that > Tom mentioned during our discussion at PGCon) is to invent a new node > type like ForeignTableOid or something like that, which gets deparsed > to "tableoid" and evaluated to the table oid on the foreign server. > That will not work as it is since postgres_fdw code treats a foreign > table almost like a local table in many ways e.g. it uses attr_used to I think treating a foreign table as a local object is right. But anyway it doesn't work. > know which attributes are to be requested from the foreign server, > build_tlist_to_deparse() only pulls Var nodes from the targelist of > foreign table and so on. All of those assumptions will need to change > with this approach. Maybe. I agree. > But good thing is because of join and aggregate > push-down we already have ability to push arbitrary kinds of nodes > down to the foreign server through the targetlist. We should be able > to leverage that capability. It looks like a lot of change, which > again doesn't seem to be back-portable. After some struggles as you know, I agree to the opinion. As my first impression, giving (physical) base relations (*1) an ability to have junk attribute is rather straightforward. Well, is our conclusion here like this? - For existing versions, check the errorneous situation and ERROR out. (documentaion will be needed.) - For 12, we try the above thing. regards. -- Kyotaro Horiguchi NTT Open Source Software Center
Re: Problem while updating a foreign table pointing to a partitioned table on foreign server
On Tue, Jun 12, 2018 at 8:49 AM, Kyotaro HORIGUCHI wrote: > Thanks for the discussion. > > At Thu, 7 Jun 2018 19:16:57 +0530, Ashutosh Bapat > wrote in > >> On Tue, Jun 5, 2018 at 3:40 PM, Kyotaro HORIGUCHI >> wrote: >> > Hello. >> > >> > At Mon, 04 Jun 2018 20:58:28 +0900 (Tokyo Standard Time), Kyotaro >> > HORIGUCHI wrote in >> > <20180604.205828.208262556.horiguchi.kyot...@lab.ntt.co.jp> >> >> It fails on some join-pushdown cases since it doesn't add tid >> >> columns to join tlist. I suppose that build_tlist_to_deparse >> >> needs something but I'll consider further tomorrow. >> > >> > I made it work with a few exceptions and bumped. PARAM_EXEC >> > doesn't work at all in a case where Sort exists between >> > ForeignUpdate and ForeignScan. >> > >> > = >> > explain (verbose, costs off) >> > update bar set f2 = f2 + 100 >> > from >> > ( select f1 from foo union all select f1+3 from foo ) ss >> > where bar.f1 = ss.f1; >> > QUERY PLAN >> > - >> > Update on public.bar >> >Update on public.bar >> >Foreign Update on public.bar2 >> > Remote SQL: UPDATE public.loct2 SET f2 = $3 WHERE tableoid = $1 AND >> > ctid = $2 >> > ... >> >-> Merge Join >> > Output: bar2.f1, (bar2.f2 + 100), bar2.f3, (ROW(foo.f1)) >> > Merge Cond: (bar2.f1 = foo.f1) >> > -> Sort >> >Output: bar2.f1, bar2.f2, bar2.f3, bar2.tableoid, bar2.ctid >> >Sort Key: bar2.f1 >> >-> Foreign Scan on public.bar2 >> > Output: bar2.f1, bar2.f2, bar2.f3, bar2.tableoid, >> > bar2.ctid >> > Remote SQL: SELECT f1, f2, f3, ctid, tableoid FROM >> > public.loct2 FOR UPDATE >> > = >> >> What's the problem that you faced? > > The required condtion for PARAM_EXEC to work is that executor > ensures the correspondence between the setter the reader of a > param like ExecNestLoop is doing. The Sort node breaks the > correspondence between the tuple obtained from the Foreign Scan > and that ForeignUpdate is updating. Specifically Foreign Update > upadtes the first tuple using the tableoid for the last tuple > from the Foreign Scan. Ok. Thanks for the explanation. > >> > Even if this worked fine, it cannot be back-patched. We need an >> > extra storage moves together with tuples or prevent sorts or >> > something like from being inserted there. >> >> I think your approach still has the same problem that it's abusing the >> tableOid field in the heap tuple to store tableoid from the remote as >> well as local table. That's what Robert and Tom objected to [1], [2] > > It's wrong understanding. PARAM_EXEC conveys remote tableoids > outside tuples and each tuple is storing correct (= local) > tableoid. In the patch I saw that we were setting tableoid field of HeapTuple to the remote table oid somewhere. Hence the comment. I might be wrong. > >> > At Fri, 1 Jun 2018 10:21:39 -0400, Ashutosh Bapat >> > wrote in >> > >> >> I am not suggesting to commit 0003 in my patch set, but just 0001 and >> >> 0002 which just raise an error when multiple rows get updated when >> >> only one row is expected to be updated. >> > >> > So I agree to commit the two at least in order to prevent doing >> > wrong silently. >> >> I haven't heard any committer's opinion on this one yet. >> >> [1] >> https://www.postgresql.org/message-id/CA+TgmobUHHZiDR=hcu4n30yi9_pe175ittbfk6t8jxzwkra...@mail.gmail.com >> [2] https://www.postgresql.org/message-id/7936.1526590932%40sss.pgh.pa.us > > Agreed. We need any comment to proceed. > > I have demonstrated and actually shown a problem of the > PARAM_EXEC case. (It seems a bit silly that I actually found the > problem after it became almost workable, though..) I think the general idea behind Tom's suggestion is that we have to use some node other than Var node when we update the targetlist with junk columns. He suggested Param since that gives us some place to store remote tableoid. But if that's not working, another idea (that Tom mentioned during our discussion at PGCon) is to invent a new node type like ForeignTableOid or something like that, which gets deparsed to "tableoid" and evaluated to the table oid on the foreign server. That will not work as it is since postgres_fdw code treats a foreign table almost like a local table in many ways e.g. it uses attr_used to know which attributes are to be requested from the foreign server, build_tlist_to_deparse() only pulls Var nodes from the targelist of foreign table and so on. All of those assumptions will need to change with this approach. But good thing is because of join and aggregate push-down we already have ability to push arbitrary kinds of nodes down to the foreign server through the targetlist. We should be able to leverage that capability. It looks like a lot of change, which again doesn't seem to be back-portable. -- Best
Re: Problem while updating a foreign table pointing to a partitioned table on foreign server
Thanks for the discussion. At Thu, 7 Jun 2018 19:16:57 +0530, Ashutosh Bapat wrote in > On Tue, Jun 5, 2018 at 3:40 PM, Kyotaro HORIGUCHI > wrote: > > Hello. > > > > At Mon, 04 Jun 2018 20:58:28 +0900 (Tokyo Standard Time), Kyotaro HORIGUCHI > > wrote in > > <20180604.205828.208262556.horiguchi.kyot...@lab.ntt.co.jp> > >> It fails on some join-pushdown cases since it doesn't add tid > >> columns to join tlist. I suppose that build_tlist_to_deparse > >> needs something but I'll consider further tomorrow. > > > > I made it work with a few exceptions and bumped. PARAM_EXEC > > doesn't work at all in a case where Sort exists between > > ForeignUpdate and ForeignScan. > > > > = > > explain (verbose, costs off) > > update bar set f2 = f2 + 100 > > from > > ( select f1 from foo union all select f1+3 from foo ) ss > > where bar.f1 = ss.f1; > > QUERY PLAN > > - > > Update on public.bar > >Update on public.bar > >Foreign Update on public.bar2 > > Remote SQL: UPDATE public.loct2 SET f2 = $3 WHERE tableoid = $1 AND > > ctid = $2 > > ... > >-> Merge Join > > Output: bar2.f1, (bar2.f2 + 100), bar2.f3, (ROW(foo.f1)) > > Merge Cond: (bar2.f1 = foo.f1) > > -> Sort > >Output: bar2.f1, bar2.f2, bar2.f3, bar2.tableoid, bar2.ctid > >Sort Key: bar2.f1 > >-> Foreign Scan on public.bar2 > > Output: bar2.f1, bar2.f2, bar2.f3, bar2.tableoid, > > bar2.ctid > > Remote SQL: SELECT f1, f2, f3, ctid, tableoid FROM > > public.loct2 FOR UPDATE > > = > > What's the problem that you faced? The required condtion for PARAM_EXEC to work is that executor ensures the correspondence between the setter the reader of a param like ExecNestLoop is doing. The Sort node breaks the correspondence between the tuple obtained from the Foreign Scan and that ForeignUpdate is updating. Specifically Foreign Update upadtes the first tuple using the tableoid for the last tuple from the Foreign Scan. > > Even if this worked fine, it cannot be back-patched. We need an > > extra storage moves together with tuples or prevent sorts or > > something like from being inserted there. > > I think your approach still has the same problem that it's abusing the > tableOid field in the heap tuple to store tableoid from the remote as > well as local table. That's what Robert and Tom objected to [1], [2] It's wrong understanding. PARAM_EXEC conveys remote tableoids outside tuples and each tuple is storing correct (= local) tableoid. > > At Fri, 1 Jun 2018 10:21:39 -0400, Ashutosh Bapat > > wrote in > > > >> I am not suggesting to commit 0003 in my patch set, but just 0001 and > >> 0002 which just raise an error when multiple rows get updated when > >> only one row is expected to be updated. > > > > So I agree to commit the two at least in order to prevent doing > > wrong silently. > > I haven't heard any committer's opinion on this one yet. > > [1] > https://www.postgresql.org/message-id/CA+TgmobUHHZiDR=hcu4n30yi9_pe175ittbfk6t8jxzwkra...@mail.gmail.com > [2] https://www.postgresql.org/message-id/7936.1526590932%40sss.pgh.pa.us Agreed. We need any comment to proceed. I have demonstrated and actually shown a problem of the PARAM_EXEC case. (It seems a bit silly that I actually found the problem after it became almost workable, though..) If tuples were not copied we will be able to use the address to identify a tuple but actually they are. (Anyway we soudn't do that.) A. Just detecting and reporting/erroring the problematic case. B. Giving to Sort-like nodes an ability to convert PARAMS into junk columns. C. Adding a space for 64bit tuple identifier in a tuple header. D. Somehow inhibiting tuple-storing node like Sort between. (This should break something working.) B seems to have possibility to fix this but I haven't have a concrete design of it. With C, I see 2 bits of room in infomask2 and we can use one of the bits to indicate that the tuple has an extra 64-bit tuple identifier. It could be propagated to desired place but I'm not sure it is in acceptable shape. regards. -- Kyotaro Horiguchi NTT Open Source Software Center
Re: Problem while updating a foreign table pointing to a partitioned table on foreign server
On Tue, Jun 5, 2018 at 3:40 PM, Kyotaro HORIGUCHI wrote: > Hello. > > At Mon, 04 Jun 2018 20:58:28 +0900 (Tokyo Standard Time), Kyotaro HORIGUCHI > wrote in > <20180604.205828.208262556.horiguchi.kyot...@lab.ntt.co.jp> >> It fails on some join-pushdown cases since it doesn't add tid >> columns to join tlist. I suppose that build_tlist_to_deparse >> needs something but I'll consider further tomorrow. > > I made it work with a few exceptions and bumped. PARAM_EXEC > doesn't work at all in a case where Sort exists between > ForeignUpdate and ForeignScan. > > = > explain (verbose, costs off) > update bar set f2 = f2 + 100 > from > ( select f1 from foo union all select f1+3 from foo ) ss > where bar.f1 = ss.f1; > QUERY PLAN > - > Update on public.bar >Update on public.bar >Foreign Update on public.bar2 > Remote SQL: UPDATE public.loct2 SET f2 = $3 WHERE tableoid = $1 AND ctid > = $2 > ... >-> Merge Join > Output: bar2.f1, (bar2.f2 + 100), bar2.f3, (ROW(foo.f1)) > Merge Cond: (bar2.f1 = foo.f1) > -> Sort >Output: bar2.f1, bar2.f2, bar2.f3, bar2.tableoid, bar2.ctid >Sort Key: bar2.f1 >-> Foreign Scan on public.bar2 > Output: bar2.f1, bar2.f2, bar2.f3, bar2.tableoid, > bar2.ctid > Remote SQL: SELECT f1, f2, f3, ctid, tableoid FROM > public.loct2 FOR UPDATE > = What's the problem that you faced? > > Even if this worked fine, it cannot be back-patched. We need an > extra storage moves together with tuples or prevent sorts or > something like from being inserted there. I think your approach still has the same problem that it's abusing the tableOid field in the heap tuple to store tableoid from the remote as well as local table. That's what Robert and Tom objected to [1], [2] > > > At Fri, 1 Jun 2018 10:21:39 -0400, Ashutosh Bapat > wrote in > >> I am not suggesting to commit 0003 in my patch set, but just 0001 and >> 0002 which just raise an error when multiple rows get updated when >> only one row is expected to be updated. > > So I agree to commit the two at least in order to prevent doing > wrong silently. I haven't heard any committer's opinion on this one yet. [1] https://www.postgresql.org/message-id/CA+TgmobUHHZiDR=hcu4n30yi9_pe175ittbfk6t8jxzwkra...@mail.gmail.com [2] https://www.postgresql.org/message-id/7936.1526590932%40sss.pgh.pa.us -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company
Re: Problem while updating a foreign table pointing to a partitioned table on foreign server
Hello. At Mon, 04 Jun 2018 20:58:28 +0900 (Tokyo Standard Time), Kyotaro HORIGUCHI wrote in <20180604.205828.208262556.horiguchi.kyot...@lab.ntt.co.jp> > It fails on some join-pushdown cases since it doesn't add tid > columns to join tlist. I suppose that build_tlist_to_deparse > needs something but I'll consider further tomorrow. I made it work with a few exceptions and bumped. PARAM_EXEC doesn't work at all in a case where Sort exists between ForeignUpdate and ForeignScan. = explain (verbose, costs off) update bar set f2 = f2 + 100 from ( select f1 from foo union all select f1+3 from foo ) ss where bar.f1 = ss.f1; QUERY PLAN - Update on public.bar Update on public.bar Foreign Update on public.bar2 Remote SQL: UPDATE public.loct2 SET f2 = $3 WHERE tableoid = $1 AND ctid = $2 ... -> Merge Join Output: bar2.f1, (bar2.f2 + 100), bar2.f3, (ROW(foo.f1)) Merge Cond: (bar2.f1 = foo.f1) -> Sort Output: bar2.f1, bar2.f2, bar2.f3, bar2.tableoid, bar2.ctid Sort Key: bar2.f1 -> Foreign Scan on public.bar2 Output: bar2.f1, bar2.f2, bar2.f3, bar2.tableoid, bar2.ctid Remote SQL: SELECT f1, f2, f3, ctid, tableoid FROM public.loct2 FOR UPDATE = Even if this worked fine, it cannot be back-patched. We need an extra storage moves together with tuples or prevent sorts or something like from being inserted there. At Fri, 1 Jun 2018 10:21:39 -0400, Ashutosh Bapat wrote in > I am not suggesting to commit 0003 in my patch set, but just 0001 and > 0002 which just raise an error when multiple rows get updated when > only one row is expected to be updated. So I agree to commit the two at least in order to prevent doing wrong silently. regards. -- Kyotaro Horiguchi NTT Open Source Software Center diff --git a/contrib/postgres_fdw/deparse.c b/contrib/postgres_fdw/deparse.c index d272719ff4..bff216f29d 100644 --- a/contrib/postgres_fdw/deparse.c +++ b/contrib/postgres_fdw/deparse.c @@ -1049,9 +1049,16 @@ deparseSelectSql(List *tlist, bool is_subquery, List **retrieved_attrs, * can use NoLock here. */ Relation rel = heap_open(rte->relid, NoLock); + Bitmapset *attrs = fpinfo->attrs_used; + + if (root->parse->commandType != CMD_UPDATE && + root->parse->commandType != CMD_DELETE) + attrs = bms_del_member(bms_copy(attrs), + TableOidAttributeNumber - + FirstLowInvalidHeapAttributeNumber); deparseTargetList(buf, rte, foreignrel->relid, rel, false, - fpinfo->attrs_used, false, retrieved_attrs); + attrs, false, retrieved_attrs); heap_close(rel, NoLock); } } @@ -1107,11 +1114,17 @@ deparseTargetList(StringInfo buf, bool qualify_col, List **retrieved_attrs) { + static int check_attrs[4]; + static char *check_attr_names[] = {"ctid", "oid", "tableoid"}; TupleDesc tupdesc = RelationGetDescr(rel); bool have_wholerow; bool first; int i; + check_attrs[0] = SelfItemPointerAttributeNumber; + check_attrs[1] = ObjectIdAttributeNumber; + check_attrs[2] = TableOidAttributeNumber; + check_attrs[3] = FirstLowInvalidHeapAttributeNumber; *retrieved_attrs = NIL; /* If there's a whole-row reference, we'll need all the columns. */ @@ -1143,13 +1156,16 @@ deparseTargetList(StringInfo buf, } } - /* - * Add ctid and oid if needed. We currently don't support retrieving any - * other system columns. - */ - if (bms_is_member(SelfItemPointerAttributeNumber - FirstLowInvalidHeapAttributeNumber, - attrs_used)) + for (i = 0 ; check_attrs[i] != FirstLowInvalidHeapAttributeNumber ; i++) { + int attr = check_attrs[i]; + char *attr_name = check_attr_names[i]; + + /* Add system columns if needed. */ + if (!bms_is_member(attr - FirstLowInvalidHeapAttributeNumber, + attrs_used)) + continue; + if (!first) appendStringInfoString(buf, ", "); else if (is_returning) @@ -1158,26 +1174,9 @@ deparseTargetList(StringInfo buf, if (qualify_col) ADD_REL_QUALIFIER(buf, rtindex); - appendStringInfoString(buf, "ctid"); + appendStringInfoString(buf, attr_name); - *retrieved_attrs = lappend_int(*retrieved_attrs, - SelfItemPointerAttributeNumber); - } - if (bms_is_member(ObjectIdAttributeNumber - FirstLowInvalidHeapAttributeNumber, - attrs_used)) - { - if (!first) - appendStringInfoString(buf, ", "); - else if (is_returning) - appendStringInfoString(buf, " RETURNING "); - first = false; - - if (qualify_col) - ADD_REL_QUALIFIER(buf, rtindex); - appendStringInfoString(buf, "oid"); - - *retrieved_attrs = lappend_int(*retrieved_attrs, - ObjectIdAttributeNumber); + *retrieved_attrs = lappend_int(*retrieved_attrs, attr); } /* Don't generate bad syntax if no undropped columns */ @@ -1725,7 +1724,7 @@ deparseUpdateSql(StringInfo buf,
Re: Problem while updating a foreign table pointing to a partitioned table on foreign server
At Mon, 04 Jun 2018 20:58:28 +0900 (Tokyo Standard Time), Kyotaro HORIGUCHI wrote in <20180604.205828.208262556.horiguchi.kyot...@lab.ntt.co.jp> > Hello. > > At Fri, 1 Jun 2018 10:21:39 -0400, Ashutosh Bapat > wrote in > > > I am not suggesting to commit 0003 in my patch set, but just 0001 and > > 0002 which just raise an error when multiple rows get updated when > > only one row is expected to be updated. > > I reconsidered Tom's suggestion and found a way to fix this > problem avoiding FDW-API change. The patch just sent contains changes of execnodes.h, which is useless. regres. -- Kyotaro Horiguchi NTT Open Source Software Center diff --git a/contrib/postgres_fdw/deparse.c b/contrib/postgres_fdw/deparse.c index d272719ff4..503e705c4c 100644 --- a/contrib/postgres_fdw/deparse.c +++ b/contrib/postgres_fdw/deparse.c @@ -1107,11 +1107,17 @@ deparseTargetList(StringInfo buf, bool qualify_col, List **retrieved_attrs) { + static int check_attrs[4]; + static char *check_attr_names[] = {"ctid", "oid", "tableoid"}; TupleDesc tupdesc = RelationGetDescr(rel); bool have_wholerow; bool first; int i; + check_attrs[0] = SelfItemPointerAttributeNumber; + check_attrs[1] = ObjectIdAttributeNumber; + check_attrs[2] = TableOidAttributeNumber; + check_attrs[3] = FirstLowInvalidHeapAttributeNumber; *retrieved_attrs = NIL; /* If there's a whole-row reference, we'll need all the columns. */ @@ -1143,41 +1149,27 @@ deparseTargetList(StringInfo buf, } } - /* - * Add ctid and oid if needed. We currently don't support retrieving any - * other system columns. - */ - if (bms_is_member(SelfItemPointerAttributeNumber - FirstLowInvalidHeapAttributeNumber, - attrs_used)) + for (i = 0 ; check_attrs[i] != FirstLowInvalidHeapAttributeNumber ; i++) { - if (!first) - appendStringInfoString(buf, ", "); - else if (is_returning) - appendStringInfoString(buf, " RETURNING "); - first = false; + int attr = check_attrs[i]; + char *attr_name = check_attr_names[i]; - if (qualify_col) - ADD_REL_QUALIFIER(buf, rtindex); - appendStringInfoString(buf, "ctid"); + /* Add system columns if needed. */ + if (bms_is_member(attr - FirstLowInvalidHeapAttributeNumber, + attrs_used)) + { + if (!first) +appendStringInfoString(buf, ", "); + else if (is_returning) +appendStringInfoString(buf, " RETURNING "); + first = false; - *retrieved_attrs = lappend_int(*retrieved_attrs, - SelfItemPointerAttributeNumber); - } - if (bms_is_member(ObjectIdAttributeNumber - FirstLowInvalidHeapAttributeNumber, - attrs_used)) - { - if (!first) - appendStringInfoString(buf, ", "); - else if (is_returning) - appendStringInfoString(buf, " RETURNING "); - first = false; + if (qualify_col) +ADD_REL_QUALIFIER(buf, rtindex); + appendStringInfoString(buf, attr_name); - if (qualify_col) - ADD_REL_QUALIFIER(buf, rtindex); - appendStringInfoString(buf, "oid"); - - *retrieved_attrs = lappend_int(*retrieved_attrs, - ObjectIdAttributeNumber); + *retrieved_attrs = lappend_int(*retrieved_attrs, attr); + } } /* Don't generate bad syntax if no undropped columns */ @@ -1725,7 +1717,7 @@ deparseUpdateSql(StringInfo buf, RangeTblEntry *rte, deparseRelation(buf, rel); appendStringInfoString(buf, " SET "); - pindex = 2; /* ctid is always the first param */ + pindex = 3; /* tableoid and ctid are always the first param */ first = true; foreach(lc, targetAttrs) { @@ -1739,7 +1731,7 @@ deparseUpdateSql(StringInfo buf, RangeTblEntry *rte, appendStringInfo(buf, " = $%d", pindex); pindex++; } - appendStringInfoString(buf, " WHERE ctid = $1"); + appendStringInfoString(buf, " WHERE tableoid = $1 AND ctid = $2"); deparseReturningList(buf, rte, rtindex, rel, rel->trigdesc && rel->trigdesc->trig_update_after_row, @@ -1855,7 +1847,7 @@ deparseDeleteSql(StringInfo buf, RangeTblEntry *rte, { appendStringInfoString(buf, "DELETE FROM "); deparseRelation(buf, rel); - appendStringInfoString(buf, " WHERE ctid = $1"); + appendStringInfoString(buf, " WHERE tableoid = $1 AND ctid = $2"); deparseReturningList(buf, rte, rtindex, rel, rel->trigdesc && rel->trigdesc->trig_delete_after_row, diff --git a/contrib/postgres_fdw/postgres_fdw.c b/contrib/postgres_fdw/postgres_fdw.c index 78b0f43ca8..7557d9add7 100644 --- a/contrib/postgres_fdw/postgres_fdw.c +++ b/contrib/postgres_fdw/postgres_fdw.c @@ -73,7 +73,9 @@ enum FdwScanPrivateIndex * String describing join i.e. names of relations being joined and types * of join, added when the scan is join */ - FdwScanPrivateRelations + FdwScanPrivateRelations, + + FdwScanTupleIdParamIds }; /* @@ -95,7 +97,8 @@ enum FdwModifyPrivateIndex /* has-returning flag (as an integer Value node) */ FdwModifyPrivateHasReturning, /* Integer list of attribute numbers retrieved by RETURNING */ - FdwModifyPrivateRetrievedAttrs + FdwModifyPrivateRetrievedAttrs, +
Re: Problem while updating a foreign table pointing to a partitioned table on foreign server
Hello. At Fri, 1 Jun 2018 10:21:39 -0400, Ashutosh Bapat wrote in > I am not suggesting to commit 0003 in my patch set, but just 0001 and > 0002 which just raise an error when multiple rows get updated when > only one row is expected to be updated. I reconsidered Tom's suggestion and found a way to fix this problem avoiding FDW-API change. To make use of PARAM_EXECs here, the attached PoC patch does the following things. No changes in the core side. - postgresAddForeignUpdateTargets is no longer useful, thus it is removed from fdw_function in the attached patch. - GetForeignRelSize registers table oid and ctid columns into attrs_used and a new member param_attrs on updates. - postgresGetForeignPlan assigns two PARAM_EXECs for the two values, then remember the paramids in fdw_private. - postgresPlanForeignModify searches for the parameters and remember their paramids. After that, doing the following things fixes the issue. - make_tuple_tuple_from_result_row receives remote table oid and stores it to the returned tuples. - postgresIterateForeignScan stores the values into remembered parameters. - postgresExecForeignUpdate/Delete read the parameters and specify remote victims using them accurately. It fails on some join-pushdown cases since it doesn't add tid columns to join tlist. I suppose that build_tlist_to_deparse needs something but I'll consider further tomorrow. regards. -- Kyotaro Horiguchi NTT Open Source Software Center diff --git a/contrib/postgres_fdw/deparse.c b/contrib/postgres_fdw/deparse.c index d272719ff4..503e705c4c 100644 --- a/contrib/postgres_fdw/deparse.c +++ b/contrib/postgres_fdw/deparse.c @@ -1107,11 +1107,17 @@ deparseTargetList(StringInfo buf, bool qualify_col, List **retrieved_attrs) { + static int check_attrs[4]; + static char *check_attr_names[] = {"ctid", "oid", "tableoid"}; TupleDesc tupdesc = RelationGetDescr(rel); bool have_wholerow; bool first; int i; + check_attrs[0] = SelfItemPointerAttributeNumber; + check_attrs[1] = ObjectIdAttributeNumber; + check_attrs[2] = TableOidAttributeNumber; + check_attrs[3] = FirstLowInvalidHeapAttributeNumber; *retrieved_attrs = NIL; /* If there's a whole-row reference, we'll need all the columns. */ @@ -1143,41 +1149,27 @@ deparseTargetList(StringInfo buf, } } - /* - * Add ctid and oid if needed. We currently don't support retrieving any - * other system columns. - */ - if (bms_is_member(SelfItemPointerAttributeNumber - FirstLowInvalidHeapAttributeNumber, - attrs_used)) + for (i = 0 ; check_attrs[i] != FirstLowInvalidHeapAttributeNumber ; i++) { - if (!first) - appendStringInfoString(buf, ", "); - else if (is_returning) - appendStringInfoString(buf, " RETURNING "); - first = false; + int attr = check_attrs[i]; + char *attr_name = check_attr_names[i]; - if (qualify_col) - ADD_REL_QUALIFIER(buf, rtindex); - appendStringInfoString(buf, "ctid"); + /* Add system columns if needed. */ + if (bms_is_member(attr - FirstLowInvalidHeapAttributeNumber, + attrs_used)) + { + if (!first) +appendStringInfoString(buf, ", "); + else if (is_returning) +appendStringInfoString(buf, " RETURNING "); + first = false; - *retrieved_attrs = lappend_int(*retrieved_attrs, - SelfItemPointerAttributeNumber); - } - if (bms_is_member(ObjectIdAttributeNumber - FirstLowInvalidHeapAttributeNumber, - attrs_used)) - { - if (!first) - appendStringInfoString(buf, ", "); - else if (is_returning) - appendStringInfoString(buf, " RETURNING "); - first = false; + if (qualify_col) +ADD_REL_QUALIFIER(buf, rtindex); + appendStringInfoString(buf, attr_name); - if (qualify_col) - ADD_REL_QUALIFIER(buf, rtindex); - appendStringInfoString(buf, "oid"); - - *retrieved_attrs = lappend_int(*retrieved_attrs, - ObjectIdAttributeNumber); + *retrieved_attrs = lappend_int(*retrieved_attrs, attr); + } } /* Don't generate bad syntax if no undropped columns */ @@ -1725,7 +1717,7 @@ deparseUpdateSql(StringInfo buf, RangeTblEntry *rte, deparseRelation(buf, rel); appendStringInfoString(buf, " SET "); - pindex = 2; /* ctid is always the first param */ + pindex = 3; /* tableoid and ctid are always the first param */ first = true; foreach(lc, targetAttrs) { @@ -1739,7 +1731,7 @@ deparseUpdateSql(StringInfo buf, RangeTblEntry *rte, appendStringInfo(buf, " = $%d", pindex); pindex++; } - appendStringInfoString(buf, " WHERE ctid = $1"); + appendStringInfoString(buf, " WHERE tableoid = $1 AND ctid = $2"); deparseReturningList(buf, rte, rtindex, rel, rel->trigdesc && rel->trigdesc->trig_update_after_row, @@ -1855,7 +1847,7 @@ deparseDeleteSql(StringInfo buf, RangeTblEntry *rte, { appendStringInfoString(buf, "DELETE FROM "); deparseRelation(buf, rel); - appendStringInfoString(buf, " WHERE ctid = $1"); + appendStringInfoString(buf, " WHERE tableoid = $1 AND ctid = $2");
Re: Problem while updating a foreign table pointing to a partitioned table on foreign server
On Fri, Jun 1, 2018 at 7:43 AM, Kyotaro HORIGUCHI wrote: > On Thu, May 31, 2018 at 11:34 AM, Ashutosh Bapat > wrote: >> On Thu, May 31, 2018 at 7:36 PM, Tom Lane wrote: >>> What you want for the first part of that is basically like >>> generate_new_param() in subselect.c. We don't expose that publicly >>> at the moment, but we could, or maybe better to invent another wrapper >>> around it like SS_make_initplan_output_param. >> >> This looks like a lot of change which might take some time and may not > > I agree. It needs at least, in a short sight, an additional parameter > (PlannerInfo in a straightforwad way) for > postgresAddForeignUpdateTargets which is a change of FDW-API. > >> be back-portable. In the mean time, can we see if 0001 and 0002 >> patches are good and apply them. Those patches intend to stop the >> multiple rows on the foreign server being updated by throwing error >> (and aborting the transaction on the foreign server) when that >> happens. That will at least avoid silent corruption that happens today >> and should be back-portable. >> >> [1] >> https://www.postgresql.org/message-id/CAFjFpRfK69ptCTNChBBk+LYMXFzJ92SW6NmG4HLn_1y7xFk=k...@mail.gmail.com > > Having said that I think that storing oids of the remote table in > local tableoid syscolumn is a breakage of the existing contract about > the field. (I wish this is comprehensible.) > However I haven't found a way to "fix" this without such breakage of > API thus it seems to me inevitable to leave this problem as a > restriction, we still can avoid the problematic behavior by explicitly > declaring remote tableoid column (like the "key" column option of > oracle-fdw). > > CREATE FOREIGN TABLE ft1 (rtoid oid, a int, blah, blah) SERVER sv > OPTIONS (remote_tableoid 'rtoid', table_name 'lt1'); > > However, of-course the proposed fix will work if we allow the > a-kind-of illegal usage of the local tableoid. And it seems to be a > way to cause a series of frequent changes on the same feature. > > Thoughts? I am not suggesting to commit 0003 in my patch set, but just 0001 and 0002 which just raise an error when multiple rows get updated when only one row is expected to be updated. -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company
Re: Problem while updating a foreign table pointing to a partitioned table on foreign server
On Thu, May 31, 2018 at 11:34 AM, Ashutosh Bapat wrote: > On Thu, May 31, 2018 at 7:36 PM, Tom Lane wrote: >> What you want for the first part of that is basically like >> generate_new_param() in subselect.c. We don't expose that publicly >> at the moment, but we could, or maybe better to invent another wrapper >> around it like SS_make_initplan_output_param. > > This looks like a lot of change which might take some time and may not I agree. It needs at least, in a short sight, an additional parameter (PlannerInfo in a straightforwad way) for postgresAddForeignUpdateTargets which is a change of FDW-API. > be back-portable. In the mean time, can we see if 0001 and 0002 > patches are good and apply them. Those patches intend to stop the > multiple rows on the foreign server being updated by throwing error > (and aborting the transaction on the foreign server) when that > happens. That will at least avoid silent corruption that happens today > and should be back-portable. > > [1] > https://www.postgresql.org/message-id/CAFjFpRfK69ptCTNChBBk+LYMXFzJ92SW6NmG4HLn_1y7xFk=k...@mail.gmail.com Having said that I think that storing oids of the remote table in local tableoid syscolumn is a breakage of the existing contract about the field. (I wish this is comprehensible.) However I haven't found a way to "fix" this without such breakage of API thus it seems to me inevitable to leave this problem as a restriction, we still can avoid the problematic behavior by explicitly declaring remote tableoid column (like the "key" column option of oracle-fdw). CREATE FOREIGN TABLE ft1 (rtoid oid, a int, blah, blah) SERVER sv OPTIONS (remote_tableoid 'rtoid', table_name 'lt1'); However, of-course the proposed fix will work if we allow the a-kind-of illegal usage of the local tableoid. And it seems to be a way to cause a series of frequent changes on the same feature. Thoughts?
Re: Problem while updating a foreign table pointing to a partitioned table on foreign server
On Thu, May 31, 2018 at 7:36 PM, Tom Lane wrote: > Kyotaro HORIGUCHI writes: >> If my understanding about non-system junk columns in a base relation >> and identifiers of a foreign tuples are correct, what is needed here >> is giving base relations the ability to have such junk column. > > The core of the problem, I think, is the question of exactly what > postgresAddForeignUpdateTargets should put into the resjunk expressions > it adds to an update/delete query's targetlist. Per discussion yesterday, > up to now it's always emitted Vars referencing the foreign relation, > which is problematic because with that approach the desired info has > to be exposed as either a regular or system column of that relation. > But there's nothing saying that the expression has to be a Var. > > My thought about what we might do instead is that > postgresAddForeignUpdateTargets could reserve a PARAM_EXEC slot > and emit a Param node referencing that. Then at runtime, while > reading a potential target row from the remote, we fill that > param slot along with the regular scan tuple slot. > > What you want for the first part of that is basically like > generate_new_param() in subselect.c. We don't expose that publicly > at the moment, but we could, or maybe better to invent another wrapper > around it like SS_make_initplan_output_param. This looks like a lot of change which might take some time and may not be back-portable. In the mean time, can we see if 0001 and 0002 patches are good and apply them. Those patches intend to stop the multiple rows on the foreign server being updated by throwing error (and aborting the transaction on the foreign server) when that happens. That will at least avoid silent corruption that happens today and should be back-portable. [1] https://www.postgresql.org/message-id/CAFjFpRfK69ptCTNChBBk+LYMXFzJ92SW6NmG4HLn_1y7xFk=k...@mail.gmail.com -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company
Re: Problem while updating a foreign table pointing to a partitioned table on foreign server
Kyotaro HORIGUCHI writes: > If my understanding about non-system junk columns in a base relation > and identifiers of a foreign tuples are correct, what is needed here > is giving base relations the ability to have such junk column. The core of the problem, I think, is the question of exactly what postgresAddForeignUpdateTargets should put into the resjunk expressions it adds to an update/delete query's targetlist. Per discussion yesterday, up to now it's always emitted Vars referencing the foreign relation, which is problematic because with that approach the desired info has to be exposed as either a regular or system column of that relation. But there's nothing saying that the expression has to be a Var. My thought about what we might do instead is that postgresAddForeignUpdateTargets could reserve a PARAM_EXEC slot and emit a Param node referencing that. Then at runtime, while reading a potential target row from the remote, we fill that param slot along with the regular scan tuple slot. What you want for the first part of that is basically like generate_new_param() in subselect.c. We don't expose that publicly at the moment, but we could, or maybe better to invent another wrapper around it like SS_make_initplan_output_param. regards, tom lane
Re: Problem while updating a foreign table pointing to a partitioned table on foreign server
Thanks. > I don't think this thread has reached a consensus on a design for a fix Right. If my understanding about non-system junk columns in a base relation and identifiers of a foreign tuples are correct, what is needed here is giving base relations the ability to have such junk column. I'm willing to work on that if I'm not on a wrong way here. -- Kyotaro Horiguchi
Re: Problem while updating a foreign table pointing to a partitioned table on foreign server
Hello I don't think this thread has reached a consensus on a design for a fix, has it? Does anybody have a clear idea on a path forward? Is anybody working on a patch? Thanks -- Álvaro Herrerahttps://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: Problem while updating a foreign table pointing to a partitioned table on foreign server
At Fri, 18 May 2018 15:31:07 -0400, Robert Haaswrote in > On Fri, May 18, 2018 at 4:29 AM, Kyotaro HORIGUCHI > wrote: > > I have reached to the same thought. > > > > The point here is that it is a base relation, which is not > > assumed to have additional columns not in its definition, > > including nonsystem junk columns. I'm not sure but it seems not > > that simple to give base relations an ability to have junk > > columns. > > Do you know where that assumption is embedded specifically? Taking the question literally, I see that add_vars_to_targetlist accepts neither nonsystem (including whole row vars) junk columns nor nonjunk columns that is not defined in the base relation. The first line of the following code is that. > Assert(attno >= rel->min_attr && attno <= rel->max_attr); > attno -= rel->min_attr; > if (rel->attr_needed[attno] == NULL) In the last line attr_needed is of an array of (max_attr - min_attr) elements, which is allocated in get_relation_info. I didn't go further so it might be easier than I'm thinking but anyway core-side modification (seems to me) is required at any rate. > If you're correct, then the FDW API is and always has been broken by > design for any remote data source that uses a row identifier other > than CTID, unless every foreign table definition always includes the > row identifier as an explicit column. I actually see that. Oracle-FDW needs to compose row identification by specifying "key" column option in relation definition and the key columns are added as resjunk column. This is the third (or, forth?) option of my comment upthread that was said as "not only bothersome". https://github.com/laurenz/oracle_fdw | Column options (from PostgreSQL 9.2 on) | key (optional, defaults to "false") | | If set to yes/on/true, the corresponding column on the foreign | Oracle table is considered a primary key column. For UPDATE and | DELETE to work, you must set this option on all columns that | belong to the table's primary key. > I might be wrong here, but I'm > pretty sure Tom wouldn't have committed this API in the first place > with such a glaring hole in the design. I see the API is still not broken in a sense, the ctid of postgres_fdw is necessarily that of remote table. If we have a reasonable mapping between remote tableoid:ctid and local ctid, it works as expected. But such mapping seems to be rather difficult to create since I don't find a generic way wihtout needing auxiliary information, and at least there's no guarantee that ctid has enough space for rows from multiple tables. regards. -- Kyotaro Horiguchi NTT Open Source Software Center
Re: Problem while updating a foreign table pointing to a partitioned table on foreign server
On Fri, May 18, 2018 at 4:29 AM, Kyotaro HORIGUCHIwrote: > I have reached to the same thought. > > The point here is that it is a base relation, which is not > assumed to have additional columns not in its definition, > including nonsystem junk columns. I'm not sure but it seems not > that simple to give base relations an ability to have junk > columns. Do you know where that assumption is embedded specifically? If you're correct, then the FDW API is and always has been broken by design for any remote data source that uses a row identifier other than CTID, unless every foreign table definition always includes the row identifier as an explicit column. I might be wrong here, but I'm pretty sure Tom wouldn't have committed this API in the first place with such a glaring hole in the design. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: Problem while updating a foreign table pointing to a partitioned table on foreign server
At Fri, 18 May 2018 10:19:30 +0530, Ashutosh Bapatwrote in ashutosh.bapat> On Thu, May 17, 2018 at 11:56 PM, Robert Haas wrote: > > On Thu, May 17, 2018 at 2:10 AM, Ashutosh Bapat > > wrote: > >> The second would mean that SELECT * from foreign table reports > >> remotetableoid as well, which is awkward. > > > > No it wouldn't. You'd just make the additional column resjunk, same > > as we do for wholerow. > > You suggested > -- > > I think that the place to start would be to change this code to use > > something other than TableOidAttributeNumber: > > > > + var = makeVar(parsetree->resultRelation, > > + TableOidAttributeNumber, > > + OIDOID, > > + -1, > > + InvalidOid, > > + 0); > -- > > Wholerow has its own attribute number 0, ctid has its attribute number > -1. So we can easily create Vars for those and add resjunk entries in > the targetlist. But a "remotetableoid" doesn't have an attribute > number yet! Either it has to be a new system column, which I and > almost everybody here is opposing, or it has to be a user defined > attribute, with an entry in pg_attributes table. In the second case, > how would one make that column resjunk? I don't see any third > possibility. I have reached to the same thought. The point here is that it is a base relation, which is not assumed to have additional columns not in its definition, including nonsystem junk columns. I'm not sure but it seems not that simple to give base relations an ability to have junk columns. regards. -- Kyotaro Horiguchi NTT Open Source Software Center
Re: Problem while updating a foreign table pointing to a partitioned table on foreign server
On Thu, May 17, 2018 at 11:56 PM, Robert Haaswrote: > On Thu, May 17, 2018 at 2:10 AM, Ashutosh Bapat > wrote: >> The second would mean that SELECT * from foreign table reports >> remotetableoid as well, which is awkward. > > No it wouldn't. You'd just make the additional column resjunk, same > as we do for wholerow. You suggested -- > I think that the place to start would be to change this code to use > something other than TableOidAttributeNumber: > > + var = makeVar(parsetree->resultRelation, > + TableOidAttributeNumber, > + OIDOID, > + -1, > + InvalidOid, > + 0); -- Wholerow has its own attribute number 0, ctid has its attribute number -1. So we can easily create Vars for those and add resjunk entries in the targetlist. But a "remotetableoid" doesn't have an attribute number yet! Either it has to be a new system column, which I and almost everybody here is opposing, or it has to be a user defined attribute, with an entry in pg_attributes table. In the second case, how would one make that column resjunk? I don't see any third possibility. -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company
Re: Problem while updating a foreign table pointing to a partitioned table on foreign server
Alvaro Herrerawrites: > Can we just add a new junk attr, with its own fixed system column > number? I think that's what Robert was proposing. Junk attr yes, "fixed system column number" no. That's not how junk attrs work. What it'd need is a convention for the name of these resjunk attrs (corresponding to ctidN, wholerowN, etc). We do already have tableoidN junk attrs, but by the same token those should always be local OIDs, or we'll be in for deep confusion. Maybe "remotetableoidN" ? regards, tom lane
Re: Problem while updating a foreign table pointing to a partitioned table on foreign server
Robert Haaswrites: > Yeah, but I'm not sure I like that solution very much. I don't think > abusing the tableoid to store a remote table OID is very nice. I'd say it's totally unacceptable. Tableoid *has to* be something that you can look up in the local pg_class instance, or serious confusion will ensue. regards, tom lane
Re: Problem while updating a foreign table pointing to a partitioned table on foreign server
On Thu, May 17, 2018 at 2:10 AM, Ashutosh Bapatwrote: > The second would mean that SELECT * from foreign table reports > remotetableoid as well, which is awkward. No it wouldn't. You'd just make the additional column resjunk, same as we do for wholerow. > Anyway, my comment to which you have replied is obsolete now. I found > a solution to that problem, which I have implemented in 0003 in the > latest patch-set I have shared. Yeah, but I'm not sure I like that solution very much. I don't think abusing the tableoid to store a remote table OID is very nice. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: Problem while updating a foreign table pointing to a partitioned table on foreign server
On Wed, May 16, 2018 at 11:31 PM, Robert Haaswrote: > On Mon, Apr 16, 2018 at 7:35 AM, Ashutosh Bapat > wrote: >> It does fix the problem. But the patch as is interferes with the way >> we handle tableoid currently. That can be seen from the regression >> diffs that the patch causes. RIght now, every tableoid reference gets >> converted into the tableoid of the foreign table (and not the tableoid >> of the foreign table). Somehow we need to differentiate between the >> tableoid injected for DML and tableoid references added by the user in >> the original query and then use tableoid on the foreign server for the >> first and local foreign table's oid for the second. Right now, I don't >> see a simple way to do that. > > I think that the place to start would be to change this code to use > something other than TableOidAttributeNumber: > > + var = makeVar(parsetree->resultRelation, > + TableOidAttributeNumber, > + OIDOID, > + -1, > + InvalidOid, > + 0); > > Note that rewriteTargetListUD, which calls AddForeignUpdateTargets, > also contingently adds a "wholerow" attribute which ExecModifyTable() > is able to fish out later. It seems like it should be possible to add > a "remotetableoid" column that works similarly, although I'm not > exactly sure what would be involved. As of today, all the attributes added by AddForeignUpdateTargets hook of postgres_fdw are recognised by PostgreSQL. But remotetableoid is not a recognised attributes. In order to use it, we either have to define a new system attribute "remotetableoid" or add a user defined attribute "remotetableoid" in every foreign table. The first one will be very specific for postgres_fdw and other FDWs won't be able to use it. The second would mean that SELECT * from foreign table reports remotetableoid as well, which is awkward. Me and Horiguchi-san discussed those ideas in this mail thread. Anyway, my comment to which you have replied is obsolete now. I found a solution to that problem, which I have implemented in 0003 in the latest patch-set I have shared. -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company
Re: Problem while updating a foreign table pointing to a partitioned table on foreign server
On Mon, Apr 16, 2018 at 7:35 AM, Ashutosh Bapatwrote: > It does fix the problem. But the patch as is interferes with the way > we handle tableoid currently. That can be seen from the regression > diffs that the patch causes. RIght now, every tableoid reference gets > converted into the tableoid of the foreign table (and not the tableoid > of the foreign table). Somehow we need to differentiate between the > tableoid injected for DML and tableoid references added by the user in > the original query and then use tableoid on the foreign server for the > first and local foreign table's oid for the second. Right now, I don't > see a simple way to do that. I think that the place to start would be to change this code to use something other than TableOidAttributeNumber: + var = makeVar(parsetree->resultRelation, + TableOidAttributeNumber, + OIDOID, + -1, + InvalidOid, + 0); Note that rewriteTargetListUD, which calls AddForeignUpdateTargets, also contingently adds a "wholerow" attribute which ExecModifyTable() is able to fish out later. It seems like it should be possible to add a "remotetableoid" column that works similarly, although I'm not exactly sure what would be involved. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: Problem while updating a foreign table pointing to a partitioned table on foreign server
On Thu, Apr 19, 2018 at 11:38 AM, Kyotaro HORIGUCHIwrote: > >> /* No rows should be returned if no rows were updated. */ >> Assert(n_rows_returned == 0 || n_rows_updated > 0); > > The assertion is correct but I think that we shouldn't crash > server by any kind of protocol error. I think ERROR is suitable. > That's a good idea. Done. >> I have attached a set of patches >> 0001 adds a test case showing the issue. >> 0002 modified patch based on your idea of throwing an error >> 0003 WIP patch with a partial fix for the issue as discussed upthread >> >> The expected output in 0001 is set to what it would when the problem >> gets fixed. The expected output in 0002 is what it would be when we >> commit only 0002 without a complete fix. >> > >> > >> >> There are two ways to fix this >> >> 1. Use WHERE CURRENT OF with cursors to update rows. This means that >> >> we fetch only one row at a time and update it. This can slow down the >> >> execution drastically. >> >> 2. Along with ctid use tableoid as a qualifier i.e. WHERE clause of >> >> UPDATE/DELETE statement has ctid = $1 AND tableoid = $2 as conditions. >> >> >> >> PFA patch along the lines of 2nd approach and along with the >> >> testcases. The idea is to inject tableoid attribute to be fetched from >> >> the foreign server similar to ctid and then add it to the DML >> >> statement being constructed. >> >> >> >> It does fix the problem. But the patch as is interferes with the way >> >> we handle tableoid currently. That can be seen from the regression >> >> diffs that the patch causes. RIght now, every tableoid reference gets >> >> converted into the tableoid of the foreign table (and not the tableoid >> >> of the foreign table). Somehow we need to differentiate between the >> >> tableoid injected for DML and tableoid references added by the user in >> >> the original query and then use tableoid on the foreign server for the >> >> first and local foreign table's oid for the second. Right now, I don't >> >> see a simple way to do that. >> > >> > We cannot add no non-system (junk) columns not defined in foreign >> > table columns. >> >> Why? That's a probable way of fixing this problem. > > In other words, tuples returned from ForeignNext > (postgresIterateForeignScan) on a foreign (base) relation cannot > contain a non-system column which is not a part of the relation, > since its tuple descriptor doesn't know of and does error out it. > The current 0003 stores remote tableoid in tuples' existing > tableOid field (not a column data), which is not proper since > remote tableoid is bogus for the local server. I might missing > something here, though. If we can somehow attach an blob at the > end of t_data and it is finally passed to > ExecForeignUpdate/Delete, the problem would be resolved. Attached 0003 uses HeapTupleData::t_tableoid to store remote tableoid and local tableoid. Remote tableoid is stored there for a scan underlying DELETE/UPDATE. Local tableoid is stored otherwise. We use a flag fetch_foreign_tableoid, stand alone and in deparse_expr_cxt to differentiate between these two usages. > > I don't think it is acceptable but (hopefully) almost solves this > problem if we allow that. User always sees the conventional > tableOid and all ExecForeignUpdate/Delete have to do is to use > remote_tableoid as a part of remote tuple identifier. Required to > consider how to propagate the remote_tableoid through joins or > other intermediate executor nodes, though. It is partly similar > to the way deciding join push down. 0003 does that. Fortunately we already have testing UPDATE/DELETE with joins. > > Another point is that, even though HeapTupleData is the only > expected coveyer of the tuple identification, assuming tableoid + > ctid is not adequite since FDW interface is not exlusive for > postgres_fdw. The existig ctid field is not added for the purpose > and just happened to (seem to) work as tuple identifier for > postgres_fdw but I think tableoid is not. I am not able to understand. postgresAddForeignUpdateTargets does that specifically for postgres_fdw. I am using the same function to add junk column for tableoid similar to ctid. > > The same can be said on ctid. Maybe my description was > unclear. Specifically, I intended to say something like: > > - If we want to update/delete remote partitioned/inhtance tables > without direct modify, the foreign relation must have a columns > defined as "tableoid as remote_tableoid" or something. (We > could change the column name by a fdw option.) Ok. I think, I misunderstood your proposal. IIUC, this way, SELECT * FROM foreign_table is going to report remote_tableoid, which won't be welcome by users. Let me know what you think of the attached patches. > > >> I think we should try getting 0001 and 0002 at least committed >> independent of 0003. > > Agreed on 0002. 0001 should be committed with 0003? 0001 adds testcases which show the problem, so we have to
Re: Problem while updating a foreign table pointing to a partitioned table on foreign server
At Wed, 18 Apr 2018 13:23:06 +0530, Ashutosh Bapatwrote in > On Wed, Apr 18, 2018 at 9:43 AM, Kyotaro HORIGUCHI > wrote: > > > > Anyway I think we should warn or error out if one nondirect > > update touches two nor more tuples in the first place. > > > > =# UPDATE fplt SET b = (CASE WHEN random() <= 1 THEN 10 ELSE 20 END) WHERE > > a = 1; > > ERROR: updated 2 rows for a tuple identity on the remote end > > I liked that idea. But I think your patch wasn't quite right, esp. > when the returning had an SRF in it. Right now n_rows tracks the > number of rows returned if there is a returning list or the number of > rows updated/deleted on the foreign server. If there is an SRF, n_rows > can return multiple rows for a single updated or deleted row. So, I > changed your code to track number of rows updated/deleted and number > of rows returned separately. BTW, your patch didn't handle DELETE > case. Yeah, sorry. It was to just show how the error looks like. Attached 0002 works and looks fine except the following. > /* No rows should be returned if no rows were updated. */ > Assert(n_rows_returned == 0 || n_rows_updated > 0); The assertion is correct but I think that we shouldn't crash server by any kind of protocol error. I think ERROR is suitable. > I have attached a set of patches > 0001 adds a test case showing the issue. > 0002 modified patch based on your idea of throwing an error > 0003 WIP patch with a partial fix for the issue as discussed upthread > > The expected output in 0001 is set to what it would when the problem > gets fixed. The expected output in 0002 is what it would be when we > commit only 0002 without a complete fix. > > > > > >> There are two ways to fix this > >> 1. Use WHERE CURRENT OF with cursors to update rows. This means that > >> we fetch only one row at a time and update it. This can slow down the > >> execution drastically. > >> 2. Along with ctid use tableoid as a qualifier i.e. WHERE clause of > >> UPDATE/DELETE statement has ctid = $1 AND tableoid = $2 as conditions. > >> > >> PFA patch along the lines of 2nd approach and along with the > >> testcases. The idea is to inject tableoid attribute to be fetched from > >> the foreign server similar to ctid and then add it to the DML > >> statement being constructed. > >> > >> It does fix the problem. But the patch as is interferes with the way > >> we handle tableoid currently. That can be seen from the regression > >> diffs that the patch causes. RIght now, every tableoid reference gets > >> converted into the tableoid of the foreign table (and not the tableoid > >> of the foreign table). Somehow we need to differentiate between the > >> tableoid injected for DML and tableoid references added by the user in > >> the original query and then use tableoid on the foreign server for the > >> first and local foreign table's oid for the second. Right now, I don't > >> see a simple way to do that. > > > > We cannot add no non-system (junk) columns not defined in foreign > > table columns. > > Why? That's a probable way of fixing this problem. In other words, tuples returned from ForeignNext (postgresIterateForeignScan) on a foreign (base) relation cannot contain a non-system column which is not a part of the relation, since its tuple descriptor doesn't know of and does error out it. The current 0003 stores remote tableoid in tuples' existing tableOid field (not a column data), which is not proper since remote tableoid is bogus for the local server. I might missing something here, though. If we can somehow attach an blob at the end of t_data and it is finally passed to ExecForeignUpdate/Delete, the problem would be resolved. > > We could pass tableoid via a side channel but we > > get wrong value if the scan is not consists of only one foreign > > relation. I don't think adding remote_tableoid in HeapTupleData > > is acceptable. > > I am thinking of adding remote_tableoid in HeapTupleData since not all > FDWs will have the concept of tableoid. But we need to somehow > distinguish the tableoid resjunk added for DMLs and tableoid requested > by the user. I don't think it is acceptable but (hopefully) almost solves this problem if we allow that. User always sees the conventional tableOid and all ExecForeignUpdate/Delete have to do is to use remote_tableoid as a part of remote tuple identifier. Required to consider how to propagate the remote_tableoid through joins or other intermediate executor nodes, though. It is partly similar to the way deciding join push down. Another point is that, even though HeapTupleData is the only expected coveyer of the tuple identification, assuming tableoid + ctid is not adequite since FDW interface is not exlusive for postgres_fdw. The existig ctid field is not added for the purpose and just happened to (seem to) work as tuple identifier for postgres_fdw but I
Re: Problem while updating a foreign table pointing to a partitioned table on foreign server
On Wed, Apr 18, 2018 at 9:43 AM, Kyotaro HORIGUCHIwrote: > > Anyway I think we should warn or error out if one nondirect > update touches two nor more tuples in the first place. > > =# UPDATE fplt SET b = (CASE WHEN random() <= 1 THEN 10 ELSE 20 END) WHERE a > = 1; > ERROR: updated 2 rows for a tuple identity on the remote end I liked that idea. But I think your patch wasn't quite right, esp. when the returning had an SRF in it. Right now n_rows tracks the number of rows returned if there is a returning list or the number of rows updated/deleted on the foreign server. If there is an SRF, n_rows can return multiple rows for a single updated or deleted row. So, I changed your code to track number of rows updated/deleted and number of rows returned separately. BTW, your patch didn't handle DELETE case. I have attached a set of patches 0001 adds a test case showing the issue. 0002 modified patch based on your idea of throwing an error 0003 WIP patch with a partial fix for the issue as discussed upthread The expected output in 0001 is set to what it would when the problem gets fixed. The expected output in 0002 is what it would be when we commit only 0002 without a complete fix. > > >> There are two ways to fix this >> 1. Use WHERE CURRENT OF with cursors to update rows. This means that >> we fetch only one row at a time and update it. This can slow down the >> execution drastically. >> 2. Along with ctid use tableoid as a qualifier i.e. WHERE clause of >> UPDATE/DELETE statement has ctid = $1 AND tableoid = $2 as conditions. >> >> PFA patch along the lines of 2nd approach and along with the >> testcases. The idea is to inject tableoid attribute to be fetched from >> the foreign server similar to ctid and then add it to the DML >> statement being constructed. >> >> It does fix the problem. But the patch as is interferes with the way >> we handle tableoid currently. That can be seen from the regression >> diffs that the patch causes. RIght now, every tableoid reference gets >> converted into the tableoid of the foreign table (and not the tableoid >> of the foreign table). Somehow we need to differentiate between the >> tableoid injected for DML and tableoid references added by the user in >> the original query and then use tableoid on the foreign server for the >> first and local foreign table's oid for the second. Right now, I don't >> see a simple way to do that. > > We cannot add no non-system (junk) columns not defined in foreign > table columns. Why? That's a probable way of fixing this problem. > We could pass tableoid via a side channel but we > get wrong value if the scan is not consists of only one foreign > relation. I don't think adding remote_tableoid in HeapTupleData > is acceptable. I am thinking of adding remote_tableoid in HeapTupleData since not all FDWs will have the concept of tableoid. But we need to somehow distinguish the tableoid resjunk added for DMLs and tableoid requested by the user. > Explicity defining remote_tableoid column in > foreign relation might work but it makes things combersome.. > Not just cumbersome, it's not going to be always right, if the things change on the foreign server e.g. OID of the table changes because it got dropped and recreated on the foreign server or OID remained same but the table got inherited and so on. I think we should try getting 0001 and 0002 at least committed independent of 0003. -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company From 1a8fc73fffa522e10a831bb9e6557a9fb4b0b602 Mon Sep 17 00:00:00 2001 From: Ashutosh Bapat Date: Wed, 18 Apr 2018 10:20:27 +0530 Subject: [PATCH 1/3] Tests to show problem when foreign table points to a partitioned table or inheritance table on the foreign server When a foreign table points to a partitioned table or an inheritance parent on the foreign server, a non-direct DML can affect multiple rows when only one row is intended to be affected. This happens because postgres_fdw uses only ctid to identify a row to work on. Though ctid uniquely identifies a row in a single table, in a partitioned table or in an inheritance hierarchy, there can be be multiple rows, in different partitions, with the same ctid. So DML statement sent to the foreign server by postgres_fdw ends up affecting more than one rows, only one of which is intended to be affected. This commit adds testcases to show the problem. A subsequent commit would have a fix to the problem. Ashutosh Bapat, reviewed by Kyotaro Horiguchi --- contrib/postgres_fdw/expected/postgres_fdw.out | 121 contrib/postgres_fdw/sql/postgres_fdw.sql | 53 +++ 2 files changed, 174 insertions(+) diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out index e4d9469..5156002 100644 --- a/contrib/postgres_fdw/expected/postgres_fdw.out +++
Re: Problem while updating a foreign table pointing to a partitioned table on foreign server
Hello. At Mon, 16 Apr 2018 17:05:28 +0530, Ashutosh Bapatwrote in > Hi, > Consider this scenario > > postgres=# CREATE TABLE plt (a int, b int) PARTITION BY LIST(a); > postgres=# CREATE TABLE plt_p1 PARTITION OF plt FOR VALUES IN (1); > postgres=# CREATE TABLE plt_p2 PARTITION OF plt FOR VALUES IN (2); > postgres=# INSERT INTO plt VALUES (1, 1), (2, 2); > postgres=# CREATE FOREIGN TABLE fplt (a int, b int) SERVER loopback > OPTIONS (table_name 'plt'); > postgres=# SELECT tableoid::regclass, ctid, * FROM fplt; > tableoid | ctid | a | b > --+---+---+--- > fplt | (0,1) | 1 | 1 > fplt | (0,1) | 2 | 2 > (2 rows) > > -- Need to use random() so that following update doesn't turn into a > direct UPDATE. > postgres=# EXPLAIN (VERBOSE, COSTS OFF) > postgres-# UPDATE fplt SET b = (CASE WHEN random() <= 1 THEN 10 ELSE > 20 END) WHERE a = 1; > QUERY PLAN > > Update on public.fplt >Remote SQL: UPDATE public.plt SET b = $2 WHERE ctid = $1 >-> Foreign Scan on public.fplt > Output: a, CASE WHEN (random() <= '1'::double precision) THEN > 10 ELSE 20 END, ctid > Remote SQL: SELECT a, ctid FROM public.plt WHERE ((a = 1)) FOR UPDATE > (5 rows) > > postgres=# UPDATE fplt SET b = (CASE WHEN random() <= 1 THEN 10 ELSE > 20 END) WHERE a = 1; > postgres=# SELECT tableoid::regclass, ctid, * FROM fplt; > tableoid | ctid | a | b > --+---+---+ > fplt | (0,2) | 1 | 10 > fplt | (0,2) | 2 | 10 > (2 rows) > > We expect only 1 row with a = 1 to be updated, but both the rows get > updated. This happens because both the rows has ctid = (0, 1) and > that's the only qualification used for UPDATE and DELETE. Thus when a > non-direct UPDATE is run on a foreign table which points to a > partitioned table or inheritance hierarchy on the foreign server, it > will update rows from all child table which have ctids same as the > qualifying rows. Same is the case with DELETE. Anyway I think we should warn or error out if one nondirect update touches two nor more tuples in the first place. =# UPDATE fplt SET b = (CASE WHEN random() <= 1 THEN 10 ELSE 20 END) WHERE a = 1; ERROR: updated 2 rows for a tuple identity on the remote end > There are two ways to fix this > 1. Use WHERE CURRENT OF with cursors to update rows. This means that > we fetch only one row at a time and update it. This can slow down the > execution drastically. > 2. Along with ctid use tableoid as a qualifier i.e. WHERE clause of > UPDATE/DELETE statement has ctid = $1 AND tableoid = $2 as conditions. > > PFA patch along the lines of 2nd approach and along with the > testcases. The idea is to inject tableoid attribute to be fetched from > the foreign server similar to ctid and then add it to the DML > statement being constructed. > > It does fix the problem. But the patch as is interferes with the way > we handle tableoid currently. That can be seen from the regression > diffs that the patch causes. RIght now, every tableoid reference gets > converted into the tableoid of the foreign table (and not the tableoid > of the foreign table). Somehow we need to differentiate between the > tableoid injected for DML and tableoid references added by the user in > the original query and then use tableoid on the foreign server for the > first and local foreign table's oid for the second. Right now, I don't > see a simple way to do that. We cannot add no non-system (junk) columns not defined in foreign table columns. We could pass tableoid via a side channel but we get wrong value if the scan is not consists of only one foreign relation. I don't think adding remote_tableoid in HeapTupleData is acceptable. Explicity defining remote_tableoid column in foreign relation might work but it makes things combersome.. regards, -- Kyotaro Horiguchi NTT Open Source Software Center diff --git a/contrib/postgres_fdw/postgres_fdw.c b/contrib/postgres_fdw/postgres_fdw.c index e1c2639fde..7cd31cb6ab 100644 --- a/contrib/postgres_fdw/postgres_fdw.c +++ b/contrib/postgres_fdw/postgres_fdw.c @@ -1895,6 +1895,13 @@ postgresExecForeignUpdate(EState *estate, MemoryContextReset(fmstate->temp_cxt); + /* ERROR if more than one row was updated on the remote end */ + if (n_rows > 1) + ereport(ERROR, +(errcode (ERRCODE_FDW_ERROR), /* XXX */ + errmsg ("updated %d rows for a tuple identity on the remote end", + n_rows))); + /* Return NULL if nothing was updated on the remote end */ return (n_rows > 0) ? slot : NULL; }
Problem while updating a foreign table pointing to a partitioned table on foreign server
Hi, Consider this scenario postgres=# CREATE TABLE plt (a int, b int) PARTITION BY LIST(a); postgres=# CREATE TABLE plt_p1 PARTITION OF plt FOR VALUES IN (1); postgres=# CREATE TABLE plt_p2 PARTITION OF plt FOR VALUES IN (2); postgres=# INSERT INTO plt VALUES (1, 1), (2, 2); postgres=# CREATE FOREIGN TABLE fplt (a int, b int) SERVER loopback OPTIONS (table_name 'plt'); postgres=# SELECT tableoid::regclass, ctid, * FROM fplt; tableoid | ctid | a | b --+---+---+--- fplt | (0,1) | 1 | 1 fplt | (0,1) | 2 | 2 (2 rows) -- Need to use random() so that following update doesn't turn into a direct UPDATE. postgres=# EXPLAIN (VERBOSE, COSTS OFF) postgres-# UPDATE fplt SET b = (CASE WHEN random() <= 1 THEN 10 ELSE 20 END) WHERE a = 1; QUERY PLAN Update on public.fplt Remote SQL: UPDATE public.plt SET b = $2 WHERE ctid = $1 -> Foreign Scan on public.fplt Output: a, CASE WHEN (random() <= '1'::double precision) THEN 10 ELSE 20 END, ctid Remote SQL: SELECT a, ctid FROM public.plt WHERE ((a = 1)) FOR UPDATE (5 rows) postgres=# UPDATE fplt SET b = (CASE WHEN random() <= 1 THEN 10 ELSE 20 END) WHERE a = 1; postgres=# SELECT tableoid::regclass, ctid, * FROM fplt; tableoid | ctid | a | b --+---+---+ fplt | (0,2) | 1 | 10 fplt | (0,2) | 2 | 10 (2 rows) We expect only 1 row with a = 1 to be updated, but both the rows get updated. This happens because both the rows has ctid = (0, 1) and that's the only qualification used for UPDATE and DELETE. Thus when a non-direct UPDATE is run on a foreign table which points to a partitioned table or inheritance hierarchy on the foreign server, it will update rows from all child table which have ctids same as the qualifying rows. Same is the case with DELETE. There are two ways to fix this 1. Use WHERE CURRENT OF with cursors to update rows. This means that we fetch only one row at a time and update it. This can slow down the execution drastically. 2. Along with ctid use tableoid as a qualifier i.e. WHERE clause of UPDATE/DELETE statement has ctid = $1 AND tableoid = $2 as conditions. PFA patch along the lines of 2nd approach and along with the testcases. The idea is to inject tableoid attribute to be fetched from the foreign server similar to ctid and then add it to the DML statement being constructed. It does fix the problem. But the patch as is interferes with the way we handle tableoid currently. That can be seen from the regression diffs that the patch causes. RIght now, every tableoid reference gets converted into the tableoid of the foreign table (and not the tableoid of the foreign table). Somehow we need to differentiate between the tableoid injected for DML and tableoid references added by the user in the original query and then use tableoid on the foreign server for the first and local foreign table's oid for the second. Right now, I don't see a simple way to do that. -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company diff --git a/contrib/postgres_fdw/deparse.c b/contrib/postgres_fdw/deparse.c index 6e2fa14..d3c98d3 100644 --- a/contrib/postgres_fdw/deparse.c +++ b/contrib/postgres_fdw/deparse.c @@ -1144,7 +1144,7 @@ deparseTargetList(StringInfo buf, } /* - * Add ctid and oid if needed. We currently don't support retrieving any + * Add ctid, tableoid and oid if needed. We currently don't support retrieving any * other system columns. */ if (bms_is_member(SelfItemPointerAttributeNumber - FirstLowInvalidHeapAttributeNumber, @@ -1179,6 +1179,22 @@ deparseTargetList(StringInfo buf, *retrieved_attrs = lappend_int(*retrieved_attrs, ObjectIdAttributeNumber); } + if (bms_is_member(TableOidAttributeNumber - FirstLowInvalidHeapAttributeNumber, + attrs_used)) + { + if (!first) + appendStringInfoString(buf, ", "); + else if (is_returning) + appendStringInfoString(buf, " RETURNING "); + first = false; + + if (qualify_col) + ADD_REL_QUALIFIER(buf, rtindex); + appendStringInfoString(buf, "tableoid"); + + *retrieved_attrs = lappend_int(*retrieved_attrs, + TableOidAttributeNumber); + } /* Don't generate bad syntax if no undropped columns */ if (first && !is_returning) @@ -1725,7 +1741,7 @@ deparseUpdateSql(StringInfo buf, PlannerInfo *root, deparseRelation(buf, rel); appendStringInfoString(buf, " SET "); - pindex = 2; /* ctid is always the first param */ + pindex = 3; /* ctid, tableoid params appear first */ first = true; foreach(lc, targetAttrs) { @@ -1739,7 +1755,7 @@ deparseUpdateSql(StringInfo buf, PlannerInfo *root, appendStringInfo(buf, " = $%d", pindex); pindex++; } - appendStringInfoString(buf, " WHERE ctid = $1"); + appendStringInfoString(buf, " WHERE ctid = $1 AND tableoid = $2"); deparseReturningList(buf, root, rtindex, rel,