Fujita-san, On 2019/04/18 22:10, Etsuro Fujita wrote: >> On 2019/04/18 14:06, Amit Langote wrote: >>> On 2019/04/17 21:49, Etsuro Fujita wrote: >>>> So what I'm thinking is to throw an error for cases like this. >>>> (Though, I >>>> think we should keep to allow rows to be moved from local partitions to a >>>> foreign-table subplan targetrel that has been updated already.) >>> >>> Hmm, how would you distinguish (totally inside postgred_fdw I presume) the >>> two cases? > > One thing I came up with to do that is this: > > @@ -1917,6 +1937,23 @@ postgresBeginForeignInsert(ModifyTableState *mtstate, > > initStringInfo(&sql); > > + /* > + * If the foreign table is an UPDATE subplan resultrel that hasn't > yet > + * been updated, routing tuples to the table might yield incorrect > + * results, because if routing tuples, routed tuples will be > mistakenly > + * read from the table and updated twice when updating the table > --- it > + * would be nice if we could handle this case; but for now, throw > an error > + * for safety. > + */ > + if (plan && plan->operation == CMD_UPDATE && > + (resultRelInfo->ri_usesFdwDirectModify || > + resultRelInfo->ri_FdwState) && > + resultRelInfo > mtstate->resultRelInfo + > mtstate->mt_whichplan) > + ereport(ERROR, > + (errcode(ERRCODE_FEATURE_NOT_SUPPORTED), > + errmsg("cannot route tuples into foreign > table to be updated \"%s\"", > + RelationGetRelationName(rel))));
Oh, I see. So IIUC, you're making postgresBeginForeignInsert() check two things: 1. Whether the foreign table it's about to begin inserting (moved) rows into is a subplan result rel, checked by (resultRelInfo->ri_usesFdwDirectModify || resultRelInfo->ri_FdwState) 2. Whether the foreign table it's about to begin inserting (moved) rows into will be updated later, checked by (resultRelInfo > mtstate->resultRelInfo + mtstate->mt_whichplan) This still allows a foreign table to receive rows moved from the local partitions if it has already finished the UPDATE operation. Seems reasonable to me. >> Forgot to say that since this is a separate issue from the original bug >> report, maybe we can first finish fixing the latter. What to do you think? > > Yeah, I think we can do that, but my favorite would be to fix the two > issues in a single shot, because they seem to me rather close to each > other. So I updated a new version in a single patch, which I'm attaching. I agree that we can move to fix the other issue right away as the fix you outlined above seems reasonable, but I wonder if it wouldn't be better to commit the two fixes separately? The two fixes, although small, are somewhat complicated and combining them in a single commit might be confusing. Also, a combined commit might make it harder for the release note author to list down the exact set of problems being fixed. But I guess your commit message will make it clear that two distinct problems are being solved, so maybe that shouldn't be a problem. > Notes: > > * I kept all the changes in the previous patch, because otherwise > postgres_fdw would fail to release resources for a foreign-insert > operation created by postgresBeginForeignInsert() for a tuple-routable > foreign table (ie, a foreign-table subplan resultrel that has been updated > already) during postgresEndForeignInsert(). Hmm are you saying that the cases for which we'll still allow tuple routing (foreign table receiving moved-in rows has already been updated), there will be two fmstates to be released -- the original fmstate (UPDATE's) and aux_fmstate (INSERT's)? > * I revised a comment according to your previous comment, though I changed > a state name: s/sub_fmstate/aux_fmstate/ That seems fine to me. > * DirectModify also has the latter issue. Here is an example: > > postgres=# create table p (a int, b text) partition by list (a); > postgres=# create table p1 partition of p for values in (1); > postgres=# create table p2base (a int check (a = 2 or a = 3), b text); > postgres=# create foreign table p2 partition of p for values in (2, 3) > server loopback options (table_name 'p2base'); > > postgres=# insert into p values (1, 'foo'); > INSERT 0 1 > postgres=# explain verbose update p set a = a + 1; > QUERY PLAN > ----------------------------------------------------------------------------- > Update on public.p (cost=0.00..176.21 rows=2511 width=42) > Update on public.p1 > Foreign Update on public.p2 > -> Seq Scan on public.p1 (cost=0.00..25.88 rows=1270 width=42) > Output: (p1.a + 1), p1.b, p1.ctid > -> Foreign Update on public.p2 (cost=100.00..150.33 rows=1241 width=42) > Remote SQL: UPDATE public.p2base SET a = (a + 1) > (7 rows) > > postgres=# update p set a = a + 1; > UPDATE 2 > postgres=# select * from p; > a | b > ---+----- > 3 | foo > (1 row) Ah, the expected out is "(2, foo)". Also, with RETURNING, you'd get this: update p set a = a + 1 returning tableoid::regclass, *; tableoid │ a │ b ──────────┼───┼───── p2 │ 2 │ foo p2 │ 3 │ foo (2 rows) > As shown in the above bit added to postgresBeginForeignInsert(), I > modified the patch so that we throw an error for cases like this even when > using a direct modification plan, and I also added regression test cases > for that. Thanks for adding detailed tests. Some mostly cosmetic comments on the code changes: * In the following comment: + /* + * If the foreign table is an UPDATE subplan resultrel that hasn't yet + * been updated, routing tuples to the table might yield incorrect + * results, because if routing tuples, routed tuples will be mistakenly + * read from the table and updated twice when updating the table --- it + * would be nice if we could handle this case; but for now, throw an error + * for safety. + */ the part that start with "because if routing tuples..." reads a bit unclear to me. How about writing this as: /* * If the foreign table we are about to insert routed rows into is * also an UPDATE result rel and the UPDATE hasn't been performed yet, * proceeding with the INSERT will result in the later UPDATE * incorrectly modifying those routed rows, so prevent the INSERT --- * it would be nice if we could handle this case, but for now, throw * an error for safety. */ I see that in all the hunks involving some manipulation of aux_fmstate, there's a comment explaining what it is, which seems a bit repetitive. I can see more or less the same explanation in postgresExecForeignInsert(), postgresBeginForeignInsert(), and postgresEndForeignInsert(). Maybe just keep the description in postgresBeginForeignInsert as follows: @@ -1983,7 +2020,19 @@ postgresBeginForeignInsert(ModifyTableState *mtstate, retrieved_attrs != NIL, retrieved_attrs); - resultRelInfo->ri_FdwState = fmstate; + /* + * If the given resultRelInfo already has PgFdwModifyState set, it means + * the foreign table is an UPDATE subplan resultrel; in which case, store + * the resulting state into the aux_fmstate of the PgFdwModifyState. + */ and change the other sites to refer to postgresBeginForeingInsert for the detailed explanation of what aux_fmstate is. BTW, do you think we should update the docs regarding the newly established limitation of row movement involving foreign tables? Thanks, Amit