Re: bug in update tuple routing with foreign partitions

Amit Langote Thu, 18 Apr 2019 21:01:16 -0700

Fujita-san,

On 2019/04/18 22:10, Etsuro Fujita wrote:
>> On 2019/04/18 14:06, Amit Langote wrote:
>>> On 2019/04/17 21:49, Etsuro Fujita wrote:
>>>> So what I'm thinking is to throw an error for cases like this. 
>>>> (Though, I
>>>> think we should keep to allow rows to be moved from local partitions to a
>>>> foreign-table subplan targetrel that has been updated already.)
>>>
>>> Hmm, how would you distinguish (totally inside postgred_fdw I presume) the
>>> two cases?
> 
> One thing I came up with to do that is this:
> 
> @@ -1917,6 +1937,23 @@ postgresBeginForeignInsert(ModifyTableState *mtstate,
> 
>         initStringInfo(&sql);
> 
> +       /*
> +        * If the foreign table is an UPDATE subplan resultrel that hasn't
> yet
> +        * been updated, routing tuples to the table might yield incorrect
> +        * results, because if routing tuples, routed tuples will be
> mistakenly
> +        * read from the table and updated twice when updating the table
> --- it
> +        * would be nice if we could handle this case; but for now, throw
> an error
> +        * for safety.
> +        */
> +       if (plan && plan->operation == CMD_UPDATE &&
> +               (resultRelInfo->ri_usesFdwDirectModify ||
> +                resultRelInfo->ri_FdwState) &&
> +               resultRelInfo > mtstate->resultRelInfo +
> mtstate->mt_whichplan)
> +               ereport(ERROR,
> +                               (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
> +                                errmsg("cannot route tuples into foreign
> table to be updated \"%s\"",
> + RelationGetRelationName(rel))));


Oh, I see.

So IIUC, you're making postgresBeginForeignInsert() check two things:

1. Whether the foreign table it's about to begin inserting (moved) rows
into is a subplan result rel, checked by
(resultRelInfo->ri_usesFdwDirectModify || resultRelInfo->ri_FdwState)

2. Whether the foreign table it's about to begin inserting (moved) rows
into will be updated later, checked by (resultRelInfo >
mtstate->resultRelInfo + mtstate->mt_whichplan)

This still allows a foreign table to receive rows moved from the local
partitions if it has already finished the UPDATE operation.

Seems reasonable to me.

>> Forgot to say that since this is a separate issue from the original bug
>> report, maybe we can first finish fixing the latter.  What to do you think?
> 
> Yeah, I think we can do that, but my favorite would be to fix the two
> issues in a single shot, because they seem to me rather close to each
> other.  So I updated a new version in a single patch, which I'm attaching.

I agree that we can move to fix the other issue right away as the fix you
outlined above seems reasonable, but I wonder if it wouldn't be better to
commit the two fixes separately?  The two fixes, although small, are
somewhat complicated and combining them in a single commit might be
confusing.  Also, a combined commit might make it harder for the release
note author to list down the exact set of problems being fixed.  But I
guess your commit message will make it clear that two distinct problems
are being solved, so maybe that shouldn't be a problem.

> Notes:
> 
> * I kept all the changes in the previous patch, because otherwise
> postgres_fdw would fail to release resources for a foreign-insert
> operation created by postgresBeginForeignInsert() for a tuple-routable
> foreign table (ie, a foreign-table subplan resultrel that has been updated
> already) during postgresEndForeignInsert().

Hmm are you saying that the cases for which we'll still allow tuple
routing (foreign table receiving moved-in rows has already been updated),
there will be two fmstates to be released -- the original fmstate
(UPDATE's) and aux_fmstate (INSERT's)?

> * I revised a comment according to your previous comment, though I changed
> a state name: s/sub_fmstate/aux_fmstate/

That seems fine to me.

> * DirectModify also has the latter issue.  Here is an example:
> 
> postgres=# create table p (a int, b text) partition by list (a);
> postgres=# create table p1 partition of p for values in (1);
> postgres=# create table p2base (a int check (a = 2 or a = 3), b text);
> postgres=# create foreign table p2 partition of p for values in (2, 3)
> server loopback options (table_name 'p2base');
> 
> postgres=# insert into p values (1, 'foo');
> INSERT 0 1
> postgres=# explain verbose update p set a = a + 1;
>                                  QUERY PLAN
> -----------------------------------------------------------------------------
>  Update on public.p  (cost=0.00..176.21 rows=2511 width=42)
>    Update on public.p1
>    Foreign Update on public.p2
>    ->  Seq Scan on public.p1  (cost=0.00..25.88 rows=1270 width=42)
>          Output: (p1.a + 1), p1.b, p1.ctid
>    ->  Foreign Update on public.p2  (cost=100.00..150.33 rows=1241 width=42)
>          Remote SQL: UPDATE public.p2base SET a = (a + 1)
> (7 rows)
> 
> postgres=# update p set a = a + 1;
> UPDATE 2
> postgres=# select * from p;
>  a |  b
> ---+-----
>  3 | foo
> (1 row)

Ah, the expected out is "(2, foo)".  Also, with RETURNING, you'd get this:

update p set a = a + 1 returning tableoid::regclass, *;
 tableoid │ a │  b
──────────┼───┼─────
 p2       │ 2 │ foo
 p2       │ 3 │ foo
(2 rows)

> As shown in the above bit added to postgresBeginForeignInsert(), I
> modified the patch so that we throw an error for cases like this even when
> using a direct modification plan, and I also added regression test cases
> for that.

Thanks for adding detailed tests.

Some mostly cosmetic comments on the code changes:

* In the following comment:

+    /*
+     * If the foreign table is an UPDATE subplan resultrel that hasn't yet
+     * been updated, routing tuples to the table might yield incorrect
+     * results, because if routing tuples, routed tuples will be mistakenly
+     * read from the table and updated twice when updating the table --- it
+     * would be nice if we could handle this case; but for now, throw an
error
+     * for safety.
+     */

the part that start with "because if routing tuples..." reads a bit
unclear to me. How about writing this as:

    /*
     * If the foreign table we are about to insert routed rows into is
     * also an UPDATE result rel and the UPDATE hasn't been performed yet,
     * proceeding with the INSERT will result in the later UPDATE
     * incorrectly modifying those routed rows, so prevent the INSERT ---
     * it would be nice if we could handle this case, but for now, throw
     * an error for safety.
     */

I see that in all the hunks involving some manipulation of aux_fmstate,
there's a comment explaining what it is, which seems a bit repetitive.  I
can see more or less the same explanation in postgresExecForeignInsert(),
postgresBeginForeignInsert(), and postgresEndForeignInsert().  Maybe just
keep the description in postgresBeginForeignInsert as follows:

@@ -1983,7 +2020,19 @@ postgresBeginForeignInsert(ModifyTableState *mtstate,
                                     retrieved_attrs != NIL,
                                     retrieved_attrs);

-    resultRelInfo->ri_FdwState = fmstate;
+    /*
+     * If the given resultRelInfo already has PgFdwModifyState set, it means
+     * the foreign table is an UPDATE subplan resultrel; in which case, store
+     * the resulting state into the aux_fmstate of the PgFdwModifyState.
+     */

and change the other sites to refer to postgresBeginForeingInsert for the
detailed explanation of what aux_fmstate is.

BTW, do you think we should update the docs regarding the newly
established limitation of row movement involving foreign tables?

Thanks,
Amit

Re: bug in update tuple routing with foreign partitions

Reply via email to