(2019/04/11 14:33), Amit Langote wrote:
On 2019/04/10 17:38, Etsuro Fujita wrote:
(2019/03/06 18:33), Amit Langote wrote:
I noticed a bug with how UPDATE tuple routing initializes ResultRelInfos
to use for partition routing targets.  Specifically, the bug occurs when
UPDATE targets include a foreign partition that is locally modified (as
opposed to being modified directly on the remove server) and its
ResultRelInfo (called subplan result rel in the source code) is reused for
tuple routing:

The solution is to teach ExecSetupPartitionTupleRouting to avoid using a
subplan result rel if its ri_FdwState is already set.

I'm not sure that is a good idea, because that requires to create a new
ResultRelInfo, which is not free; I think it requires a lot of work.

After considering this a bit more and studying your proposed solution, I
tend to agree.  Beside the performance considerations you point out that I
too think are valid, I realize that going with my approach would mean
embedding some assumptions in the core code about ri_FdwState; in this
case, assuming that a subplan result rel's ri_FdwState is not to be
re-purposed for tuple routing insert, which is only based on looking at
postgres_fdw's implementation.

Yeah, that assumption might not apply to some other FDWs.

Not to mention the ensuing complexity in
the core tuple routing code -- it now needs to account for the fact that
not all subplan result rels may have been re-purposed for tuple-routing,
something that's too complicated to handle in PG 11.

I think so too.

IOW, let's call this a postgres_fdw bug and fix it there as you propose.

Agreed.

Another solution to avoid that would be to store the fmstate created by
postgresBeginForeignInsert() into the ri_FdwState, not overwrite the
ri_FdwState, like the attached.  This would not need any changes to the
core, and this would not cause any overheads either, IIUC.  What do you
think about that?

+1.  Just one comment:

+    /*
+     * If the given resultRelInfo already has PgFdwModifyState set, it means
+     * the foreign table is an UPDATE subplan resultrel; in which case, store
+     * the resulting state into the PgFdwModifyState.
+     */

The last "PgFdwModifyState" should be something else?  Maybe,
sub-PgModifyState or sub_fmstate?

OK, will revise.  My favorite would be the latter.

I've also added a
test in postgres_fdw.sql to exercise this test case.

Thanks again, but the test case you added works well without any fix:

Oops, I should really adopt a habit of adding a test case before code.

+-- Check case where the foreign partition is a subplan target rel and
+-- foreign parttion is locally modified (target table being joined
+-- locally prevents a direct/remote modification plan).
+explain (verbose, costs off)
+update utrtest set a = 1 from (values (1), (2)) s(x) where a = s.x
returning *;
+                                  QUERY PLAN
+------------------------------------------------------------------------------

+ Update on public.utrtest
+   Output: remp.a, remp.b, "*VALUES*".column1
+   Foreign Update on public.remp
+     Remote SQL: UPDATE public.loct SET a = $2 WHERE ctid = $1 RETURNING
a, b
+   Update on public.locp
+   ->   Hash Join
+         Output: 1, remp.b, remp.ctid, "*VALUES*".*, "*VALUES*".column1
+         Hash Cond: (remp.a = "*VALUES*".column1)
+         ->   Foreign Scan on public.remp
+               Output: remp.b, remp.ctid, remp.a
+               Remote SQL: SELECT a, b, ctid FROM public.loct FOR UPDATE
+         ->   Hash
+               Output: "*VALUES*".*, "*VALUES*".column1
+               ->   Values Scan on "*VALUES*"
+                     Output: "*VALUES*".*, "*VALUES*".column1
+   ->   Hash Join
+         Output: 1, locp.b, locp.ctid, "*VALUES*".*, "*VALUES*".column1
+         Hash Cond: (locp.a = "*VALUES*".column1)
+         ->   Seq Scan on public.locp
+               Output: locp.b, locp.ctid, locp.a
+         ->   Hash
+               Output: "*VALUES*".*, "*VALUES*".column1
+               ->   Values Scan on "*VALUES*"
+                     Output: "*VALUES*".*, "*VALUES*".column1
+(24 rows)

In this test case, the foreign partition is updated before ri_FdwState is
overwritten by postgresBeginForeignInsert() that's invoked by the tuple
routing code, so it would work without any fix.  So I modified this test
case as such.

Thanks for fixing that.   I hadn't noticed that foreign partition remp is
updated before local partition locp; should've added a locp0 for values
(0) so that remp appears second in the ModifyTable's subplans.  Or, well,
make the foreign table remp appear later by making it a partition for
values in (3) as your patch does.

BTW, have you noticed that the RETURNING clause returns the same row twice?

I noticed this, but I didn't think it hard.  :(

+update utrtest set a = 3 from (values (2), (3)) s(x) where a = s.x
returning *;
+ a |         b         | x
+---+-------------------+---
+ 3 | qux triggered !   | 2
+ 3 | xyzzy triggered ! | 3
+ 3 | qux triggered !   | 3
+(3 rows)

You can see that the row that's moved into remp is returned twice (one
with "qux triggered!" in b column), whereas it should've been only once?

Yeah, this is unexpected behavior, so will look into this. Thanks for pointing that out!

Best regards,
Etsuro Fujita



Reply via email to