Ah.. I understood that what you mentioned is the lack of local recheck of foreigh tuples. Sorry for the noise.
At Wed, 14 Oct 2015 17:31:16 +0900, Etsuro Fujita <fujita.ets...@lab.ntt.co.jp> wrote in <561e12d4.7040...@lab.ntt.co.jp> On 2015/10/10 10:17, Robert Haas wrote: > > On Thu, Oct 8, 2015 at 11:00 PM, Etsuro Fujita > > <fujita.ets...@lab.ntt.co.jp> wrote: > >>> The best plan is presumably something like this as you said before: > >>> > >>> LockRows > >>> -> Nested Loop > >>> -> Seq Scan on verysmall v > >>> -> Foreign Scan on bigft1 and bigft2 > >>> Remote SQL: SELECT * FROM bigft1 JOIN bigft2 ON bigft1.x = > >>> bigft2.x AND bigft1.q = $1 AND bigft2.r = $2 > >>> > >>> Consider the EvalPlanQual testing to see if the updated version of a > >>> tuple in v satisfies the query. If we use the column in the testing, > >>> we > >>> would get the wrong results in some cases. > > >> More precisely, we would get the wrong result when the value of v.q or > >> v.r > >> in the updated version has changed. > > > Interesting test case. It's worth considering why this works if you > > were to replace the Foreign Scan with an Index Scan; suppose the query > > is SELECT * FROM verysmall v LEFT JOIN realbiglocaltable t ON v.x = > > t.x FOR UPDATE OF v, so that you get: > > > > LockRows > > -> Nested Loop > > -> Seq Scan on verysmall v > > -> Foreign Scan on realbiglocaltable t > > Index Cond: v.x = t.x > > > > In your example, the remote SQL pushes down certain quals to the > > remote server, and so if we just return the same tuple they might no > > longer be satisfied. In this example, the index qual is essentially a > > filter condition that has been "pushed down" into the index AM. The > > EvalPlanQual machinery prevents this from generating wrong answers by > > rechecking the index cond - see IndexRecheck. Even though it's > > normally the AM's job to enforce the index cond, and the executor does > > not need to recheck, in the EvalPlanQual case it does need to recheck. > > > > I think the foreign data wrapper case should be handled the same way. > > Any condition that we initially pushed down to the foreign server > > needs to be locally rechecked if we're inside EPQ. > > Agreed. > > As KaiGai-san also pointed out before, I think we should address this > in each of the following cases: > > 1) remote qual (scanrelid>0) > 2) remote join (scanrelid==0) > > As for #1, I noticed that there is a bug in handling the same kind of > FDW queries, which will be shown below. As you said, I think this > should be addressed by rechecking the remote quals *locally*. (I > thought another fix for this kind of bug before, though.) IIUC, I > think this should be fixed separately from #2, as this is a bug not > only in 9.5, but in back branches. Please find attached a patch. > > Create an environment: > > mydatabase=# create table t1 (a int primary key, b text); > mydatabase=# insert into t1 select a, 'notsolongtext' from > generate_series(1, 1000000) a; > > postgres=# create server myserver foreign data wrapper postgres_fdw > options (dbname 'mydatabase'); > postgres=# create user mapping for current_user server myserver; > postgres=# create foreign table ft1 (a int, b text) server myserver > options (table_name 't1'); > postgres=# alter foreign table ft1 options (add use_remote_estimate > 'true'); > postgres=# create table inttab (a int); > postgres=# insert into inttab select a from generate_series(1, 10) a; > postgres=# analyze ft1; > postgres=# analyze inttab; > > Run concurrent transactions that produce incorrect result: > > [Terminal1] > postgres=# begin; > BEGIN > postgres=# update inttab set a = a + 1 where a = 1; > UPDATE 1 > > [Terminal2] > postgres=# explain verbose select * from inttab, ft1 where inttab.a = > ft1.a limit 1 for update; > QUERY PLAN > ------------------------------------------------------------------------------------------------- > Limit (cost=100.43..198.99 rows=1 width=70) > Output: inttab.a, ft1.a, ft1.b, inttab.ctid, ft1.* > -> LockRows (cost=100.43..1086.00 rows=10 width=70) > Output: inttab.a, ft1.a, ft1.b, inttab.ctid, ft1.* > -> Nested Loop (cost=100.43..1085.90 rows=10 width=70) > Output: inttab.a, ft1.a, ft1.b, inttab.ctid, ft1.* > -> Seq Scan on public.inttab (cost=0.00..1.10 rows=10 > -> width=10) > Output: inttab.a, inttab.ctid > -> Foreign Scan on public.ft1 (cost=100.43..108.47 rows=1 > -> width=18) > Output: ft1.a, ft1.b, ft1.* > Remote SQL: SELECT a, b FROM public.t1 WHERE > (($1::integer = a)) FOR UPDATE > (11 rows) > > postgres=# select * from inttab, ft1 where inttab.a = ft1.a limit 1 > for update; > > [Terminal1] > postgres=# commit; > COMMIT > > [Terminal2] > (After the commit in Terminal1, the following result will be shown in > Terminal2. Note that the values of inttab.a and ft1.a wouldn't > satisfy the remote qual!) > a | a | b > ---+---+--------------- > 2 | 1 | notsolongtext > (1 row) > > As for #2, I didn't come up with any solution to locally rechecking > pushed-down join conditions against a joined tuple populated from a > column that we discussed. Instead, I'd like to revise a > local-join-execution-plan-based approach that we discussed before, by > addressing your comments such as [1]. Would it be the right way to > go? > > Best regards, > Etsuro Fujita > > [1] > http://www.postgresql.org/message-id/ca+tgmoaazs0dr23r7ptbseqfwotuvcpnbqdhxebo9gi+dmx...@mail.gmail.com -- Kyotaro Horiguchi NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers