Re: [HACKERS] Foreign join pushdown vs EvalPlanQual

Kyotaro HORIGUCHI Wed, 14 Oct 2015 01:56:18 -0700

Ah.. 

I understood that what you mentioned is the lack of local recheck
of foreigh tuples. Sorry for the noise.



At Wed, 14 Oct 2015 17:31:16 +0900, Etsuro Fujita <fujita.ets...@lab.ntt.co.jp> 
wrote in <561e12d4.7040...@lab.ntt.co.jp>
On 2015/10/10 10:17, Robert Haas wrote:
> > On Thu, Oct 8, 2015 at 11:00 PM, Etsuro Fujita
> > <fujita.ets...@lab.ntt.co.jp> wrote:
> >>> The best plan is presumably something like this as you said before:
> >>>
> >>> LockRows
> >>> -> Nested Loop
> >>>      -> Seq Scan on verysmall v
> >>>      -> Foreign Scan on bigft1 and bigft2
> >>>           Remote SQL: SELECT * FROM bigft1 JOIN bigft2 ON bigft1.x =
> >>> bigft2.x AND bigft1.q = $1 AND bigft2.r = $2
> >>>
> >>> Consider the EvalPlanQual testing to see if the updated version of a
> >>> tuple in v satisfies the query.  If we use the column in the testing,
> >>> we
> >>> would get the wrong results in some cases.
> 
> >> More precisely, we would get the wrong result when the value of v.q or
> >> v.r
> >> in the updated version has changed.
> 
> > Interesting test case.  It's worth considering why this works if you
> > were to replace the Foreign Scan with an Index Scan; suppose the query
> > is SELECT * FROM verysmall v LEFT JOIN realbiglocaltable t ON v.x =
> > t.x FOR UPDATE OF v, so that you get:
> >
> > LockRows
> > -> Nested Loop
> >    -> Seq Scan on verysmall v
> >    -> Foreign Scan on realbiglocaltable t
> >      Index Cond: v.x = t.x
> >
> > In your example, the remote SQL pushes down certain quals to the
> > remote server, and so if we just return the same tuple they might no
> > longer be satisfied.  In this example, the index qual is essentially a
> > filter condition that has been "pushed down" into the index AM.  The
> > EvalPlanQual machinery prevents this from generating wrong answers by
> > rechecking the index cond - see IndexRecheck.  Even though it's
> > normally the AM's job to enforce the index cond, and the executor does
> > not need to recheck, in the EvalPlanQual case it does need to recheck.
> >
> > I think the foreign data wrapper case should be handled the same way.
> > Any condition that we initially pushed down to the foreign server
> > needs to be locally rechecked if we're inside EPQ.
> 
> Agreed.
> 
> As KaiGai-san also pointed out before, I think we should address this
> in each of the following cases:
> 
> 1) remote qual (scanrelid>0)
> 2) remote join (scanrelid==0)
> 
> As for #1, I noticed that there is a bug in handling the same kind of
> FDW queries, which will be shown below.  As you said, I think this
> should be addressed by rechecking the remote quals *locally*.  (I
> thought another fix for this kind of bug before, though.)  IIUC, I
> think this should be fixed separately from #2, as this is a bug not
> only in 9.5, but in back branches.  Please find attached a patch.
> 
> Create an environment:
> 
> mydatabase=# create table t1 (a int primary key, b text);
> mydatabase=# insert into t1 select a, 'notsolongtext' from
> generate_series(1, 1000000) a;
> 
> postgres=# create server myserver foreign data wrapper postgres_fdw
> options (dbname 'mydatabase');
> postgres=# create user mapping for current_user server myserver;
> postgres=# create foreign table ft1 (a int, b text) server myserver
> options (table_name 't1');
> postgres=# alter foreign table ft1 options (add use_remote_estimate
> 'true');
> postgres=# create table inttab (a int);
> postgres=# insert into inttab select a from generate_series(1, 10) a;
> postgres=# analyze ft1;
> postgres=# analyze inttab;
> 
> Run concurrent transactions that produce incorrect result:
> 
> [Terminal1]
> postgres=# begin;
> BEGIN
> postgres=# update inttab set a = a + 1 where a = 1;
> UPDATE 1
> 
> [Terminal2]
> postgres=# explain verbose select * from inttab, ft1 where inttab.a =
> ft1.a limit 1 for update;
>                                            QUERY PLAN
> -------------------------------------------------------------------------------------------------
>  Limit  (cost=100.43..198.99 rows=1 width=70)
>    Output: inttab.a, ft1.a, ft1.b, inttab.ctid, ft1.*
>    ->  LockRows  (cost=100.43..1086.00 rows=10 width=70)
>          Output: inttab.a, ft1.a, ft1.b, inttab.ctid, ft1.*
>          ->  Nested Loop  (cost=100.43..1085.90 rows=10 width=70)
>                Output: inttab.a, ft1.a, ft1.b, inttab.ctid, ft1.*
>                ->  Seq Scan on public.inttab (cost=0.00..1.10 rows=10
>                ->  width=10)
>                      Output: inttab.a, inttab.ctid
>                ->  Foreign Scan on public.ft1 (cost=100.43..108.47 rows=1
>                ->  width=18)
>                      Output: ft1.a, ft1.b, ft1.*
>                      Remote SQL: SELECT a, b FROM public.t1 WHERE
>                      (($1::integer = a)) FOR UPDATE
> (11 rows)
> 
> postgres=# select * from inttab, ft1 where inttab.a = ft1.a limit 1
> for update;
> 
> [Terminal1]
> postgres=# commit;
> COMMIT
> 
> [Terminal2]
> (After the commit in Terminal1, the following result will be shown in
> Terminal2.  Note that the values of inttab.a and ft1.a wouldn't
> satisfy the remote qual!)
>  a | a |       b
> ---+---+---------------
>  2 | 1 | notsolongtext
> (1 row)
> 
> As for #2, I didn't come up with any solution to locally rechecking
> pushed-down join conditions against a joined tuple populated from a
> column that we discussed.  Instead, I'd like to revise a
> local-join-execution-plan-based approach that we discussed before, by
> addressing your comments such as [1].  Would it be the right way to
> go?
> 
> Best regards,
> Etsuro Fujita
> 
> [1]
> http://www.postgresql.org/message-id/ca+tgmoaazs0dr23r7ptbseqfwotuvcpnbqdhxebo9gi+dmx...@mail.gmail.com

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Foreign join pushdown vs EvalPlanQual

Reply via email to