On Thu, Feb 4, 2021 at 7:21 PM Etsuro Fujita <[email protected]> wrote: > On Mon, Feb 1, 2021 at 12:06 PM Etsuro Fujita <[email protected]> wrote: > > Rather than doing so, I'd like to propose to allow > > FDWs to disable async execution of them in problematic cases by > > themselves during executor startup in the first cut. What I have in > > mind for that is: > > > > 1) For an FDW that has async-capable ForeignScan(s), we allow the FDW > > to record, for each of the async-capable and non-async-capable > > ForeignScan(s), the information on a connection to be used for the > > ForeignScan into EState during BeginForeignScan(). > > > > 2) After doing ExecProcNode() to each SubPlan and the main query tree > > in InitPlan(), we give the FDW a chance to a) reconsider, for each of > > the async-capable ForeignScan(s), whether the ForeignScan can be > > executed asynchronously as planned, based on the information stored > > into EState in #1, and then b) disable async execution of the > > ForeignScan if not. > > s/ExecProcNode()/ExecInitNode()/. Sorry for that. I’ll post an > updated patch for this in a few days.
I created a WIP patch for this. For #2, I added a new callback
routine ReconsiderAsyncForeignScan(). The routine for postgres_fdw
postgresReconsiderAsyncForeignScan() is pretty simple: async execution
of an async-capable ForeignScan is disabled if the connection used for
it is used in other parts of the query plan tree except async subplans
just below the parent Append. Here is a running example:
postgres=# create table t1 (a int, b int, c text);
postgres=# create table t2 (a int, b int, c text);
postgres=# create foreign table p1 (a int, b int, c text) server
server1 options (table_name 't1');
postgres=# create foreign table p2 (a int, b int, c text) server
server2 options (table_name 't2');
postgres=# create table pt (a int, b int, c text) partition by range (a);
postgres=# alter table pt attach partition p1 for values from (10) to (20);
postgres=# alter table pt attach partition p2 for values from (20) to (30);
postgres=# insert into p1 select 10 + i % 10, i, to_char(i, 'FM0000')
from generate_series(0, 99) i;
postgres=# insert into p2 select 20 + i % 10, i, to_char(i, 'FM0000')
from generate_series(0, 99) i;
postgres=# analyze pt;
postgres=# create table loct (a int, b int);
postgres=# create foreign table ft (a int, b int) server server1
options (table_name 'loct');
postgres=# insert into ft select i, i from generate_series(0, 99) i;
postgres=# analyze ft;
postgres=# create view v as select * from ft;
postgres=# explain verbose select * from pt, v where pt.b = v.b and v.b = 99;
QUERY PLAN
-----------------------------------------------------------------------------------------
Nested Loop (cost=200.00..306.84 rows=2 width=21)
Output: pt.a, pt.b, pt.c, ft.a, ft.b
-> Foreign Scan on public.ft (cost=100.00..102.27 rows=1 width=8)
Output: ft.a, ft.b
Remote SQL: SELECT a, b FROM public.loct WHERE ((b = 99))
-> Append (cost=100.00..204.55 rows=2 width=13)
-> Foreign Scan on public.p1 pt_1 (cost=100.00..102.27
rows=1 width=13)
Output: pt_1.a, pt_1.b, pt_1.c
Remote SQL: SELECT a, b, c FROM public.t1 WHERE ((b = 99))
-> Async Foreign Scan on public.p2 pt_2
(cost=100.00..102.27 rows=1 width=13)
Output: pt_2.a, pt_2.b, pt_2.c
Remote SQL: SELECT a, b, c FROM public.t2 WHERE ((b = 99))
(12 rows)
For this query, while p2 is executed asynchronously, p1 isn’t as it
uses the same connection with ft. BUT:
postgres=# create role view_owner SUPERUSER;
postgres=# create user mapping for view_owner server server1;
postgres=# alter view v owner to view_owner;
postgres=# explain verbose select * from pt, v where pt.b = v.b and v.b = 99;
QUERY PLAN
-----------------------------------------------------------------------------------------
Nested Loop (cost=200.00..306.84 rows=2 width=21)
Output: pt.a, pt.b, pt.c, ft.a, ft.b
-> Foreign Scan on public.ft (cost=100.00..102.27 rows=1 width=8)
Output: ft.a, ft.b
Remote SQL: SELECT a, b FROM public.loct WHERE ((b = 99))
-> Append (cost=100.00..204.55 rows=2 width=13)
-> Async Foreign Scan on public.p1 pt_1
(cost=100.00..102.27 rows=1 width=13)
Output: pt_1.a, pt_1.b, pt_1.c
Remote SQL: SELECT a, b, c FROM public.t1 WHERE ((b = 99))
-> Async Foreign Scan on public.p2 pt_2
(cost=100.00..102.27 rows=1 width=13)
Output: pt_2.a, pt_2.b, pt_2.c
Remote SQL: SELECT a, b, c FROM public.t2 WHERE ((b = 99))
(12 rows)
in this setup, p1 is executed asynchronously as ft doesn’t use the
same connection with p1.
I added to postgresReconsiderAsyncForeignScan() this as well: even if
the connection isn’t used in the other parts, async execution of an
async-capable ForeignScan is disabled if the subplans of the Append
are all async-capable, and they use the same connection, because in
that case the subplans won’t be parallelized at all, and the overhead
of async execution may cause a performance degradation.
Attached is an updated version of the patch. Sorry for the delay.
Best regards,
Etsuro Fujita
async-wip-2021-02-10.patch
Description: Binary data
