Hello hackers,

At present in postgres_fdw, a query that is fired from the remote end fails to use a parallel plan locally because of the presence of CURSORS. Consider the following example.

Local server

Table:
create table t (i int, j int, k text);
insert into t values (generate_series(1,10000), generate_series(1,10000), 'check_this_out');

Query:
select * from t where i > 1000;

Query plan:
Gather  (cost=0.00..116.08 rows=9000 width=23)
  Workers Planned: 2
  ->  Parallel Seq Scan on t  (cost=0.00..116.08 rows=3750 width=23)
        Filter: (i > 1000)
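For context, the reason the cursor gets in the way is visible in the statements postgres_fdw currently ships to the server that holds the data. The following is only a rough paraphrase of create_cursor() and fetch_more_data() in contrib/postgres_fdw/postgres_fdw.c (the exact strings may differ between versions):

    /* create_cursor(): wrap the remote query in a cursor */
    appendStringInfo(&buf, "DECLARE c%u CURSOR FOR\n%s",
                     fsstate->cursor_number, fsstate->query);

    /* fetch_more_data(): pull rows from it in fetch_size batches */
    snprintf(sql, sizeof(sql), "FETCH %d FROM c%u",
             fsstate->fetch_size, fsstate->cursor_number);

Because execution of a cursor can be suspended between FETCH calls, the planner on that server does not consider parallel plans for it, so the Gather plan above is not available once the same query arrives through the FDW, as the next example shows.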
Foreign server

create extension postgres_fdw;

CREATE SERVER foreign_server
  FOREIGN DATA WRAPPER postgres_fdw
  OPTIONS (host '127.0.0.1', port '5432', dbname 'postgres');

CREATE USER MAPPING FOR user1
  SERVER foreign_server
  OPTIONS (user 'user1');

Table:
CREATE FOREIGN TABLE foreign_table (i int, j int, k text)
  SERVER foreign_server
  OPTIONS (schema_name 'public', table_name 't');

Query:
select * from foreign_table where i > 1000;

Query plan at the local server:
Seq Scan on t  (cost=0.00..189.00 rows=9000 width=23)
  Filter: (i > 1000)

I have used the auto_explain extension to get the query plans at the local server, and also the following settings in postgresql.conf to force parallel plans for the purpose of demonstration:

min_parallel_table_scan_size = 0
parallel_tuple_cost = 0
parallel_setup_cost = 0

With the patch:

set postgres_fdw.use_cursor = false;

Query plan at the local server:
Gather  (cost=0.00..116.08 rows=9000 width=23)
  Workers Planned: 2
  ->  Parallel Seq Scan on t  (cost=0.00..116.08 rows=3750 width=23)
        Filter: (i > 1000)

To overcome this limitation, I have worked on the idea (suggested by my colleague Bernd Helmle) of bypassing the cursors. It works as follows: a new GUC, postgres_fdw.use_cursor, is introduced, which, when set to false, uses the mode without the cursor. In non-cursor mode, create_cursor uses PQsetChunkedRowsMode, with the chunk size equal to fetch_size. Then, in fetch_more_data, pgfdw_get_next_result is used to get the next chunk as a PGresult, which is processed in the same manner as before.

The issue comes with simultaneous queries, which is the case for join queries where all the tables involved in the join are at the local server. In that case we have multiple cursors open at the same time, and without the cursor mechanism we do not have any information or other structure to know what to fetch from which query. To handle that case there is a flag, only_query, which is unset in postgresBeginForeignScan as soon as a cursor_number >= 2 is assigned. When fetch_more_data finds that only_query is unset, it fetches all the data for the query and stores it in a Tuplestore. These tuples are then transferred to fsstate->tuples and processed as usual.
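To illustrate the mechanism the non-cursor mode relies on, here is a minimal standalone libpq client using PQsetChunkedRowsMode (available since PostgreSQL 17). This is only a sketch of the libpq-level flow, not code from the patch; the connection string, query and chunk size are purely illustrative:

    #include <stdio.h>
    #include <libpq-fe.h>

    int
    main(void)
    {
        PGconn     *conn = PQconnectdb("dbname=postgres");
        PGresult   *res;
        int         fetch_size = 100;   /* plays the role of fetch_size */

        if (PQstatus(conn) != CONNECTION_OK)
        {
            fprintf(stderr, "connection failed: %s", PQerrorMessage(conn));
            PQfinish(conn);
            return 1;
        }

        /* Send the query as-is, without wrapping it in a cursor ... */
        if (!PQsendQuery(conn, "SELECT * FROM t WHERE i > 1000"))
        {
            fprintf(stderr, "send failed: %s", PQerrorMessage(conn));
            PQfinish(conn);
            return 1;
        }

        /* ... and ask libpq to hand the result back in fetch_size-row chunks. */
        if (!PQsetChunkedRowsMode(conn, fetch_size))
        {
            fprintf(stderr, "could not enable chunked rows mode\n");
            PQfinish(conn);
            return 1;
        }

        /* Each chunk arrives as a separate PGRES_TUPLES_CHUNK result. */
        while ((res = PQgetResult(conn)) != NULL)
        {
            ExecStatusType st = PQresultStatus(res);

            if (st == PGRES_TUPLES_CHUNK)
                printf("got a chunk of %d rows\n", PQntuples(res));
            else if (st == PGRES_TUPLES_OK)
                printf("query complete\n");   /* zero-row terminating result */
            else
                fprintf(stderr, "error: %s", PQresultErrorMessage(res));

            PQclear(res);
        }

        PQfinish(conn);
        return 0;
    }

Because the query is sent as a plain SELECT rather than through a cursor, the server is free to choose a parallel plan for it, while the client still consumes the result in bounded chunks, much like FETCH did. In the patch, as I understand it, fetch_more_data consumes these PGRES_TUPLES_CHUNK results (via pgfdw_get_next_result) in place of the FETCH replies.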
So yes, there is a performance drawback in the case of simultaneous queries; however, the ability to use parallel plans is a real added advantage for users. Plus, we can keep things as before via the new GUC -- use_cursor -- in case we lose more than we gain for some workloads.

In short, I feel hopeful that this could be a good idea and a good time to improve postgres_fdw. Looking forward to your reviews, comments, etc.

--
Regards,
Rafia Sabih
CYBERTEC PostgreSQL International GmbH

0001-Add-a-fetch-mechanism-without-cursors.patch