Hi Ingolf,

On 2. Feb 2021, at 13:05, Markhof, Ingolf 
<ingolf.mark...@de.verizon.com> wrote:

Hi!

My PostgreSQL version is 11.8.

The query I am running refers to a number of foreign tables. The first one 
(table1) has two IDs, let's say ID_A and ID_B. While ID_A is unique, ID_B is 
not. In my case, I am pulling information for a value of ID_B for which about 
800 rows (with unique ID_A) exist. I found:

While

select * from my_view where id_b='some value';

seemingly runs "forever" (I cancelled execution after a few hours), the 
following completes in about 1 hr:

select * from my_view where id_a in (
       select id_a from table1 where id_b='some value'
    );

So, I tried smaller chunks of ID_A and found the execution time is non-linear 
with respect to the number of IDs. For example, with 50 ID_A's it completed in 
about 12 seconds.

[...]

I then found the option fetch_size, e.g. ALTER SERVER some_server OPTIONS 
(fetch_size '50000'). A chunk of 50 now executes in 2 seconds (instead of 12 
before).

Thanks for the additional info. I tried to replicate this; you can find the 
GitHub Gist at [1], and I am happy to hear your feedback about it.

What I can see from the execution plans in my example is that postgres_fdw 
pushes down this part

    Remote SQL: SELECT id_a FROM public.a WHERE ((id_b = 1))

to the remote. On the remote DB this query can result in either an index-only 
scan, an index scan or a full table scan. Which method is chosen depends on 
table size and indexes. Given the nature of postgres_fdw, this will in any case 
be done with a sequential query. For this part I would claim:

- If the planner expects few rows, it will choose an index lookup, which is a 
good thing because it effectively reduces the amount of data that needs to be 
queried. This would make it fast, given it runs as a sequential query.
- If the planner expects many rows, it might choose a sequential scan, which 
can be slow depending on the overall size of the table and on whether it is 
cached or not.
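
For reference, the Remote SQL lines come from running EXPLAIN with the VERBOSE 
option on the local server. With your schema it would look roughly like this 
(my_view, table1 and the literal are just the placeholders from your mail):

    -- Run on the local server; each Foreign Scan node shows a
    -- "Remote SQL:" line with the query sent to the remote.
    EXPLAIN (ANALYZE, VERBOSE)
    SELECT * FROM my_view
    WHERE id_a IN (SELECT id_a FROM table1 WHERE id_b = 'some value');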

So, I found the "size" of the query has a serious impact on the execution time. 
I don't really understand why executing 16*50 takes only 16*2 seconds, but 
executing 1*800 takes about 3000 seconds...

The mentioned fetch_size parameter has a positive effect because one can grab 
many more rows and return them at the same time. In the worst case (and this is 
pure assumption), the query might be re-executed on each new fetch, and the 
runtime thus grows much larger.
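
Just for completeness, the option can be set per server or per foreign table; 
the server and table names below are placeholders, and you need SET instead of 
ADD if the option is already present:

    -- Server-wide default for all foreign tables of this server
    ALTER SERVER some_server OPTIONS (ADD fetch_size '50000');

    -- Or override it for a single foreign table
    ALTER FOREIGN TABLE table1 OPTIONS (ADD fetch_size '50000');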

Further up in the plan, I see

    Remote SQL: SELECT id_a, id_b FROM public.a

which is the "SELECT * FROM my_view" part. Meaning, here it will definitely do 
a full table scan on the remote, since it cannot push down the IN condition. I 
don't really see right now why this query is slower than your original form at 
all. In my experiment it is not, but maybe I am doing something wrong in the 
schema.

One thought would be, however, that the full table scan on the remote is more 
efficient than pushing down the filter, and thus it returns faster.
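
If you want to test that thought, you could run both query shapes directly on 
the remote server and compare; public.a and the filter value are taken from my 
example, so adjust them to your schema:

    -- On the remote: the unfiltered query sent for the view
    EXPLAIN (ANALYZE, BUFFERS) SELECT id_a, id_b FROM public.a;

    -- The pushed-down, filtered variant
    EXPLAIN (ANALYZE, BUFFERS) SELECT id_a FROM public.a WHERE id_b = 1;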

To really figure out more, I would suggest increasing the logging level on 
your remote server in order to see which queries are really executed. Even 
better, use auto_explain to capture the plans and see whether these claims 
apply.
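
A minimal sketch of what I mean, to be run as superuser on the remote server; 
logging everything as below is only a starting point and can get noisy on a 
busy system:

    -- LOAD makes the auto_explain parameters known in this session,
    -- so that ALTER SYSTEM accepts them.
    LOAD 'auto_explain';
    ALTER SYSTEM SET log_min_duration_statement = 0;
    ALTER SYSTEM SET session_preload_libraries = 'auto_explain';
    ALTER SYSTEM SET auto_explain.log_min_duration = 0;
    ALTER SYSTEM SET auto_explain.log_analyze = on;
    -- New connections, including the ones postgres_fdw opens,
    -- will pick this up after the reload.
    SELECT pg_reload_conf();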

Best,
Sebastian

--


[1]: https://gist.github.com/sdressler/9a93d66b7052dc75ec45c0a4bf5c61de

Sebastian Dressler, Solution Architect, Swarm64
+49 30 994 0496 72 | sebast...@swarm64.com
