Re: [HACKERS] Question about optimising (Postgres_)FDW

Etsuro Fujita Thu, 17 Apr 2014 01:32:30 -0700

(2014/04/16 22:16), Hannu Krosing wrote:

On 04/16/2014 01:35 PM, Etsuro Fujita wrote:

Maybe I'm missing something, but I think that you can do what I think
you'd like to do by the following procedure:

No, what I'd like PostgreSQL to do is to

1. select the id+set from local table
2. select the rows from remote table with WHERE ID IN (<set selected in
step 1>)
3. then join the original set to selected set, with any suitable join
strategy

The things I do not want are

A. selecting all rows from remote table
     (this is what your examples below do)

or

B. selecting rows from remote table by single selects using "ID = $"
     (this is something that I managed to do by some tweaking of costs)

as A will always be slow if there are millions of rows in the remote table,
and B is slow(ish) when the id set is over a few hundred ids.

I hope this is a bit better explanation than I provided before.


Ah, I understand what you'd like to do.  Thank you for the explanation.

P.S. I am not sure if this is a limitation of postgres_fdw or postgres
itself

If I understand correctly, neither the current postgres_fdw planning
function nor the current postgres planner itself supports such a plan.
For that, I think we would probably need to implement a distributed query
processing technique such as semijoin or bloomjoin in those modules.
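
To make the semijoin idea concrete, here is a rough sketch of the three
steps you listed, written as client-side Python rather than as planner
code. It is only an illustration under assumptions, not how postgres_fdw
works today: two in-memory SQLite databases stand in for the local and
remote servers, and the table names (local_t, remote_t), the column
names, and the helper functions are all made up for the example.

```python
import sqlite3


def make_demo_dbs():
    """Build two in-memory databases standing in for local and remote servers.

    All table and column names here are hypothetical.
    """
    local = sqlite3.connect(":memory:")
    local.execute("CREATE TABLE local_t (id INTEGER PRIMARY KEY, name TEXT)")
    local.executemany(
        "INSERT INTO local_t VALUES (?, ?)",
        [(1, "alpha"), (2, "beta"), (3, "avocado")],
    )

    remote = sqlite3.connect(":memory:")
    remote.execute("CREATE TABLE remote_t (id INTEGER, val TEXT)")
    remote.executemany(
        "INSERT INTO remote_t VALUES (?, ?)",
        [(i, f"v{i}") for i in range(1, 1001)],
    )
    return local, remote


def semijoin(local, remote):
    # Step 1: select the id plus the wanted columns from the local table.
    local_rows = local.execute(
        "SELECT id, name FROM local_t WHERE name LIKE 'a%'"
    ).fetchall()
    ids = [row[0] for row in local_rows]
    if not ids:
        return []

    # Step 2: fetch only the matching remote rows in ONE round trip,
    # using WHERE id IN (...) instead of scanning the whole remote table
    # (case A) or issuing one "id = $1" query per row (case B).
    placeholders = ",".join("?" * len(ids))
    remote_rows = remote.execute(
        f"SELECT id, val FROM remote_t WHERE id IN ({placeholders})", ids
    ).fetchall()

    # Step 3: join the two sets locally; a hash join is used here, but
    # any join strategy would do.
    by_id = {}
    for rid, val in remote_rows:
        by_id.setdefault(rid, []).append(val)
    return sorted(
        (lid, name, val)
        for lid, name in local_rows
        for val in by_id.get(lid, [])
    )


if __name__ == "__main__":
    local_db, remote_db = make_demo_dbs()
    print(semijoin(local_db, remote_db))
```

The point of the sketch is only that step 2 ships the id set to the remote
side once. A very large id set would hit the bound-parameter limit of the
remote side, which is where something like a bloomjoin would come in.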


Thanks,

P.S.

or, that as Tom mentioned, by disabling the use_remote_estimate function:


I misunderstood the meaning of what Tom pointed out.  Sorry for that.

Best regards,
Etsuro Fujita


--
Sent via pgsql-hackers mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
