On 12/1/20 6:17 PM, Ashutosh Bapat wrote:
On Mon, Nov 30, 2020 at 11:56 PM Andrey Lepikhov
<a.lepik...@postgrespro.ru> wrote:

On 30.11.2020 22:38, Tom Lane wrote:
Andrey Lepikhov <a.lepik...@postgrespro.ru> writes:
If you're unhappy with the planning results you get for this,
why don't you have use_remote_estimate turned on?

I have a mixed load model. Large queries are suitable for additional
estimate queries. But for many simple SELECT's that touch a small
portion of the data, the latency has increased significantly. And I
don't know how to switch the use_remote_estimate setting in such case.

You may disable use_remote_estimates for given table or a server. So
if tables participating in short queries are different from those in
the large queries, you could set use_remote_estimate at table level to
turn it off for the first set. Otherwise, we need a FDW level GUC
which can be turned on/off for a given session or a query.

Currently I implemented another technique:
- By default, use_remote_estimate is off.
- On the estimate_path_cost_size() some estimation criteria is checked. If true, we force remote estimation for this JOIN. This approach solves the push-down problem in my case - TPC-H test with 6 servers/instances. But it is not so scalable, as i want.

Generally use_remote_estimate isn't scalable and there have been
discussions about eliminating the need of it. But no concrete proposal
has come yet.

Above I suggested to use results of cost calculation on local JOIN, assuming that in the case of postgres_fdw wrapper very likely, that foreign server will use the same type of join (or even better, if it has some index, for example).
If this approach is of interest, I can investigate it.

--
regards,
Andrey Lepikhov
Postgres Professional


Reply via email to