Re: performance of analytical query

2021-11-23 Thread Justin Pryzby
On Fri, Nov 12, 2021 at 09:12:38PM +0100, Jiří Fejfar wrote:
> * I know that PG is focused on OLTP rather then analytics, but we are happy
> with it at all and do not wish to use another engine for analytical
> queries... isn't somewhere some "PG analytical best practice" available?

It's a good question.  Here's some ideas:

I don't think we know what version you're using - that's important, and there's
other ideas here:
https://wiki.postgresql.org/wiki/Slow_Query_Questions

You said that your query was slow "probably after VACUUM ANALYZE".
Is it really faster without stats ?  You can do this to see if there was really
a better plan "before":
| begin; DELETE FROM pg_statistic WHERE starelid='thetable'::regclass; explain 
analyze ...; rollback;

Try enable_nestloop=off for analytic queries;

Test whether jit=off helps you or hurts you (you said that it's already 
disabled);

You can do other things that can improve estimates, by sacrificing planning time
(which for an analytic query is a small component of the total query time, and
pays off at runtime if you can get a btter plan):
 - FKs can help with estimates since pg9.6;
 - CREATE STATISTICS;
 - ALTER SET STATISTICS or increase default_statistics_target;
 - increase from_collapse_limit and join_collapse_limit.  But I don't think it
   will help your current query plan.
 - partitioning data increases planning time, and (if done well) can allow
   improved execution plans;

You can REINDEX or maybe CLUSTER during "off hours" to optimize indexes/tables.

BRIN indexes (WITH autoanalyze) are very successful for us, here.

You can monitor your slow queries using auto_explain and/or pg_stat_statements.

You can reduce autovacuum_analyze_threshold to analyze more often.

I'd be interested to hear if others have more suggestions.

-- 
Justin




Re: performance of analytical query

2021-11-12 Thread Jiří Fejfar
On Fri, 12 Nov 2021 at 03:41, Justin Pryzby  wrote:

> On Thu, Nov 11, 2021 at 08:20:57PM +0100, Jiří Fejfar wrote:
> > Hi folks,
> >
> > we have found that (probably after VACUUM ANALYZE) one analytical query
> > starts to be slow on our production DB. Moreover, more or less the same
> > plan is used on our testing data (how to restore our testing data is
> > described at the end of this email), or better to say the same problem
> > exists in both (production vs testing data) scenarios: nested loop
> scanning
> > CTE several thousand times is used due to the bad estimates:
> > https://explain.dalibo.com/plan/sER#plan/node/87 (query is included on
> > dalibo).
>
> > Do you have any idea how to get HASH JOINS in the CTE w_1p_data instead
> of
> > NESTED LOOPs?
> > * Add some statistics to not get bad estimates on "lower-level" CTEs?
>
> Do you know why the estimates are bad ?
>
> I have no clear insight at the moment... problem is probably with bad
estimates which chain along the whole tree of nodes... one bad estimate was
after aggregation for example... probably, I would need to explore
carefully whole execution plan and identify sources of unprecise estimates
and correct it with additional, more precise statistics when possible,
right?


> Index Scan using t_map_plot_cell__cell_gid__idx on cm_plot2cell_mapping
> cm_plot2cell_mapping (cost=0.29..18.59 rows=381 width=12) (actual
> time=0.015..2.373 rows=3,898 loops=1)
> Index Cond: (cm_plot2cell_mapping.estimation_cell =
> f_a_cell.estimation_cell)
> Buffers: shared hit=110
>
> I don't know, but is the estimate for this portion of the plan improved by
> doing:
> | ALTER TABLE f_a_cell ALTER estimation_cell SET STATISTICS 500; ANALYZE
> f_a_cell;
>
> this does not help to the plan as a whole... but I am thinking about
increasing this parameter (size of sample) at the DB level


> > * In a slightly more complicated function I used temporary tables to be
> > able to narrow statistics [2] but I am afraid of system table bloating
> > because of the huge amount of usage of this function on the production
> > (hundred thousand of calls by day when data are to be analyzed).
>
> I would try this for sure - I think hundreds of calls per day would be no
> problem.  If you're concerned, you could add manual calls to do (for
> example)
> VACUUM pg_attribute; after dropping the temp tables.
>
> it is hundreds of thousands of calls (10^5) ... but yes I got some hints
how to avoid bloating (basically use temp tables longer and truncate them
instead of deleting when possible)


> BTW, we disable nested loops for the our analytic report queries.  I have
> never
> been able to avoid pathological plans any other way.
>

I will think about that.

AND

we further simplified the query and get again one good execution plan
https://explain.dalibo.com/plan/tCk :-)

I have some thoughts now:

* I know that PG is focused on OLTP rather then analytics, but we are happy
with it at all and do not wish to use another engine for analytical
queries... isn't somewhere some "PG analytical best practice" available?
* It seems that the the form / style of query has great impact on execution
plans... I was very happy with writing queries as CTEs on top of other CTEs
or layering VIEWS because you can really focus on the semantics of the
problem and I hoped that planner will somehow magically "compile" my code
and get something good enough with respect to performance. Of course, I
have to not use materialized CTEs, but it was not possible with NOT
MATERIALIZED version as performance was bad and I was not able even to get
oriented in exec. plan...

Thank you for your ideas! J.


Re: performance of analytical query

2021-11-12 Thread Justin Pryzby
On Fri, Nov 12, 2021 at 10:55:53AM -0700, Michael Lewis wrote:
> On Thu, Nov 11, 2021 at 7:42 PM Justin Pryzby  wrote:
> 
> > BTW, we disable nested loops for the our analytic report queries.  I have
> > never
> > been able to avoid pathological plans any other way.
> 
> Curious, do you see any problems from that? Are there certain nodes that
> really are best suited to a nested loop like a lateral subquery?

When I first disabled it years ago, I did it for the entire database, and it
caused issues with a more interactive, non-analytic query, on a non-partitioned
table.

So my second attempt was to disable nested loops only during report queries,
and I have not looked back.  For our report queries on partitioned tables, the
overhead of hashing a handful of rows is of no significance.  Any query that
finishes in 1sec would be exceptionally fast.

BTW, Jiří's inquiry caused me to look at the source of one of our historic
mis-estimates, and to realize that it's resolved in pg14:
https://www.postgresql.org/message-id/2022173102.GI17618%40telsasoft.com

I doubt that's enough to avoid catastrophic nested loop plans in every case
(especially CTEs on top of CTEs).

There was a discussion about discouraging nested loop plans that weren't
provably "safe" (due to returning at most one row, due to a unique index).
https://www.postgresql.org/message-id/CA%2BTgmoYtWXNpj6D92XxUfjT_YFmi2dWq1XXM9EY-CRcr2qmqbg%40mail.gmail.com

-- 
Justin




Re: performance of analytical query

2021-11-12 Thread Michael Lewis
On Thu, Nov 11, 2021 at 7:42 PM Justin Pryzby  wrote:

> BTW, we disable nested loops for the our analytic report queries.  I have
> never
> been able to avoid pathological plans any other way.
>

Curious, do you see any problems from that? Are there certain nodes that
really are best suited to a nested loop like a lateral subquery?


Re: performance of analytical query

2021-11-11 Thread Justin Pryzby
On Thu, Nov 11, 2021 at 08:20:57PM +0100, Jiří Fejfar wrote:
> Hi folks,
> 
> we have found that (probably after VACUUM ANALYZE) one analytical query
> starts to be slow on our production DB. Moreover, more or less the same
> plan is used on our testing data (how to restore our testing data is
> described at the end of this email), or better to say the same problem
> exists in both (production vs testing data) scenarios: nested loop scanning
> CTE several thousand times is used due to the bad estimates:
> https://explain.dalibo.com/plan/sER#plan/node/87 (query is included on
> dalibo).

> Do you have any idea how to get HASH JOINS in the CTE w_1p_data instead of
> NESTED LOOPs?
> * Add some statistics to not get bad estimates on "lower-level" CTEs?

Do you know why the estimates are bad ?

Index Scan using t_map_plot_cell__cell_gid__idx on cm_plot2cell_mapping 
cm_plot2cell_mapping (cost=0.29..18.59 rows=381 width=12) (actual 
time=0.015..2.373 rows=3,898 loops=1)
Index Cond: (cm_plot2cell_mapping.estimation_cell = 
f_a_cell.estimation_cell)
Buffers: shared hit=110

I don't know, but is the estimate for this portion of the plan improved by 
doing:
| ALTER TABLE f_a_cell ALTER estimation_cell SET STATISTICS 500; ANALYZE 
f_a_cell;

> * In a slightly more complicated function I used temporary tables to be
> able to narrow statistics [2] but I am afraid of system table bloating
> because of the huge amount of usage of this function on the production
> (hundred thousand of calls by day when data are to be analyzed).

I would try this for sure - I think hundreds of calls per day would be no
problem.  If you're concerned, you could add manual calls to do (for example)
VACUUM pg_attribute; after dropping the temp tables.

BTW, we disable nested loops for the our analytic report queries.  I have never
been able to avoid pathological plans any other way.




performance of analytical query

2021-11-11 Thread Jiří Fejfar
Hi folks,

we have found that (probably after VACUUM ANALYZE) one analytical query
starts to be slow on our production DB. Moreover, more or less the same
plan is used on our testing data (how to restore our testing data is
described at the end of this email), or better to say the same problem
exists in both (production vs testing data) scenarios: nested loop scanning
CTE several thousand times is used due to the bad estimates:
https://explain.dalibo.com/plan/sER#plan/node/87 (query is included on
dalibo).

We improved the query guided by some intuitive thoughts about how it works
and get a much faster (120x) plan
https://explain.dalibo.com/plan/M21#plan/node/68. We continued with further
improvement/simplification of the query but we get again a similar plan
https://explain.dalibo.com/plan/nLb#plan/node/72 with nested loop and with
original inferior performance. I realized that the success of the
intermediate plan (M21) is somewhat random as is based on bad estimates as
well.

Further, I tried version forcing to not materialize CTE
https://explain.dalibo.com/plan/0Tp#plan and version using PG default CTE
materialization policy https://explain.dalibo.com/plan/g7M#plan/node/68.
Both with no success.

Do you have any idea how to get HASH JOINS in the CTE w_1p_data instead of
NESTED LOOPs?
* Add some statistics to not get bad estimates on "lower-level" CTEs?
* Some PG configuration (I am currently only disabling JIT [1])?
* Rewrite that query into several smaller pieces and use PL/pgSQL to put it
together?
* In a slightly more complicated function I used temporary tables to be
able to narrow statistics [2] but I am afraid of system table bloating
because of the huge amount of usage of this function on the production
(hundred thousand of calls by day when data are to be analyzed).

---
how to restore data
===
ERD of the schema is also available [3].

testing data as a part of an extension
---
It is possible to install [4] the extension
https://gitlab.com/nfiesta/nfiesta_pg and run regression tests [5] (make
installcheck-all). This will create database contrib_regression_fst_1p
(besides other DBs) and populate this DB with the testing data. The
regression test fst_1p_data is in fact testing functionality/code, which I
am experimenting with.

using DB dump (without extension)
--
It is also possible to create mentioned testing DB by simply downloading DB
dumps from the link
https://drive.google.com/drive/folders/1OVJEISpfuvbxPQG1ArDmSQxZByNZN0xG?usp=sharing
followed by creating DB with postgis extension and restoring dumps:
* perf_test.sql (format plain) to be used with psql \i
* perf_test.dump to be used with pg_restore...

Thank you for possible suggestions, Jiří.

[1] https://gitlab.com/nfiesta/nfiesta_pg/-/blob/master/.gitlab-ci.yml#L10
[2]
https://gitlab.com/nfiesta/nfiesta_pg/-/blob/master/functions/extschema/fn_2p_data.sql#L79
[3] https://gitlab.com/nfiesta/nfiesta_pg/-/wikis/Data-Storage#v25x.
[4] https://gitlab.com/nfiesta/nfiesta_pg/-/wikis/Installation
[5] https://gitlab.com/nfiesta/nfiesta_pg/-/jobs/1762550188