Re: pg_stat_advisor extension
1. In the case of parallel workers the plan_rows value has a different semantics than the number of rows predicted. Just explore get_parallel_divisor(). 2. The extension recommends new statistics immediately upon an error finding. But what if the reason for the error is stale statistics? Or this error may be raised for only one specific set of constants, and estimation will be done well in another 99.% of cases for the same expression. The new parameter, `pg_stat_advisor.analyze_scale_factor`, can suggest the execution of the ANALYZE command on specific tables. The extension now evaluates the ratio of `n_live_tup` (number of live tuples) to `n_mod_since_analyze` (number of modifications since last analyze) in the `pg_stat_all_tables` catalog. If this ratio exceeds the value specified in `analyze_scale_factor`, the extension will suggest an update to the table's statistics. There are a lot of parameters that influences on estimated rows. Statistics might not help improve estimated rows. This feature is designed to provide users with data-driven insights to decide whether updating statistics via the ANALYZE command could potentially improve query performance. By suggesting rather than automatically executing statistics updates, we empower you to make informed decisions based on the specific needs and conditions of your database environment. I've developed an extension that provides suggestions on whether to update or create statistics for your PostgreSQL database, without executing any changes. This approach allows you to consider various parameters that influence row estimates and make informed decisions about optimizing your database's performance. Your feedback is invaluable, and we look forward to hearing about your experiences and any improvements you might suggest. Best regards, Ilia Evdokimov Tantor Labs LLC. From eb998bea96a3640d240afa63e08cc8cf98925bf7 Mon Sep 17 00:00:00 2001 From: Ilia Evdokimov Date: Mon, 13 May 2024 14:41:59 +0300 Subject: [PATCH] 'pg_stat_advisor' extension This service as a hook into executor. It has two GUC-parameters. pg_stat_advisor.analyze_scale_factor: if ratio of pg_stat_all_tables.n_live_tup to pg_stat_all_tables.n_mod_since_analyze is greater than pg_stat_advisor.analyze_scale_factor extension prints suggestion executing ANALYZE command. pg_stat_advisor.suggest_statistics_threshold: the ratio of total rows to planned rows is greater than or equal to this threshold the extension prints suggestion executing the creation of statistics, using the naming format 'relationName_columns' --- contrib/Makefile | 1 + contrib/pg_stat_advisor/Makefile | 20 + .../expected/pg_stat_advisor.out | 52 ++ contrib/pg_stat_advisor/meson.build | 30 + contrib/pg_stat_advisor/pg_stat_advisor.c | 560 ++ .../pg_stat_advisor/sql/pg_stat_advisor.sql | 24 + 6 files changed, 687 insertions(+) create mode 100644 contrib/pg_stat_advisor/Makefile create mode 100644 contrib/pg_stat_advisor/expected/pg_stat_advisor.out create mode 100644 contrib/pg_stat_advisor/meson.build create mode 100644 contrib/pg_stat_advisor/pg_stat_advisor.c create mode 100644 contrib/pg_stat_advisor/sql/pg_stat_advisor.sql diff --git a/contrib/Makefile b/contrib/Makefile index abd780f277..d6ce2fe562 100644 --- a/contrib/Makefile +++ b/contrib/Makefile @@ -33,6 +33,7 @@ SUBDIRS = \ pg_buffercache \ pg_freespacemap \ pg_prewarm \ + pg_stat_advisor \ pg_stat_statements \ pg_surgery \ pg_trgm \ diff --git a/contrib/pg_stat_advisor/Makefile b/contrib/pg_stat_advisor/Makefile new file mode 100644 index 00..f31b939e8a --- /dev/null +++ b/contrib/pg_stat_advisor/Makefile @@ -0,0 +1,20 @@ +# contrib/pg_stat_advisor/Makefile + +MODULE_big = pg_stat_advisor +OBJS = \ + $(WIN32RES) \ + pg_stat_advisor.o +PGFILEDESC = "pg_stat_advisor - analyze query performance and recommend the creation of additional statistics" + +REGRESS = pg_stat_advisor + +ifdef USE_PGXS +PG_CONFIG = pg_config +PGXS := $(shell $(PG_CONFIG) --pgxs) +include $(PGXS) +else +subdir = contrib/pg_stat_advisor +top_builddir = ../.. +include $(top_builddir)/src/Makefile.global +include $(top_srcdir)/contrib/contrib-global.mk +endif diff --git a/contrib/pg_stat_advisor/expected/pg_stat_advisor.out b/contrib/pg_stat_advisor/expected/pg_stat_advisor.out new file mode 100644 index 00..8f3dab2c2f --- /dev/null +++ b/contrib/pg_stat_advisor/expected/pg_stat_advisor.out @@ -0,0 +1,52 @@ +LOAD 'pg_stat_advisor'; +SET pg_stat_advisor.analyze_scale_factor = 0.4; +SET pg_stat_advisor.suggest_statistics_threshold = 0.11; +CREATE TABLE my_tbl(fld_1 INTEGER, fld_2 BIGINT) WITH (autovacuum_enabled = false); +INSERT INTO my_tbl (fld_1, fld_2) +SELECT + i/100 as fld_1, + i/500 as fld_2 +FROM generate_series(1, 1000) s(i); +ANALYZE my_tbl; +INSERT INTO my_tbl (fld_1, fld_2) +SELECT + i/10
Re: pg_stat_advisor extension
On Feb 08 2024 at 07:14:18, Andrei Lepikhov wrote: 1. In the case of parallel workers the plan_rows value has a different semantics than the number of rows predicted. Just explore get_parallel_divisor(). Yes, this is a very weighty and important issue. I need to think about this very carefully. 2. The extension recommends new statistics immediately upon an error finding. But what if the reason for the error is stale statistics? Or this error may be raised for only one specific set of constants, and estimation will be done well in another 99.% of cases for the same expression. According to No.2, it might make sense to collect and track clause combinations and cardinality errors found and let the DBA make decisions on their own. Your proposal is very interesting. In my opinion, it is worth considering updating the extended statistics if they are truly stale. And write about this in a separate message with suggestion updating statistics. If I succeed, then in the next patch I will add the kind of extended statistics to the message, deal with the parallel workers and update statistics if necessary. If you have additional suggestions and thoughts, feel free to write them in this thread. Regards, Ilia Evdokimov, Tantor Labs LLC.
Re: pg_stat_advisor extension
Our further discussion of this new extension takes place in this thread: https://www.postgresql.org/message-id/flat/f822b674-9697-43b9-931b-4d69729a26ff%40tantorlabs.com .Due to technical difficulties in the current thread, I will not be able to conduct a dialogue except in HTML format. And this will make it inconvenient for everyone to read the messages. I apologize for the inconvenience caused. Regards, Ilia Evdokimov, TantorLabs LLC.
Re: pg_stat_advisor extension
On Feb 8 2024 at 00:00:00 jian he >INT MAX >should be 1.0? I don’t know why Konstantin Knizhnik used the ratio of actual tuples to the planned ones, but most who start testing my extension expect that it will be a coefficient from 0 to 1, which will be the ratio of the estimated tuples to the actual ones. Therefore, I changed the value of this coefficient the other way around and now the value can be from 0 to 1. The patch with changes has been attached. > now CREATE STATISTICS, the statistics name is optional I constructed the name of the statistics so that the user could copy the line with 'CREATE STATISTICS' with the mouse and execute this command faster. But if the user wants ITS name, he can do it manually. > here you can explicitly mention the statistics kind would be great I agree with you. That would be my next step. That's why I'm doing it now. > Also since the documentation is limited, more comments explainingSuggestMultiColumnStatisticsForNode would be great. overall the comments are very little, it should be more (that's my opinion). Yes, certainly. I'll do it in the next patch. I'm looking forward to your thoughts and feedback. Regards, Ilia Evdokimov, Tantor Labs LLC. From f87f4a57e532d57f43dab4764d08ddf83d9f3d8f Mon Sep 17 00:00:00 2001 From: Ilia Evdokimov Date: Thu, 8 Feb 2024 16:00:57 +0300 Subject: [PATCH] 'pg_stat_advisor' extension. This serves as a hook into the executor for determining total rows and planned rows. The process starts by checking the `pg_stat_advisor.suggest_statistics_threshold` GUC parameter. If it's set to 0.0, the extension does not proceed further. When the parameter is greater than 0.0, the extension evaluates the accuracy of planned rows. A suggestion for creating statistics is made if the ratio of total rows to planned rows is greater than or equal to this threshold. Only then does it extract the relation and columns from the query. The extension checks pg_statistic_ext for existing relevant statistics. If no statistics are found, it prints a notice suggesting the creation of statistics, using the naming format 'relationName_columns'. Author: Ilia Evdokimov --- contrib/Makefile | 1 + contrib/meson.build | 1 + contrib/pg_stat_advisor/.gitignore| 3 + contrib/pg_stat_advisor/Makefile | 20 + contrib/pg_stat_advisor/README.md | 85 +++ .../expected/pg_stat_advisor.out | 96 contrib/pg_stat_advisor/meson.build | 30 ++ contrib/pg_stat_advisor/pg_stat_advisor.c | 482 ++ .../pg_stat_advisor/sql/pg_stat_advisor.sql | 50 ++ 9 files changed, 768 insertions(+) create mode 100644 contrib/pg_stat_advisor/.gitignore create mode 100644 contrib/pg_stat_advisor/Makefile create mode 100644 contrib/pg_stat_advisor/README.md create mode 100644 contrib/pg_stat_advisor/expected/pg_stat_advisor.out create mode 100644 contrib/pg_stat_advisor/meson.build create mode 100644 contrib/pg_stat_advisor/pg_stat_advisor.c create mode 100644 contrib/pg_stat_advisor/sql/pg_stat_advisor.sql diff --git a/contrib/Makefile b/contrib/Makefile index da4e2316a3..da9a4ceeaa 100644 --- a/contrib/Makefile +++ b/contrib/Makefile @@ -34,6 +34,7 @@ SUBDIRS = \ pg_buffercache \ pg_freespacemap \ pg_prewarm \ + pg_stat_advisor \ pg_stat_statements \ pg_surgery \ pg_trgm \ diff --git a/contrib/meson.build b/contrib/meson.build index c12dc906ca..a20d99443b 100644 --- a/contrib/meson.build +++ b/contrib/meson.build @@ -49,6 +49,7 @@ subdir('pgcrypto') subdir('pg_freespacemap') subdir('pg_prewarm') subdir('pgrowlocks') +subdir('pg_stat_advisor') subdir('pg_stat_statements') subdir('pgstattuple') subdir('pg_surgery') diff --git a/contrib/pg_stat_advisor/.gitignore b/contrib/pg_stat_advisor/.gitignore new file mode 100644 index 00..913175ff6e --- /dev/null +++ b/contrib/pg_stat_advisor/.gitignore @@ -0,0 +1,3 @@ +/log/ +/results/ +/tmp_check/ diff --git a/contrib/pg_stat_advisor/Makefile b/contrib/pg_stat_advisor/Makefile new file mode 100644 index 00..f31b939e8a --- /dev/null +++ b/contrib/pg_stat_advisor/Makefile @@ -0,0 +1,20 @@ +# contrib/pg_stat_advisor/Makefile + +MODULE_big = pg_stat_advisor +OBJS = \ + $(WIN32RES) \ + pg_stat_advisor.o +PGFILEDESC = "pg_stat_advisor - analyze query performance and recommend the creation of additional statistics" + +REGRESS = pg_stat_advisor + +ifdef USE_PGXS +PG_CONFIG = pg_config +PGXS := $(shell $(PG_CONFIG) --pgxs) +include $(PGXS) +else +subdir = contrib/pg_stat_advisor +top_builddir = ../.. +include $(top_builddir)/src/Makefile.global +include $(top_srcdir)/contrib/contrib-global.mk +endif diff --git a/contrib/pg_stat_advisor/README.md b/contrib/pg_stat_advisor/README.md new file mode 100644 index 00..3f4e97f195 --- /dev/null +++ b/contrib/pg_stat_advisor/README.md @@ -0,0 +1,85 @@ +## pg_stat_adv
Re: pg_stat_advisor extension
Dear Team, Firstly, I would like to extend my sincere apologies for the confusion and technical oversights in our previous discussions regarding the 'pg_stat_advisor extension'. To address this and facilitate a clearer, more focused dialogue, I have initiated a new thread to consolidate our discussions on this matter. For context, our previous conversation can be found here: https://www.postgresql.org/message-id/flat/4681151706615977%40mail.yandex.ru. The extension 'pg_stat_advisor' extension is architected to optimize query plan. It operates by suggesting when to create extended statistics, particularly in queries where current selectivity estimates fall short. This is achieved through the GUC parameter 'pg_stat_advisor.suggest_statistics_threshold', which assesses the ratio of total tuples compared to the planned rows. This feature is instrumental in identifying scenarios where the planner's estimates could be optimized. You can install the extension by: ``` LOAD 'pg_stat_advisor' SET pg_stat_advisor.suggest_statistics_threshold = 1.0; ``` Example: ``` EXPLAIN ANALYZE SELECT * FROM t WHERE i = 100 AND j = 10; NOTICE: pg_stat_advisor suggestion: CREATE STATISTICS t_i_j ON i, j FROM t QUERY PLAN ``` After EXPLAIN ANALYZE command you can see the message of suggestion creating statistics with name 't_i_j' on 'i', 'j' columns from 't' table. Thank you for your understanding, patience, and continued support. Best regards, Ilia Evdokimov, Tantor Labs LLC.
Re: pg_stat_advisor extension
Hi hackers, I've encountered and addressed errors in the "0001-pg_stat_advisor-extension.patch" when applying it to the main branch, specifically trailing whitespace issues at lines 117 and 118: ``` 0001-pg_stat_advisor-extension.patch:117: trailing whitespace. QUERY PLAN 0001-pg_stat_advisor-extension.patch:118: trailing whitespace. warning: 2 lines add whitespace errors. ``` An updated patch is attached for review I welcome your insights, feedback, and evaluations regarding the necessity of integrating this new extension into PostgreSQL. Kind regards, Ilia Evdokimov, Tantor Labs LLC. From 6316706c42996219e507bb6ded9dd1e872180e38 Mon Sep 17 00:00:00 2001 From: Ilia Evdokimov Date: Tue, 6 Feb 2024 18:11:04 +0300 Subject: [PATCH] pg_stat_advisor extension --- contrib/Makefile | 1 + contrib/meson.build | 1 + contrib/pg_stat_advisor/.gitignore| 3 + contrib/pg_stat_advisor/Makefile | 20 + contrib/pg_stat_advisor/README.md | 85 .../expected/pg_stat_advisor.out | 96 contrib/pg_stat_advisor/meson.build | 30 ++ contrib/pg_stat_advisor/pg_stat_advisor.c | 477 ++ .../pg_stat_advisor/sql/pg_stat_advisor.sql | 50 ++ 9 files changed, 763 insertions(+) create mode 100644 contrib/pg_stat_advisor/.gitignore create mode 100644 contrib/pg_stat_advisor/Makefile create mode 100644 contrib/pg_stat_advisor/README.md create mode 100644 contrib/pg_stat_advisor/expected/pg_stat_advisor.out create mode 100644 contrib/pg_stat_advisor/meson.build create mode 100644 contrib/pg_stat_advisor/pg_stat_advisor.c create mode 100644 contrib/pg_stat_advisor/sql/pg_stat_advisor.sql diff --git a/contrib/Makefile b/contrib/Makefile index da4e2316a3..da9a4ceeaa 100644 --- a/contrib/Makefile +++ b/contrib/Makefile @@ -34,6 +34,7 @@ SUBDIRS = \ pg_buffercache \ pg_freespacemap \ pg_prewarm \ + pg_stat_advisor \ pg_stat_statements \ pg_surgery \ pg_trgm \ diff --git a/contrib/meson.build b/contrib/meson.build index c12dc906ca..a20d99443b 100644 --- a/contrib/meson.build +++ b/contrib/meson.build @@ -49,6 +49,7 @@ subdir('pgcrypto') subdir('pg_freespacemap') subdir('pg_prewarm') subdir('pgrowlocks') +subdir('pg_stat_advisor') subdir('pg_stat_statements') subdir('pgstattuple') subdir('pg_surgery') diff --git a/contrib/pg_stat_advisor/.gitignore b/contrib/pg_stat_advisor/.gitignore new file mode 100644 index 00..913175ff6e --- /dev/null +++ b/contrib/pg_stat_advisor/.gitignore @@ -0,0 +1,3 @@ +/log/ +/results/ +/tmp_check/ diff --git a/contrib/pg_stat_advisor/Makefile b/contrib/pg_stat_advisor/Makefile new file mode 100644 index 00..f31b939e8a --- /dev/null +++ b/contrib/pg_stat_advisor/Makefile @@ -0,0 +1,20 @@ +# contrib/pg_stat_advisor/Makefile + +MODULE_big = pg_stat_advisor +OBJS = \ + $(WIN32RES) \ + pg_stat_advisor.o +PGFILEDESC = "pg_stat_advisor - analyze query performance and recommend the creation of additional statistics" + +REGRESS = pg_stat_advisor + +ifdef USE_PGXS +PG_CONFIG = pg_config +PGXS := $(shell $(PG_CONFIG) --pgxs) +include $(PGXS) +else +subdir = contrib/pg_stat_advisor +top_builddir = ../.. +include $(top_builddir)/src/Makefile.global +include $(top_srcdir)/contrib/contrib-global.mk +endif diff --git a/contrib/pg_stat_advisor/README.md b/contrib/pg_stat_advisor/README.md new file mode 100644 index 00..f9610f2ed5 --- /dev/null +++ b/contrib/pg_stat_advisor/README.md @@ -0,0 +1,85 @@ +## pg_stat_advisor - PostgreSQL advisor to create statistics + +pg_stat_advisor is a PostgreSQL extension designed to analyze query performance and recommend the creation of additional statistics to improve query plan. + +Append pg_stat_advisor to shared_preload_libraries configuration parameter in your postgresql.conf file then restart the PostgreSQL database to apply the changes. Or you can use "LOAD 'pg_stat_advisor';"command +``` +LOAD 'pg_stat_advisor'; +``` + +There is the pg_stat_advisor.suggest_statistics_threshold GUC that can be used to set a suggest_statistics_threshold. It is the the ratio of total tuples produced compared to the planned rows. If parameter is set by 0, the printing switches off. + +For example: +``` +SET pg_stat_advisor.suggest_statistics_threshold = 1.0; +``` + +Examples: + + +``` +postgres=# create table t (i int, j int); +CREATE TABLE +postgres=# insert into t select i/10, i/100 from generate_series(1, 100) i; +INSERT 0 100 +postgres=# analyze t; +ANALYZE +postgres=# explain analyze select * from t where i = 100 and j = 10; + QUERY PLAN + +-- +-- + Gather (cost=1000.00..11675.10 rows=1 width=8) (actual time=0.526..61.564 rows=10 l
Re: pg_stat_advisor extension
Hi hackers, I'm reaching out again regarding the patch with new extension 'pg_stat_advisor' aimed at enhancing query plan efficiency through the suggestion of creating statistics. I understand the community is busy, but I would greatly value any feedback or thoughts on this extension. Thank you for your time and consideration. Best regards,Ilia Evdokimov,Tantor Labs LLC.