Replacing constants in pg_stat_statements is on a best effort basis.
It is not unlikely that on a busy workload with heavy entry deallocation,
the user may observe the query with the constants in pg_stat_statements.

From what I can see, this is because the only time an entry is normalized is
during post_parse_analyze, and the entry may be deallocated by the time query
execution ends. At that point, the original form ( with constants ) of the query
is used.

It is not clear how prevalent this is in real-world workloads, but it's easily 
reproducible
on a workload with high entry deallocation. Attached are the repro steps on the 
latest
branch.

I think the only thing to do here is to call this out in docs with a suggestion 
to increase
pg_stat_statements.max to reduce the likelihood. I also attached the suggested
doc enhancement as well.

Any thoughts?

Regards,

--
Sami Imseih
Amazon Web Services


### pg_stat_statements.max is min allowed
postgres=# show pg_stat_statements.max ;
 pg_stat_statements.max
------------------------
 100
(1 row)

### create 200 tables
do $$
begin
    for i in 1 .. 200
    loop
        execute 'drop table if exists foo'||i;
        execute 'create table foo'||i||'(id int, c2 text)';
    end loop;
end; $$ ;

### create a pgbench script to insert into the tables.
for (( i=1; i<=200; i++ ))
do 
   echo "INSERT INTO foo"$i" (id, c2) values (1, 'somedata');" >> 
/tmp/pgbench.sql
done

### run pgbench
pgbench -c 50 -f /tmp/pgbench.sql -T 1200


### observe pg_stat_statements
postgres=# select query from pg_stat_statements where query like '%foo%' and 
query not like '%$1%';
                        query
-----------------------------------------------------
 INSERT INTO foo31 (id, c2) values (1, 'somedata')
 INSERT INTO foo20 (id, c2) values (1, 'somedata')
 INSERT INTO foo191 (id, c2) values (1, 'somedata')
 INSERT INTO foo170 (id, c2) values (1, 'somedata')
 INSERT INTO foo167 (id, c2) values (1, 'somedata')
 INSERT INTO foo32 (id, c2) values (1, 'somedata')
 INSERT INTO foo36 (id, c2) values (1, 'somedata')
 INSERT INTO foo43 (id, c2) values (1, 'somedata')
 INSERT INTO foo181 (id, c2) values (1, 'somedata')
 INSERT INTO foo88 (id, c2) values (1, 'somedata')
(10 rows)

postgres=# select query from pg_stat_statements where query like '%foo%' and 
query  like '%$1%' limit 5;
                   query
--------------------------------------------
 INSERT INTO foo33 (id, c2) values ($1, $2)
 INSERT INTO foo59 (id, c2) values ($1, $2)
 INSERT INTO foo50 (id, c2) values ($1, $2)
 INSERT INTO foo42 (id, c2) values ($1, $2)
 INSERT INTO foo91 (id, c2) values ($1, $2)

### observe the # of deallocations
postgres=# select * from pg_stat_statements_info ;
 dealloc |          stats_reset
---------+-------------------------------
  113371 | 2023-02-24 06:24:21.912275-06
(1 row)

Attachment: 0001-doc-update-regarding-pg_stat_statements-normalizatio.patch
Description: 0001-doc-update-regarding-pg_stat_statements-normalizatio.patch

Reply via email to