Re: Michael Paquier
> On Thu, Feb 01, 2024 at 07:37:32AM +0000, ma lz wrote:
> > session 1:
> > create temp table ttt ( a int );
> > insert into ttt values(3);  -- query_id is XXX from pg_stat_activity
> >
> > session 2:
> > create temp table ttt ( a int );
> > insert into ttt values(3);  -- query_id is YYY from pg_stat_activity
> >
> > I know temp table has different oid, so query_id is different, is
> > there a way to use table name for temp table instead of oid?
>
> The CREATE TABLE statements have indeed the same query ID (in 16~),
> and the inserts have a different one as they touch different schemas
> and relations.  That's quite an old problem, that depends on the
> RangeVar attached to an InsertStmt.  I don't quite see a way to
> directly handle that except by using a custom implementation in query
> jumbling for this node and its RangeVar, so there is no "easy" way to
> tackle that :/
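The underlying issue is that each session's temp table lives in its own
pg_temp_N schema and gets a fresh oid, so anything jumbled by relation oid
necessarily differs between sessions. Purely as an illustration (not part
of the patch below), this is easy to see from within a session:

create temp table ttt ( a int );
-- the temp table's oid and schema are session-local
select oid, relname, relnamespace::regnamespace as schema
  from pg_class
 where relnamespace = pg_my_temp_schema() and relname = 'ttt';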
A customer reported that pg_stat_statements is not useful for them because
they are seeing 160k different query ids within 6-8 hours. They also
proposed to use the temp table name for query jumbling and wrote a patch
for it, which I would also see as the obvious solution to the problem.
Here's that patch with regression tests added.

I would think changing this would be a big usability improvement for
anyone using temp tables a lot.

There does not seem to be a performance impact - all tests were run with
pg_stat_statements active:

Standard pgbench -S (-s 10):
without patch: tps = 154155.407337 (without initial connection time)
with patch:    tps = 154223.966534 (without initial connection time)

pgbench -S on temp tables where each table has just one record:
without patch: tps = 184430.801954 (without initial connection time)
with patch:    tps = 185692.602764 (without initial connection time)

Christoph
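For the record, the kind of duplication the customer was seeing can be
made visible by grouping pg_stat_statements on the normalized query text;
this is just an illustrative query, not part of the patch below:

select query, count(*) as entries, sum(calls) as total_calls
  from pg_stat_statements
 group by query
having count(*) > 1
 order by entries desc;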
>From c50dbb614f5e7696cb687aa156eb4149dcdb231d Mon Sep 17 00:00:00 2001
From: Christoph Berg <m...@debian.org>
Date: Mon, 17 Mar 2025 17:17:17 +0100
Subject: [PATCH v1] Jumble temp tables by name

Query jumbling considers everything by oid, which is fine for regular
objects. But for temp tables, which have to be recreated in each session
(or even transaction), this means that the same temp table query run from
the next session will never be aggregated in pg_stat_statements. Instead,
the statistics are polluted with a large number of 1-call entries. Fix by
using the temp table name instead.

This has the risk of aggregating structurally different temp tables
together if they share the same name, but practically, the queries will
likely differ in other details anyway. And even if not, aggregating the
entries in pg_stat_statements instead of polluting the stats seems the
better choice. (The user still has the option to simply change the name of
the temp table to have the queries separated. In the old scheme, the user
does not have any way to change the behavior.)
---
 .../pg_stat_statements/expected/select.out | 31 ++++++++++++++++
 contrib/pg_stat_statements/sql/select.sql  | 12 +++++++
 src/backend/nodes/queryjumblefuncs.c       | 35 +++++++++++++++++++
 src/include/nodes/parsenodes.h             |  2 +-
 4 files changed, 79 insertions(+), 1 deletion(-)

diff --git a/contrib/pg_stat_statements/expected/select.out b/contrib/pg_stat_statements/expected/select.out
index 37a30af034a..8a7e237298c 100644
--- a/contrib/pg_stat_statements/expected/select.out
+++ b/contrib/pg_stat_statements/expected/select.out
@@ -241,6 +241,37 @@ SELECT calls, query FROM pg_stat_statements ORDER BY query COLLATE "C";
 (12 rows)
 
 DROP TABLE pgss_a, pgss_b CASCADE;
+-- temp tables
+SELECT pg_stat_statements_reset() IS NOT NULL AS t;
+ t
+---
+ t
+(1 row)
+
+BEGIN;
+	CREATE TEMP TABLE temp_t (id int) ON COMMIT DROP;
+	SELECT * FROM temp_t;
+ id
+----
+(0 rows)
+
+COMMIT;
+BEGIN;
+	CREATE TEMP TABLE temp_t (id int) ON COMMIT DROP;
+	SELECT * FROM temp_t;
+ id
+----
+(0 rows)
+
+COMMIT;
+SELECT calls, query FROM pg_stat_statements ORDER BY query COLLATE "C"; -- one query with two calls
+ calls |                                  query
+-------+------------------------------------------------------------------------
+     2 | SELECT * FROM temp_t
+     0 | SELECT calls, query FROM pg_stat_statements ORDER BY query COLLATE "C"
+     1 | SELECT pg_stat_statements_reset() IS NOT NULL AS t
+(3 rows)
+
 --
 -- access to pg_stat_statements_info view
 --
diff --git a/contrib/pg_stat_statements/sql/select.sql b/contrib/pg_stat_statements/sql/select.sql
index e0be58d5e24..81b9d50ecec 100644
--- a/contrib/pg_stat_statements/sql/select.sql
+++ b/contrib/pg_stat_statements/sql/select.sql
@@ -90,6 +90,18 @@ SELECT calls, query FROM pg_stat_statements ORDER BY query COLLATE "C";
 
 DROP TABLE pgss_a, pgss_b CASCADE;
 
+-- temp tables
+SELECT pg_stat_statements_reset() IS NOT NULL AS t;
+BEGIN;
+	CREATE TEMP TABLE temp_t (id int) ON COMMIT DROP;
+	SELECT * FROM temp_t;
+COMMIT;
+BEGIN;
+	CREATE TEMP TABLE temp_t (id int) ON COMMIT DROP;
+	SELECT * FROM temp_t;
+COMMIT;
+SELECT calls, query FROM pg_stat_statements ORDER BY query COLLATE "C"; -- one query with two calls
+
 --
 -- access to pg_stat_statements_info view
 --
diff --git a/src/backend/nodes/queryjumblefuncs.c b/src/backend/nodes/queryjumblefuncs.c
index b103a281936..bbddc92e28c 100644
--- a/src/backend/nodes/queryjumblefuncs.c
+++ b/src/backend/nodes/queryjumblefuncs.c
@@ -32,10 +32,12 @@
  */
 #include "postgres.h"
 
+#include "catalog/namespace.h"
 #include "common/hashfn.h"
 #include "miscadmin.h"
 #include "nodes/queryjumble.h"
 #include "parser/scansup.h"
+#include "utils/lsyscache.h"
 
 #define JUMBLE_SIZE				1024	/* query serialization buffer size */
 
@@ -58,6 +60,7 @@ static void _jumbleNode(JumbleState *jstate, Node *node);
 static void _jumbleA_Const(JumbleState *jstate, Node *node);
 static void _jumbleList(JumbleState *jstate, Node *node);
 static void _jumbleVariableSetStmt(JumbleState *jstate, Node *node);
+static void _jumbleRangeTblEntry(JumbleState *jstate, Node *node);
 
 /*
  * Given a possibly multi-statement source string, confine our attention to the
@@ -377,3 +380,35 @@ _jumbleVariableSetStmt(JumbleState *jstate, Node *node)
 	JUMBLE_FIELD(is_local);
 	JUMBLE_LOCATION(location);
 }
+
+static void
+_jumbleRangeTblEntry(JumbleState *jstate, Node *node)
+{
+	char	   *rel_name = NULL;
+	RangeTblEntry *expr = (RangeTblEntry *) node;
+
+	JUMBLE_FIELD(rtekind);
+
+	/*
+	 * If this is a temp table, jumble the name instead of the table oid.
+	 */
+	if (expr->rtekind == RTE_RELATION && isAnyTempNamespace(get_rel_namespace(expr->relid)))
+	{
+		rel_name = get_rel_name(expr->relid);
+		AppendJumble(jstate, (const unsigned char *) rel_name, strlen(rel_name));
+	}
+	else
+		JUMBLE_FIELD(relid);
+
+	JUMBLE_FIELD(inh);
+	JUMBLE_NODE(tablesample);
+	JUMBLE_NODE(subquery);
+	JUMBLE_FIELD(jointype);
+	JUMBLE_NODE(functions);
+	JUMBLE_FIELD(funcordinality);
+	JUMBLE_NODE(tablefunc);
+	JUMBLE_NODE(values_lists);
+	JUMBLE_STRING(ctename);
+	JUMBLE_FIELD(ctelevelsup);
+	JUMBLE_STRING(enrname);
+}
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 23c9e3c5abf..deec95c7a26 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -1039,7 +1039,7 @@ typedef enum RTEKind
 typedef struct RangeTblEntry
 {
-	pg_node_attr(custom_read_write)
+	pg_node_attr(custom_read_write, custom_query_jumble)
 
 	NodeTag		type;
-- 
2.47.2
pg_stat_statements loaded, with patch:

+ pgbench -s 10 -i
+ pgbench -S -c 10 -T 60 --progress=5
pgbench (18devel)
starting vacuum...end.
progress: 5.0 s, 143408.4 tps, lat 0.065 ms stddev 0.052, 0 failed
progress: 10.0 s, 158556.2 tps, lat 0.060 ms stddev 0.035, 0 failed
progress: 15.0 s, 157828.6 tps, lat 0.060 ms stddev 0.036, 0 failed
progress: 20.0 s, 151674.9 tps, lat 0.062 ms stddev 0.045, 0 failed
progress: 25.0 s, 141632.7 tps, lat 0.067 ms stddev 0.040, 0 failed
progress: 30.0 s, 158796.8 tps, lat 0.060 ms stddev 0.036, 0 failed
progress: 35.0 s, 151054.6 tps, lat 0.063 ms stddev 0.046, 0 failed
progress: 40.0 s, 160659.6 tps, lat 0.059 ms stddev 0.035, 0 failed
progress: 45.0 s, 153981.3 tps, lat 0.062 ms stddev 0.037, 0 failed
progress: 50.0 s, 153105.7 tps, lat 0.062 ms stddev 0.049, 0 failed
progress: 55.0 s, 159450.2 tps, lat 0.060 ms stddev 0.035, 0 failed
progress: 60.0 s, 158979.5 tps, lat 0.060 ms stddev 0.035, 0 failed
transaction type: <builtin: select only>
scaling factor: 10
query mode: simple
number of clients: 10
number of threads: 1
maximum number of tries: 1
duration: 60 s
number of transactions actually processed: 9245654
number of failed transactions: 0 (0.000%)
latency average = 0.062 ms
latency stddev = 0.040 ms
initial connection time = 50.616 ms
tps = 154223.966534 (without initial connection time)

pg_stat_statements loaded, without patch:

+ pgbench -s 10 -i
+ pgbench -S -c 10 -T 60 --progress=5
pgbench (18devel)
starting vacuum...end.
progress: 5.0 s, 144630.1 tps, lat 0.065 ms stddev 0.047, 0 failed
progress: 10.0 s, 159638.0 tps, lat 0.060 ms stddev 0.034, 0 failed
progress: 15.0 s, 158279.1 tps, lat 0.060 ms stddev 0.035, 0 failed
progress: 20.0 s, 150433.4 tps, lat 0.063 ms stddev 0.045, 0 failed
progress: 25.0 s, 159630.2 tps, lat 0.060 ms stddev 0.035, 0 failed
progress: 30.0 s, 157365.3 tps, lat 0.061 ms stddev 0.036, 0 failed
progress: 35.0 s, 148446.3 tps, lat 0.063 ms stddev 0.045, 0 failed
progress: 40.0 s, 155013.4 tps, lat 0.061 ms stddev 0.037, 0 failed
progress: 45.0 s, 154570.3 tps, lat 0.061 ms stddev 0.037, 0 failed
progress: 50.0 s, 148595.7 tps, lat 0.063 ms stddev 0.049, 0 failed
progress: 55.0 s, 155748.6 tps, lat 0.061 ms stddev 0.035, 0 failed
progress: 60.0 s, 156308.7 tps, lat 0.061 ms stddev 0.036, 0 failed
transaction type: <builtin: select only>
scaling factor: 10
query mode: simple
number of clients: 10
number of threads: 1
maximum number of tries: 1
duration: 60 s
number of transactions actually processed: 9243315
number of failed transactions: 0 (0.000%)
latency average = 0.061 ms
latency stddev = 0.039 ms
initial connection time = 39.253 ms
tps = 154155.407337 (without initial connection time)

The above numbers do not include temp tables.
To test them, I hacked pgbench to use temp tables:

create schema templ;
alter table pgbench_accounts set schema templ;
alter table pgbench_branches set schema templ;
alter table pgbench_history set schema templ;
alter table pgbench_tellers set schema templ;

CREATE OR REPLACE FUNCTION public.login()
 RETURNS event_trigger
 LANGUAGE plpgsql
AS $function$
begin
  create temp table pgbench_accounts as select * from templ.pgbench_accounts limit 1;
  create temp table pgbench_branches as select * from templ.pgbench_branches limit 1;
  create temp table pgbench_history as select * from templ.pgbench_history limit 1;
  create temp table pgbench_tellers as select * from templ.pgbench_tellers limit 1;
end;
$function$;

create event trigger login on login execute function login();

temp tables, without patch:

+ pgbench -S -c 10 -T 60 --progress=5
pgbench (18devel)
starting vacuum...end.
progress: 5.0 s, 179600.8 tps, lat 0.050 ms stddev 0.029, 0 failed
progress: 10.0 s, 181251.3 tps, lat 0.051 ms stddev 0.032, 0 failed
progress: 15.0 s, 184579.8 tps, lat 0.050 ms stddev 0.028, 0 failed
progress: 20.0 s, 185340.0 tps, lat 0.050 ms stddev 0.027, 0 failed
progress: 25.0 s, 170102.2 tps, lat 0.055 ms stddev 0.041, 0 failed
progress: 30.0 s, 184247.0 tps, lat 0.051 ms stddev 0.028, 0 failed
progress: 35.0 s, 187768.6 tps, lat 0.050 ms stddev 0.027, 0 failed
progress: 40.0 s, 185517.6 tps, lat 0.050 ms stddev 0.034, 0 failed
progress: 45.0 s, 186179.5 tps, lat 0.050 ms stddev 0.027, 0 failed
progress: 50.0 s, 188005.0 tps, lat 0.049 ms stddev 0.027, 0 failed
progress: 55.0 s, 184854.4 tps, lat 0.050 ms stddev 0.034, 0 failed
progress: 60.0 s, 190576.0 tps, lat 0.049 ms stddev 0.027, 0 failed
transaction type: <builtin: select only>
scaling factor: 1
query mode: simple
number of clients: 10
number of threads: 1
maximum number of tries: 1
duration: 60 s
number of transactions actually processed: 11040125
number of failed transactions: 0 (0.000%)
latency average = 0.050 ms
latency stddev = 0.030 ms
initial connection time = 139.621 ms
tps = 184430.801954 (without initial connection time)

Contrary to any "normal" workload, this setup still aggregates each of the
10 parallel sessions' temp table statements within its session, so we get
one entry per session rather than one per temp table instance:

             query_id │   calls │ query
 2249050047423342439 │ 1181385 │ SELECT abalance FROM pgbench_accounts WHERE aid = $1
 6235824319853410599 │ 1166204 │ SELECT abalance FROM pgbench_accounts WHERE aid = $1
 1699131257591049879 │ 1147717 │ SELECT abalance FROM pgbench_accounts WHERE aid = $1
-3332048971696272963 │ 1130036 │ SELECT abalance FROM pgbench_accounts WHERE aid = $1
-6043564579131403500 │ 1105202 │ SELECT abalance FROM pgbench_accounts WHERE aid = $1
 6195010893428714792 │ 1095123 │ SELECT abalance FROM pgbench_accounts WHERE aid = $1
-6431540675815678490 │ 1077330 │ SELECT abalance FROM pgbench_accounts WHERE aid = $1
 5106269170593406241 │ 1065739 │ SELECT abalance FROM pgbench_accounts WHERE aid = $1
-8943016353683653166 │ 1037755 │ SELECT abalance FROM pgbench_accounts WHERE aid = $1
-5518914812029529229 │ 1033634 │ SELECT abalance FROM pgbench_accounts WHERE aid = $1

with patch:

+ pgbench -S -c 10 -T 60 --progress=5
pgbench (18devel)
starting vacuum...end.
progress: 5.0 s, 181903.0 tps, lat 0.050 ms stddev 0.030, 0 failed
progress: 10.0 s, 169692.8 tps, lat 0.054 ms stddev 0.039, 0 failed
progress: 15.0 s, 189353.0 tps, lat 0.049 ms stddev 0.027, 0 failed
progress: 20.0 s, 191183.2 tps, lat 0.049 ms stddev 0.027, 0 failed
progress: 25.0 s, 184952.4 tps, lat 0.051 ms stddev 0.040, 0 failed
progress: 30.0 s, 187466.2 tps, lat 0.050 ms stddev 0.028, 0 failed
progress: 35.0 s, 190897.7 tps, lat 0.049 ms stddev 0.028, 0 failed
progress: 40.0 s, 183873.0 tps, lat 0.051 ms stddev 0.044, 0 failed
progress: 45.0 s, 187722.6 tps, lat 0.050 ms stddev 0.028, 0 failed
progress: 50.0 s, 190621.6 tps, lat 0.049 ms stddev 0.028, 0 failed
progress: 55.0 s, 184417.7 tps, lat 0.051 ms stddev 0.046, 0 failed
progress: 60.0 s, 181640.5 tps, lat 0.051 ms stddev 0.030, 0 failed
transaction type: <builtin: select only>
scaling factor: 1
query mode: simple
number of clients: 10
number of threads: 1
maximum number of tries: 1
duration: 60 s
number of transactions actually processed: 11118630
number of failed transactions: 0 (0.000%)
latency average = 0.050 ms
latency stddev = 0.034 ms
initial connection time = 123.604 ms
tps = 185692.602764 (without initial connection time)

Now there is only one record per statement across all sessions:

-8778269477398418330 │ 11118630 │ SELECT abalance FROM pgbench_accounts WHERE aid = $1
 8104237783378095602 │       12 │ create temp table pgbench_history as select * from templ.pgbench_history limit $1
 -108440455751470927 │       12 │ create temp table pgbench_branches as select * from templ.pgbench_branches limit $1
-4266677110910498212 │       12 │ create temp table pgbench_accounts as select * from templ.pgbench_accounts limit $1
 2367030003062126666 │       12 │ create temp table pgbench_tellers as select * from templ.pgbench_tellers limit $1

Christoph