Hello psql-hackers!
TL:DR;
We extended the functionality of pg_stat_statements so it can track
worst and best case execution plans.
Based on a suggestion of my colleague Arne Scheffer, Marius Timmer and I
extended pg_stat_statements so it can also record execution plans,
whenever the execution time is exceeded (or deceeded) by a definable
factor.
We were largely inspired by the pg_stat_plans extension by Peter
Geoghegan and Simon Riggs - we don't claim any originality on this part
- which is unfortunately not available on newer postgresql versions.
There are a few differences which will become apparent in the following
lines.
By default, the modified pg_stat_statements extension will now track
good plans and bad plans for each entry in pg_stat_statements.
The plans are not normalized or hashed (as opposed to pg_stat_plans),
they represent discreet statements.
A good plan is saved, whenever this sort of query has been used for the
first time or the time of the previously recorded good plan has been
deceeded by a smaller factor than 0.9 .
Analogous to this, a bad_plan is saved, when the time has been exceeded
by a factor greater than 1.1 .
There are GUCs available so these parameters can be tuned to your
liking. Tracking can be disabled for both plans individually.
A plan_format can be defined to enable better readability or
processability through other tools.
You can reset your good and bad plans by using a
select on pg_stat_statements_good_plan_reset([queryid]);
resetting bad plans uses pg_stat_statements_bad_plan_reset, obviously.
In case of a reset, the execution time, timestamp and plan itself are
just set to 0 respective NULL.
The pg_stat_statements view now provides six extra columns:
good_plan, good_plan_time, good_plan_timestamp, bad_plan, bad_plan_time
and bad_plan_timestamp.
Plans are only displayed if the showtext argument is true and the user
is the superuser or the user who has been associated with that entry.
Furthermore, we implemented a GUC that allows you to control the maximum
refresh frequency to avoid performance impacts on restarts or resets.
A plan is only updated when tracking is enabled and more time than
"plan_min_interval" has passed (default: 5 seconds) and the previously
mentioned conditions for the execution time have been met.
The major selling point of this feature?
Beeing able to find plans that need optimization (e.g. by creating
indexes). As pg_stat_statements tracks normalized queries, there might
be certain values or even daytimes that result in very bad plans, while
others result in perfectly fine plans.
Of course, the GUC log_min_duration_statement can also detect long
runners, but the advantage of pg_stat_statements is that we count the
total calls of normalized queries, which enables us to find plans, that
don't count as long runners, while their aggregated time might show
shortcomings regarding their plans.
We've found this sort of tool really useful when dealing with queries
produced by ORM libraries, where optimization is not intuitive.
Various tests using pg_bench suggest that this extension does not worsen
the performance of the database.
We're really looking forward to your opinions and feedback on this
feature patch
Julian, Marius and Arne
diff --git a/contrib/pg_stat_statements/Makefile b/contrib/pg_stat_statements/Makefile
index 298951a..2a22eb5 100644
--- a/contrib/pg_stat_statements/Makefile
+++ b/contrib/pg_stat_statements/Makefile
@@ -4,7 +4,7 @@ MODULE_big = pg_stat_statements
OBJS = pg_stat_statements.o $(WIN32RES)
EXTENSION = pg_stat_statements
-DATA = pg_stat_statements--1.4.sql pg_stat_statements--1.3--1.4.sql \
+DATA = pg_stat_statements--1.5.sql pg_stat_statements--1.4--1.5.sql pg_stat_statements--1.3--1.4.sql \
pg_stat_statements--1.2--1.3.sql pg_stat_statements--1.1--1.2.sql \
pg_stat_statements--1.0--1.1.sql pg_stat_statements--unpackaged--1.0.sql
PGFILEDESC = "pg_stat_statements - execution statistics of SQL statements"
diff --git a/contrib/pg_stat_statements/pg_stat_statements.c b/contrib/pg_stat_statements/pg_stat_statements.c
index 6edc3d9..a3cfe6d 100644
--- a/contrib/pg_stat_statements/pg_stat_statements.c
+++ b/contrib/pg_stat_statements/pg_stat_statements.c
@@ -61,7 +61,9 @@
#include <sys/stat.h>
#include <unistd.h>
+#include "utils/timestamp.h"
#include "access/hash.h"
+#include "commands/explain.h"
#include "executor/instrument.h"
#include "funcapi.h"
#include "mb/pg_wchar.h"
@@ -118,7 +120,8 @@ typedef enum pgssVersion
PGSS_V1_0 = 0,
PGSS_V1_1,
PGSS_V1_2,
- PGSS_V1_3
+ PGSS_V1_3,
+ PGSS_V1_5
} pgssVersion;
/*
@@ -159,6 +162,14 @@ typedef struct Counters
double usage; /* usage factor */
} Counters;
+typedef struct pgssPlan
+{
+ Size offset;
+ int len;
+ double time; /* execution time in msec when the latest plan was updated */
+ TimestampTz timestamp;
+} pgssPlan;
+
/*
* Statistics per statement
*
@@ -172,6 +183,8 @@ typedef struct pgssEntry
Counters counters; /* the statistics for this query */
Size query_offset; /* query text offset in external file */
int query_len; /* # of valid bytes in query string, or -1 */
+ pgssPlan good_plan;
+ pgssPlan bad_plan;
int encoding; /* query text encoding */
slock_t mutex; /* protects the counters only */
} pgssEntry;
@@ -183,7 +196,7 @@ typedef struct pgssSharedState
{
LWLock *lock; /* protects hashtable search/modification */
double cur_median_usage; /* current median usage in hashtable */
- Size mean_query_len; /* current mean entry text length */
+ Size mean_text_len; /* current mean entry text length */
slock_t mutex; /* protects following fields only: */
Size extent; /* current extent of query file */
int n_writers; /* number of active writers to query file */
@@ -256,10 +269,25 @@ static const struct config_enum_entry track_options[] =
{NULL, 0, false}
};
-static int pgss_max; /* max # statements to track */
-static int pgss_track; /* tracking level */
-static bool pgss_track_utility; /* whether to track utility commands */
-static bool pgss_save; /* whether to save stats across shutdown */
+static const struct config_enum_entry format_options[] =
+{
+ {"text", EXPLAIN_FORMAT_TEXT, false},
+ {"xml", EXPLAIN_FORMAT_XML, false},
+ {"json", EXPLAIN_FORMAT_JSON, false},
+ {"yaml", EXPLAIN_FORMAT_YAML, false},
+ {NULL, 0, false}
+};
+
+static int pgss_max; /* max # statements to track */
+static int pgss_track; /* tracking level */
+static bool pgss_track_utility; /* whether to track utility commands */
+static bool pgss_save; /* whether to save stats across shutdown */
+static bool pgss_good_plan_enable; /* whether to save good_plans */
+static bool pgss_bad_plan_enable; /* whether to save good_plans */
+static double pgss_good_plan_delta; /* min derivation to save a new good_plan */
+static double pgss_bad_plan_delta; /* min derivation to save a new bad_plan */
+static int pgss_plan_min_interval; /* minimum time in seconds that must pass so a new plan could be saved */
+static int pgss_plan_format; /* id which sets the output format */
#define pgss_enabled() \
@@ -282,7 +310,11 @@ void _PG_fini(void);
PG_FUNCTION_INFO_V1(pg_stat_statements_reset);
PG_FUNCTION_INFO_V1(pg_stat_statements_1_2);
PG_FUNCTION_INFO_V1(pg_stat_statements_1_3);
+PG_FUNCTION_INFO_V1(pg_stat_statements_1_5);
PG_FUNCTION_INFO_V1(pg_stat_statements);
+PG_FUNCTION_INFO_V1(pg_stat_statements_good_plan_reset);
+PG_FUNCTION_INFO_V1(pg_stat_statements_bad_plan_reset);
+
static void pgss_shmem_startup(void);
static void pgss_shmem_shutdown(int code, Datum arg);
@@ -301,12 +333,17 @@ static int pgss_match_fn(const void *key1, const void *key2, Size keysize);
static uint32 pgss_hash_string(const char *str, int len);
static void pgss_store(const char *query, uint32 queryId,
int query_location, int query_len,
+ const char *plan,
double total_time, uint64 rows,
const BufferUsage *bufusage,
pgssJumbleState *jstate);
+Datum pg_stat_statements_plan_reset(uint32 query_id, uint8 plan_type);
static void pg_stat_statements_internal(FunctionCallInfo fcinfo,
pgssVersion api_version,
bool showtext);
+static int fill_plan_times(Datum values[], bool nulls[], int i, pgssPlan *plan);
+static int fill_plan_str(Datum values[], bool nulls[], int i, pgssPlan *plan,
+ pgssEntry *entry, char *qbuffer, Size qbuffer_size);
static Size pgss_memsize(void);
static pgssEntry *entry_alloc(pgssHashKey *key, Size query_offset, int query_len,
int encoding, bool sticky);
@@ -331,7 +368,6 @@ static void fill_in_constant_lengths(pgssJumbleState *jstate, const char *query,
int query_loc);
static int comp_location(const void *a, const void *b);
-
/*
* Module load callback
*/
@@ -399,6 +435,80 @@ _PG_init(void)
NULL,
NULL);
+ DefineCustomBoolVariable("pg_stat_statements.good_plan_enable",
+ "Enable bad plan detection",
+ NULL,
+ &pgss_good_plan_enable,
+ true,
+ PGC_SUSET,
+ 0,
+ NULL,
+ NULL,
+ NULL);
+
+ DefineCustomBoolVariable("pg_stat_statements.bad_plan_enable",
+ "Enable bad plan detection",
+ NULL,
+ &pgss_bad_plan_enable,
+ true,
+ PGC_SUSET,
+ 0,
+ NULL,
+ NULL,
+ NULL);
+
+ DefineCustomRealVariable("pg_stat_statements.good_plan_delta",
+ "If execution time deceeds last good_plan_time by a factor smaller than this, the plan will be stored.",
+ NULL,
+ &pgss_good_plan_delta,
+ 0.9,
+ 0.0,
+ 1.0,
+ PGC_SUSET,
+ 0,
+ NULL,
+ NULL,
+ NULL);
+
+ DefineCustomRealVariable("pg_stat_statements.bad_plan_delta",
+ "If execution time exceeds last bad_plan_time by a factor greater than this, the plan will be stored.",
+ NULL,
+ &pgss_bad_plan_delta,
+ 1.1,
+ 1.01,
+ 10.0,
+ PGC_SUSET,
+ 0,
+ NULL,
+ NULL,
+ NULL);
+
+ DefineCustomIntVariable("pg_stat_statements.plan_min_interval",
+ "Sets the time in seconds that has to pass before a new plan could be saved.",
+ NULL,
+ &pgss_plan_min_interval,
+ 5,
+ 0,
+ INT_MAX,
+ PGC_POSTMASTER,
+ 0,
+ NULL,
+ NULL,
+ NULL);
+
+ DefineCustomEnumVariable("pg_stat_statements.plan_format",
+ "Sets the output format for the plans.",
+ "Notice that the plan format can not be changed after a plan is saved. Valid values are text, json, xml and yaml.",
+ &pgss_plan_format,
+ EXPLAIN_FORMAT_TEXT,
+ format_options,
+ PGC_SUSET,
+ 0,
+ NULL,
+ NULL,
+ NULL);
+
+
EmitWarningsOnPlaceholders("pg_stat_statements");
/*
@@ -462,7 +572,11 @@ pgss_shmem_startup(void)
int32 pgver;
int32 i;
int buffer_size;
- char *buffer = NULL;
+ int good_plan_buffer_size;
+ int bad_plan_buffer_size;
+ char *query_buffer = NULL;
+ char *good_plan_buffer = NULL;
+ char *bad_plan_buffer = NULL;
if (prev_shmem_startup_hook)
prev_shmem_startup_hook();
@@ -485,7 +599,7 @@ pgss_shmem_startup(void)
/* First time through ... */
pgss->lock = &(GetNamedLWLockTranche("pg_stat_statements"))->lock;
pgss->cur_median_usage = ASSUMED_MEDIAN_INIT;
- pgss->mean_query_len = ASSUMED_LENGTH_INIT;
+ pgss->mean_text_len = ASSUMED_LENGTH_INIT;
SpinLockInit(&pgss->mutex);
pgss->extent = 0;
pgss->n_writers = 0;
@@ -555,7 +669,11 @@ pgss_shmem_startup(void)
}
buffer_size = 2048;
- buffer = (char *) palloc(buffer_size);
+ good_plan_buffer_size = 2048;
+ bad_plan_buffer_size = 2048;
+ query_buffer = (char *) palloc(buffer_size);
+ good_plan_buffer = (char *) palloc(good_plan_buffer_size);
+ bad_plan_buffer = (char *) palloc(bad_plan_buffer_size);
if (fread(&header, sizeof(uint32), 1, file) != 1 ||
fread(&pgver, sizeof(uint32), 1, file) != 1 ||
@@ -583,14 +701,42 @@ pgss_shmem_startup(void)
if (temp.query_len >= buffer_size)
{
buffer_size = Max(buffer_size * 2, temp.query_len + 1);
- buffer = repalloc(buffer, buffer_size);
+ query_buffer = repalloc(query_buffer, buffer_size);
}
- if (fread(buffer, 1, temp.query_len + 1, file) != temp.query_len + 1)
+ if (fread(query_buffer, 1, temp.query_len + 1, file)
+ != temp.query_len + 1)
goto read_error;
+ /* Resize good plan_buffer and read into it */
+ if (temp.good_plan.len >= good_plan_buffer_size)
+ {
+ good_plan_buffer_size =
+ Max(good_plan_buffer_size * 2, temp.good_plan.len + 1);
+ good_plan_buffer= repalloc(good_plan_buffer, good_plan_buffer_size);
+ }
+
+ if (fread(good_plan_buffer, 1, temp.good_plan.len + 1, file)
+ != temp.good_plan.len + 1)
+ goto read_error;
+
+ /* Resize bad plan_buffer and read into it */
+ if (temp.bad_plan.len >= bad_plan_buffer_size)
+ {
+ bad_plan_buffer_size =
+ Max(bad_plan_buffer_size * 2, temp.bad_plan.len + 1);
+ bad_plan_buffer= repalloc(bad_plan_buffer, bad_plan_buffer_size);
+ }
+
+ if (fread(bad_plan_buffer, 1, temp.bad_plan.len + 1, file)
+ != temp.bad_plan.len + 1)
+ goto read_error;
+
+
/* Should have a trailing null, but let's make sure */
- buffer[temp.query_len] = '\0';
+ query_buffer[temp.query_len] = '\0';
+ good_plan_buffer[temp.good_plan.len] = '\0';
+ bad_plan_buffer[temp.bad_plan.len] = '\0';
/* Skip loading "sticky" entries */
if (temp.counters.calls == 0)
@@ -598,10 +744,25 @@ pgss_shmem_startup(void)
/* Store the query text */
query_offset = pgss->extent;
- if (fwrite(buffer, 1, temp.query_len + 1, qfile) != temp.query_len + 1)
+ if (fwrite(query_buffer, 1, temp.query_len + 1, qfile)
+ != temp.query_len + 1)
goto write_error;
pgss->extent += temp.query_len + 1;
+ /* Store the good plan text*/
+ temp.good_plan.offset = pgss->extent;
+ if (fwrite(good_plan_buffer, 1, temp.good_plan.len + 1, qfile)
+ != temp.good_plan.len + 1)
+ goto write_error;
+ pgss->extent += temp.good_plan.len + 1;
+
+ /* Store the bad plan text*/
+ temp.bad_plan.offset = pgss->extent;
+ if (fwrite(bad_plan_buffer, 1, temp.bad_plan.len + 1, qfile)
+ != temp.bad_plan.len + 1)
+ goto write_error;
+ pgss->extent += temp.bad_plan.len + 1;
+
/* make the hashtable entry (discards old entries if too many) */
entry = entry_alloc(&temp.key, query_offset, temp.query_len,
temp.encoding,
@@ -609,9 +770,15 @@ pgss_shmem_startup(void)
/* copy in the actual stats */
entry->counters = temp.counters;
+ /* copy in the plans */
+ entry->good_plan = temp.good_plan;
+ entry->bad_plan = temp.bad_plan;
+
}
- pfree(buffer);
+ pfree(query_buffer);
+ pfree(good_plan_buffer);
+ pfree(bad_plan_buffer);
FreeFile(file);
FreeFile(qfile);
@@ -650,8 +817,12 @@ write_error:
errmsg("could not write pg_stat_statement file \"%s\": %m",
PGSS_TEXT_FILE)));
fail:
- if (buffer)
- pfree(buffer);
+ if (query_buffer)
+ pfree(query_buffer);
+ if (good_plan_buffer)
+ pfree(good_plan_buffer);
+ if (bad_plan_buffer)
+ pfree(bad_plan_buffer);
if (file)
FreeFile(file);
if (qfile)
@@ -716,15 +887,52 @@ pgss_shmem_shutdown(int code, Datum arg)
hash_seq_init(&hash_seq, pgss_hash);
while ((entry = hash_seq_search(&hash_seq)) != NULL)
{
- int len = entry->query_len;
- char *qstr = qtext_fetch(entry->query_offset, len,
+ int query_len = entry->query_len;
+ int good_plan_len = entry->good_plan.len;
+ int bad_plan_len = entry->bad_plan.len;
+ char *query_string = qtext_fetch(entry->query_offset, query_len,
qbuffer, qbuffer_size);
+ char *good_plan_str;
+ char *bad_plan_str;
+ if (good_plan_len > 0)
+ {
+ /* A good plan is available */
+ good_plan_str = qtext_fetch(entry->good_plan.offset, good_plan_len,
+ qbuffer, qbuffer_size);
+ } else {
+ /*
+ * There is no good plan available. This can be caused by a query
+ * which has no plan (for example insert or delete queries) or
+ * because good plans are disabled. In this case we have to store
+ * an empty string instead of null.
+ */
+ good_plan_str = palloc(1 * sizeof(char));
+ *good_plan_str = '\0';
+ }
+ if (bad_plan_len > 0)
+ {
+ /* A bad plan is available */
+ bad_plan_str = qtext_fetch(entry->bad_plan.offset, bad_plan_len,
+ qbuffer, qbuffer_size);
+ } else {
+ /*
+ * There is no bad plan available. This can be caused by a query
+ * which has no plan (for example insert or delete queries) or
+ * because bad plans are disabled. In this case we have to store
+ * an empty string instead of null.
+ */
+ bad_plan_str = palloc(1 * sizeof(char));
+ *bad_plan_str = '\0';
+ }
- if (qstr == NULL)
+ if (query_string == NULL)
continue; /* Ignore any entries with bogus texts */
+ /* Write entrys, querys and plans serialized to the dump file */
if (fwrite(entry, sizeof(pgssEntry), 1, file) != 1 ||
- fwrite(qstr, 1, len + 1, file) != len + 1)
+ fwrite(query_string, 1, query_len + 1, file) != query_len + 1 ||
+ fwrite(good_plan_str, 1, good_plan_len + 1, file) != good_plan_len + 1 ||
+ fwrite(bad_plan_str, 1, bad_plan_len + 1, file) != bad_plan_len + 1)
{
/* note: we assume hash_seq_term won't change errno */
hash_seq_term(&hash_seq);
@@ -827,6 +1035,7 @@ pgss_post_parse_analyze(ParseState *pstate, Query *query)
query->queryId,
query->stmt_location,
query->stmt_len,
+ NULL,
0,
0,
NULL,
@@ -923,16 +1132,40 @@ pgss_ExecutorEnd(QueryDesc *queryDesc)
if (queryId != 0 && queryDesc->totaltime && pgss_enabled())
{
+ ExplainState *es = NewExplainState();
/*
* Make sure stats accumulation is done. (Note: it's okay if several
* levels of hook all do this.)
*/
InstrEndLoop(queryDesc->totaltime);
+ es->analyze = (queryDesc->instrument_options && false);
+ es->verbose = false;
+ es->buffers = (es->analyze && false);
+ es->timing = (es->analyze && true);
+ es->format = pgss_plan_format;
+
+ ExplainBeginOutput(es);
+ ExplainQueryText(es, queryDesc);
+ ExplainPrintPlan(es, queryDesc);
+ ExplainEndOutput(es);
+
+ /* Remove last line break */
+ if (es->str->len > 0 && es->str->data[es->str->len - 1] == '\n')
+ es->str->data[--es->str->len] = '\0';
+
+ /* Fix JSON to output an object */
+ if (pgss_plan_format == EXPLAIN_FORMAT_JSON)
+ {
+ es->str->data[0] = '{';
+ es->str->data[es->str->len - 1] = '}';
+ }
+
pgss_store(queryDesc->sourceText,
queryId,
queryDesc->plannedstmt->stmt_location,
queryDesc->plannedstmt->stmt_len,
+ es->str->data, /* plan */
queryDesc->totaltime->total * 1000.0, /* convert to msec */
queryDesc->estate->es_processed,
&queryDesc->totaltime->bufusage,
@@ -1043,6 +1276,7 @@ pgss_ProcessUtility(PlannedStmt *pstmt, const char *queryString,
0, /* signal that it's a utility stmt */
pstmt->stmt_location,
pstmt->stmt_len,
+ NULL,
INSTR_TIME_GET_MILLISEC(duration),
rows,
&bufusage,
@@ -1102,6 +1336,28 @@ pgss_hash_string(const char *str, int len)
return hash_any((const unsigned char *) str, len);
}
+static void
+update_plan(volatile pgssPlan* pgssPlan_ptr, const char* plan_str, int plan_len,
+ double total_time)
+{
+ if (plan_len > 0)
+ {
+ long secs;
+ int msecs;
+ TimestampDifference(pgssPlan_ptr->timestamp, GetCurrentTimestamp(), &secs, &msecs);
+ if(secs > pgss_plan_min_interval)
+ {
+ Size plan_offset;
+ qtext_store(plan_str, plan_len, &plan_offset, NULL);
+ pgssPlan_ptr->offset = plan_offset;
+ pgssPlan_ptr->len = plan_len;
+ pgssPlan_ptr->time = total_time;
+ pgssPlan_ptr->timestamp = GetCurrentTimestamp();
+ }
+ }
+ return;
+}
+
/*
* Store some statistics for a statement.
*
@@ -1115,6 +1371,7 @@ pgss_hash_string(const char *str, int len)
static void
pgss_store(const char *query, uint32 queryId,
int query_location, int query_len,
+ const char *plan,
double total_time, uint64 rows,
const BufferUsage *bufusage,
pgssJumbleState *jstate)
@@ -1123,6 +1380,7 @@ pgss_store(const char *query, uint32 queryId,
pgssEntry *entry;
char *norm_query = NULL;
int encoding = GetDatabaseEncoding();
+ int plan_len = 0;
Assert(query != NULL);
@@ -1153,6 +1411,9 @@ pgss_store(const char *query, uint32 queryId,
query_len = strlen(query);
}
+ if(plan)
+ plan_len = strlen(plan);
+
/*
* Discard leading and trailing whitespace, too. Use scanner_isspace()
* not libc's isspace(), because we want to match the lexer's behavior.
@@ -1206,7 +1467,6 @@ pgss_store(const char *query, uint32 queryId,
/* Append new query text to file with only shared lock held */
stored = qtext_store(norm_query ? norm_query : query, query_len,
&query_offset, &gc_count);
-
/*
* Determine whether we need to garbage collect external query texts
* while the shared lock is still held. This micro-optimization
@@ -1233,7 +1493,12 @@ pgss_store(const char *query, uint32 queryId,
if (!stored)
goto done;
- /* OK to create a new hashtable entry */
+ /*
+ * OK to create a new hashtable entry.
+ * We have to set the good- and the bad planlength to zero. Otherwise the
+ * pgss_shmem_startup method will expect this number of chars which crashes
+ * if there is no plan yet.
+ */
entry = entry_alloc(&key, query_offset, query_len, encoding,
jstate != NULL);
@@ -1253,6 +1518,21 @@ pgss_store(const char *query, uint32 queryId,
SpinLockAcquire(&e->mutex);
+ /*
+ * In case of a fast or initial plan we store a new good plan. We
+ * are doing the same procedur for bad plans. So we store a empty
+ * query, one empty and one full plan in the text file. After a
+ * restart the new generated textfile will be "defragmented".
+ */
+ if (pgss_good_plan_enable &&
+ ((total_time < (e->good_plan.time * pgss_good_plan_delta)) ||
+ e->good_plan.time == 0))
+ update_plan(&e->good_plan, plan, plan_len, total_time);
+ if (pgss_bad_plan_enable &&
+ ((total_time > (e->bad_plan.time * pgss_bad_plan_delta)) ||
+ e->bad_plan.time == 0))
+ update_plan(&e->bad_plan, plan, plan_len, total_time);
+
/* "Unstick" entry if it was previously sticky */
if (e->counters.calls == 0)
e->counters.usage = USAGE_INIT;
@@ -1324,12 +1604,89 @@ pg_stat_statements_reset(PG_FUNCTION_ARGS)
PG_RETURN_VOID();
}
+Datum
+pg_stat_statements_plan_reset(uint32 query_id, uint8 plan_type)
+{
+ pgssEntry *entry;
+
+ pgssHashKey key;
+
+ if (!pgss || !pgss_hash)
+ ereport(ERROR,
+ (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("pg_stat_statements must be loaded via shared_preload_libraries")));
+
+ /* Set up key for hashtable search */
+ key.userid = GetUserId();
+ key.dbid = MyDatabaseId;
+ key.queryid = query_id;
+
+ /* Lookup the hash table entry with shared lock. */
+ LWLockAcquire(pgss->lock, LW_SHARED);
+
+ entry = (pgssEntry *) hash_search(pgss_hash, &key, HASH_FIND, NULL);
+
+ if (entry)
+ {
+ if(superuser() || entry->key.userid == key.userid)
+ {
+ pgssPlan *pgssPlan_ptr;
+ switch (plan_type)
+ {
+ case 1:
+ pgssPlan_ptr = &entry->good_plan;
+ break;
+ case 2:
+ pgssPlan_ptr = &entry->bad_plan;
+ break;
+ default:
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg_internal("unrecognized plan type: %d",
+ plan_type)));
+ break;
+ }
+ if(pgssPlan_ptr)
+ {
+ pgssPlan_ptr->offset = -1;
+ pgssPlan_ptr->len = 0;
+ pgssPlan_ptr->time = 0;
+ pgssPlan_ptr->timestamp = 0;
+ }
+ }else{
+ elog(ERROR, "insufficient permission to reset plan");
+ }
+ }
+ LWLockRelease(pgss->lock);
+ PG_RETURN_VOID();
+}
+
+/*
+ * Reset the good_plan of the entry with queryid.
+ */
+ Datum
+ pg_stat_statements_good_plan_reset(PG_FUNCTION_ARGS)
+ {
+ return pg_stat_statements_plan_reset(PG_GETARG_INT64(0), 1);
+ }
+
+/*
+ * Reset the bad_plan of the entry with queryid.
+ */
+ Datum
+ pg_stat_statements_bad_plan_reset(PG_FUNCTION_ARGS)
+ {
+ return pg_stat_statements_plan_reset(PG_GETARG_INT64(0), 2);
+
+ }
+
/* Number of output arguments (columns) for various API versions */
#define PG_STAT_STATEMENTS_COLS_V1_0 14
#define PG_STAT_STATEMENTS_COLS_V1_1 18
#define PG_STAT_STATEMENTS_COLS_V1_2 19
#define PG_STAT_STATEMENTS_COLS_V1_3 23
-#define PG_STAT_STATEMENTS_COLS 23 /* maximum of above */
+#define PG_STAT_STATEMENTS_COLS_V1_5 29
+#define PG_STAT_STATEMENTS_COLS 29 /* maximum of above */
/*
* Retrieve statement statistics.
@@ -1342,6 +1699,16 @@ pg_stat_statements_reset(PG_FUNCTION_ARGS)
* function. Unfortunately we weren't bright enough to do that for 1.1.
*/
Datum
+pg_stat_statements_1_5(PG_FUNCTION_ARGS)
+{
+ bool showtext = PG_GETARG_BOOL(0);
+
+ pg_stat_statements_internal(fcinfo, PGSS_V1_5, showtext);
+
+ return (Datum) 0;
+}
+
+Datum
pg_stat_statements_1_3(PG_FUNCTION_ARGS)
{
bool showtext = PG_GETARG_BOOL(0);
@@ -1444,6 +1811,10 @@ pg_stat_statements_internal(FunctionCallInfo fcinfo,
if (api_version != PGSS_V1_3)
elog(ERROR, "incorrect number of output arguments");
break;
+ case PG_STAT_STATEMENTS_COLS_V1_5:
+ if (api_version != PGSS_V1_5)
+ elog(ERROR, "incorrect number of output arguments");
+ break;
default:
elog(ERROR, "incorrect number of output arguments");
}
@@ -1527,6 +1898,8 @@ pg_stat_statements_internal(FunctionCallInfo fcinfo,
bool nulls[PG_STAT_STATEMENTS_COLS];
int i = 0;
Counters tmp;
+ pgssPlan tmp_good_plan;
+ pgssPlan tmp_bad_plan;
double stddev;
int64 queryid = entry->key.queryid;
@@ -1543,7 +1916,8 @@ pg_stat_statements_internal(FunctionCallInfo fcinfo,
if (showtext)
{
- char *qstr = qtext_fetch(entry->query_offset,
+ char *qstr = qtext_fetch
+ (entry->query_offset,
entry->query_len,
qbuffer,
qbuffer_size);
@@ -1566,11 +1940,29 @@ pg_stat_statements_internal(FunctionCallInfo fcinfo,
/* Just return a null if we fail to find the text */
nulls[i++] = true;
}
+
+ if (api_version >= PGSS_V1_5)
+ {
+ /*
+ * This version requires to print out the good
+ * and the bad plan if they are enabled.
+ */
+
+ i = fill_plan_str(values, nulls, i, &entry->good_plan,
+ entry, qbuffer, qbuffer_size);
+ i = fill_plan_str(values, nulls, i, &entry->bad_plan,
+ entry, qbuffer, qbuffer_size);
+ }
}
else
{
- /* Query text not requested */
- nulls[i++] = true;
+ /*
+ * Query text not requested
+ */
+ for (int j = 0; j < (api_version >= PGSS_V1_5 ? 3 : 1) ; j++)
+ {
+ nulls[i++] = true;
+ }
}
}
else
@@ -1595,6 +1987,8 @@ pg_stat_statements_internal(FunctionCallInfo fcinfo,
SpinLockAcquire(&e->mutex);
tmp = e->counters;
+ tmp_good_plan = e->good_plan;
+ tmp_bad_plan = e->bad_plan;
SpinLockRelease(&e->mutex);
}
@@ -1622,6 +2016,14 @@ pg_stat_statements_internal(FunctionCallInfo fcinfo,
stddev = 0.0;
values[i++] = Float8GetDatumFast(stddev);
}
+
+ if (api_version >= PGSS_V1_5)
+ {
+ /* Set plan times and timestamps (to avoid code duplication this iterator has been added*/
+ i = fill_plan_times(values, nulls, i, &tmp_good_plan);
+ i = fill_plan_times(values, nulls, i, &tmp_bad_plan);
+ }
+
values[i++] = Int64GetDatumFast(tmp.rows);
values[i++] = Int64GetDatumFast(tmp.shared_blks_hit);
values[i++] = Int64GetDatumFast(tmp.shared_blks_read);
@@ -1645,6 +2047,7 @@ pg_stat_statements_internal(FunctionCallInfo fcinfo,
api_version == PGSS_V1_1 ? PG_STAT_STATEMENTS_COLS_V1_1 :
api_version == PGSS_V1_2 ? PG_STAT_STATEMENTS_COLS_V1_2 :
api_version == PGSS_V1_3 ? PG_STAT_STATEMENTS_COLS_V1_3 :
+ api_version == PGSS_V1_5 ? PG_STAT_STATEMENTS_COLS_V1_5 :
-1 /* fail if you forget to update this assert */ ));
tuplestore_putvalues(tupstore, tupdesc, values, nulls);
@@ -1659,6 +2062,53 @@ pg_stat_statements_internal(FunctionCallInfo fcinfo,
tuplestore_donestoring(tupstore);
}
+/* This is only a helper-function to pg_stat_statements_internal */
+static int
+fill_plan_times(Datum values[], bool nulls[], int i, pgssPlan *plan)
+{
+ values[i++] = Float8GetDatumFast(plan->time);
+
+ /* if there is noting in the timestamp field, we are not interested in it */
+ if (plan->timestamp)
+ values[i++] = TimestampTzGetDatum(plan->timestamp);
+ else
+ nulls[i++] = true;
+ return i;
+}
+
+static int
+fill_plan_str(Datum values[], bool nulls[], int i, pgssPlan *plan,
+ pgssEntry *entry, char *qbuffer, Size qbuffer_size)
+{
+ if (plan && plan->len > 0)
+ {
+ char *pstr = qtext_fetch(plan->offset,
+ plan->len,
+ qbuffer,
+ qbuffer_size);
+ if (pstr)
+ {
+ char *enc;
+ enc = pg_any_to_server(pstr,
+ plan->len,
+ entry->encoding);
+ values[i++] = CStringGetTextDatum(enc);
+
+ if (enc != pstr)
+ pfree(enc);
+ } else {
+ /* failed to get the string of the plan */
+ nulls[i++] = true;
+ }
+ } else {
+ /* no plan available or plan_len not greater than 0 */
+ nulls[i++] = true;
+ }
+ return i;
+}
+
+
+
/*
* Estimate shared memory space needed.
*/
@@ -1800,9 +2250,9 @@ entry_dealloc(void)
pgss->cur_median_usage = entries[i / 2]->counters.usage;
/* Record the mean query length */
if (nvalidtexts > 0)
- pgss->mean_query_len = tottextlen / nvalidtexts;
+ pgss->mean_text_len = tottextlen / nvalidtexts;
else
- pgss->mean_query_len = ASSUMED_LENGTH_INIT;
+ pgss->mean_text_len = ASSUMED_LENGTH_INIT;
/* Now zap an appropriate fraction of lowest-usage entries */
nvictims = Max(10, i * USAGE_DEALLOC_PERCENT / 100);
@@ -2030,8 +2480,8 @@ need_gc_qtexts(void)
SpinLockRelease(&s->mutex);
}
- /* Don't proceed if file does not exceed 512 bytes per possible entry */
- if (extent < 512 * pgss_max)
+ /* Don't proceed if file does not exceed 2048 bytes per possible entry */
+ if (extent < 2048 * pgss_max)
return false;
/*
@@ -2041,7 +2491,7 @@ need_gc_qtexts(void)
* query length in order to prevent garbage collection from thrashing
* uselessly.
*/
- if (extent < pgss->mean_query_len * pgss_max * 2)
+ if (extent < pgss->mean_text_len * pgss_max * 2)
return false;
return true;
@@ -2121,11 +2571,29 @@ gc_qtexts(void)
qbuffer,
qbuffer_size);
- if (qry == NULL)
+ int good_plan_len = entry->good_plan.len;
+ char *good_plan = qtext_fetch(entry->good_plan.offset,
+ good_plan_len,
+ qbuffer,
+ qbuffer_size);
+
+ int bad_plan_len = entry->bad_plan.len;
+ char *bad_plan = qtext_fetch(entry->bad_plan.offset,
+ bad_plan_len,
+ qbuffer,
+ qbuffer_size);
+ /* an "empty" string */
+ char empty_str[1] = "\0";
+
+ if (qry == NULL || good_plan == NULL || bad_plan == NULL)
{
- /* Trouble ... drop the text */
+ /* Trouble occured; Drop the query and the two plan texts */
entry->query_offset = 0;
entry->query_len = -1;
+ entry->good_plan.offset = 0;
+ entry->good_plan.len = -1;
+ entry->bad_plan.offset = 0;
+ entry->bad_plan.len = -1;
/* entry will not be counted in mean query length computation */
continue;
}
@@ -2142,6 +2610,43 @@ gc_qtexts(void)
entry->query_offset = extent;
extent += query_len + 1;
+
+ if(good_plan_len <= 0)
+ {
+ good_plan = empty_str;
+ good_plan_len = 0;
+ }
+ /* Save the good plan */
+ if (fwrite(good_plan, 1, good_plan_len + 1, qfile) != good_plan_len + 1)
+ {
+ ereport(LOG,
+ (errcode_for_file_access(),
+ errmsg("could not write best plan pg_stat_statement file \"%s\": %m",
+ PGSS_TEXT_FILE)));
+ hash_seq_term(&hash_seq);
+ goto gc_fail;
+ }
+ entry->good_plan.offset = extent;
+ extent += good_plan_len + 1;
+
+ if(bad_plan_len <= 0)
+ {
+ bad_plan = empty_str;
+ bad_plan_len = 0;
+ }
+ /* Save the bad plan */
+ if (fwrite(bad_plan, 1, bad_plan_len + 1, qfile) != bad_plan_len + 1)
+ {
+ ereport(LOG,
+ (errcode_for_file_access(),
+ errmsg("could not write worst plan pg_stat_statement file \"%s\": %m",
+ PGSS_TEXT_FILE)));
+ hash_seq_term(&hash_seq);
+ goto gc_fail;
+ }
+ entry->bad_plan.offset = extent;
+ extent += bad_plan_len + 1;
+
nentries++;
}
@@ -2176,9 +2681,9 @@ gc_qtexts(void)
* won't still think we have a problem.
*/
if (nentries > 0)
- pgss->mean_query_len = extent / nentries;
+ pgss->mean_text_len = extent / nentries;
else
- pgss->mean_query_len = ASSUMED_LENGTH_INIT;
+ pgss->mean_text_len = ASSUMED_LENGTH_INIT;
free(qbuffer);
@@ -2228,7 +2733,7 @@ gc_fail:
pgss->extent = 0;
/* Reset mean_query_len to match the new state */
- pgss->mean_query_len = ASSUMED_LENGTH_INIT;
+ pgss->mean_text_len = ASSUMED_LENGTH_INIT;
/*
* Bump the GC count even though we failed.
diff --git a/contrib/pg_stat_statements/pg_stat_statements.control b/contrib/pg_stat_statements/pg_stat_statements.control
index 24038f5..193fcdf 100644
--- a/contrib/pg_stat_statements/pg_stat_statements.control
+++ b/contrib/pg_stat_statements/pg_stat_statements.control
@@ -1,5 +1,5 @@
# pg_stat_statements extension
comment = 'track execution statistics of all SQL statements executed'
-default_version = '1.4'
+default_version = '1.5'
module_pathname = '$libdir/pg_stat_statements'
relocatable = true
diff --git a/doc/src/sgml/pgstatstatements.sgml b/doc/src/sgml/pgstatstatements.sgml
index d4f09fc..6207045 100644
--- a/doc/src/sgml/pgstatstatements.sgml
+++ b/doc/src/sgml/pgstatstatements.sgml
@@ -82,6 +82,20 @@
<entry>Text of a representative statement</entry>
</row>
+ <row>
+ <entry><structfield>good_plan</structfield></entry>
+ <entry><type>text</type></entry>
+ <entry></entry>
+ <entry>The execution plan of the statement with the shortest execution time, in milliseconds</entry>
+ </row>
+
+ <row>
+ <entry><structfield>bad_plan</structfield></entry>
+ <entry><type>text</type></entry>
+ <entry></entry>
+ <entry>The execution plan of the statement with the longest execution time, in milliseconds</entry>
+ </row>
+
<row>
<entry><structfield>calls</structfield></entry>
<entry><type>bigint</type></entry>
@@ -124,6 +138,34 @@
<entry>Population standard deviation of time spent in the statement, in milliseconds</entry>
</row>
+ <row>
+ <entry><structfield>good_plan_time</structfield></entry>
+ <entry><type>double precision</type></entry>
+ <entry></entry>
+ <entry>The execution time belonging to the good_plan, in milliseconds</entry>
+ </row>
+
+ <row>
+ <entry><structfield>good_plan_timestamp</structfield></entry>
+ <entry><type>timestamp</type></entry>
+ <entry></entry>
+ <entry>Timestamp (with timezone) belonging to the good_plan</entry>
+ </row>
+
+ <row>
+ <entry><structfield>bad_plan_time</structfield></entry>
+ <entry><type>double precision</type></entry>
+ <entry></entry>
+ <entry>The execution time belonging to the bad_plan, in milliseconds</entry>
+ </row>
+
+ <row>
+ <entry><structfield>bad_plan_timestamp</structfield></entry>
+ <entry><type>timestamp</type></entry>
+ <entry></entry>
+ <entry>Timestamp (with timezone) belonging to the bad_plan</entry>
+ </row>
+
<row>
<entry><structfield>rows</structfield></entry>
<entry><type>bigint</type></entry>
@@ -339,6 +381,40 @@
<varlistentry>
<term>
+ <function>pg_stat_statements_good_plan_reset(queryid bigint) returns void</function>
+ <indexterm>
+ <primary>pg_stat_statements_good_plan_reset</primary>
+ </indexterm>
+ </term>
+
+ <listitem>
+ <para>
+ resets <parameter>good_plan, good_plan_time</parameter> and <parameter>
+ good_plan_timestamp</parameter> of the entry matching
+ <parameter>queryid</parameter>.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>
+ <function>pg_stat_statements_bad_plan_reset(queryid bigint) returns void</function>
+ <indexterm>
+ <primary>pg_stat_statements_bad_plan_reset</primary>
+ </indexterm>
+ </term>
+
+ <listitem>
+ <para>
+ resets <parameter>bad_plan, bad_plan_time</parameter> and <parameter>
+ bad_plan_timestamp</parameter> of the entry matching
+ <parameter>queryid</parameter>.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>
<function>pg_stat_statements(showtext boolean) returns setof record</function>
<indexterm>
<primary>pg_stat_statements</primary>
@@ -444,6 +520,96 @@
</para>
</listitem>
</varlistentry>
+
+ <varlistentry>
+ <term>
+ <varname>pg_stat_statements.good_plan_enable</varname> (<type>boolean</type>)
+ </term>
+
+ <listitem>
+ <para>
+ <varname>pg_stat_statements.good_plan_enable</varname> specifies if good plans will be saved.
+ The default value is <literal>on</>.
+ This parameter can only be set in the <filename>postgresql.conf</>
+ file or on the server command line.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>
+ <varname>pg_stat_statements.bad_plan_enable</varname> (<type>boolean</type>)
+ </term>
+
+ <listitem>
+ <para>
+ <varname>pg_stat_statements.bad_plan_enable</varname> specifies if bad plans will be saved.
+ The default value is <literal>on</>.
+ This parameter can only be set in the <filename>postgresql.conf</>
+ file or on the server command line.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>
+ <varname>pg_stat_statements.good_plan_delta</varname> (<type>real</type>)
+ </term>
+
+ <listitem>
+ <para>
+ If the execution time deceeds the previous good_plan_time by a factor smaller than <varname>pg_stat_statements.good_plan_delta</varname>, the current plan will be stored.
+ The default value is <literal>0.9</>. Must be within <literal>0.0</> (never save new plans) and <literal>1.0</>.
+ This parameter can only be set in the <filename>postgresql.conf</>
+ file or on the server command line.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>
+ <varname>pg_stat_statements.bad_plan_delta</varname> (<type>real</type>)
+ </term>
+
+ <listitem>
+ <para>
+ If the execution time exceeds the previous bad_plan_time by a factor larger than <varname>pg_stat_statements.bad_plan_delta</varname>, the current plan will be stored.
+ The default value is <literal>1.1</>. Must be within <literal>1.01</>and <literal>10.0</>.
+ This parameter can only be set in the <filename>postgresql.conf</>
+ file or on the server command line.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>
+ <varname>pg_stat_statements.plan_min_interval</varname> (<type>integer</type>)
+ </term>
+
+ <listitem>
+ <para>
+ <varname>pg_stat_statements.plan_min_interval</varname> is the amount of seconds that must pass before a newer plan can be stored.
+ The default value is <literal>5</>.
+ This parameter can only be set in the <filename>postgresql.conf</>
+ file or on the server command line.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>
+ <varname>pg_stat_statements.plan_format</varname> (<type>enum</type>)
+ </term>
+
+ <listitem>
+ <para>
+ <varname>pg_stat_statements.bad_plan_delta</varname> specifies the format which is used when saving plans.
+ The default value is <literal>text</>. Can be <literal>xml</>, <literal>json</> and <literal>yaml</>.
+ This parameter can only be set in the <filename>postgresql.conf</>
+ file or on the server command line.
+ </para>
+ </listitem>
+ </varlistentry>
</variablelist>
<para>
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers