On Mon, Mar 23, 2020 at 11:24:50PM +0900, Fujii Masao wrote:
>
> > Here are the comments for 0001 patch.
> >
> > + /*
> > + * Report a full page image constructed for the WAL record
> > + */
> > + pgWalUsage.wal_fp_records++;
> >
> > Isn't it better to use "fpw" or "fpi" for the variable name rather than
> > "fp" here? In other places, "fpw" and "fpi" are used for full page
> > writes/image.
Agreed, I went with fpw.
> > ISTM that this counter could be incorrect if XLogInsertRecord() determines
> > to
> > calculate again whether FPI is necessary or not. No? IOW, this issue could
> > happen if XLogInsert() calls XLogRecordAssemble() multiple times in
> > its do-while loop. Isn't this problematic?
Yes probably. I also see while adding support for EXPLAIN/auto_explain that
the previous approach was incrementing both records and fpw_records, while it
should be only one of those for each record. I fixed this using the approach I
previously mentionned in [1] which seems to work just fine.
> > + long wal_bytes; /* size of wal records produced */
> >
> > Isn't it safer to use uint64 (i.e., XLogRecPtr) as the type of this variable
> > rather than long?
Yes indeed. I switched to uint64, and modified everything accordingly (and
changed pgss to output numeric as there's no other way to handle unsigned int8)
> > + shm_toc_insert(pcxt->toc, PARALLEL_KEY_WAL_USAGE, bufusage_space);
> >
> > bufusage_space should be walusage_space here?
Good catch, fixed.
> > /*
> > * Finish parallel execution. We wait for parallel workers to finish, and
> > * accumulate their buffer usage.
> > */
> >
> > There are some comments mentioning buffer usage, in execParallel.c.
> > For example, the top comment for ExecParallelFinish(), as the above.
> > These should be updated.
I went through all the file and quickly checked in other places, and I think I
fixed all required comments.
> Here are the comments for 0002 patch.
>
> + OUT wal_write_bytes int8,
> + OUT wal_write_records int8,
> + OUT wal_write_fp_records int8
>
> Isn't "write" part in the column names confusing because it's WAL
> *generated* (not written) by the statement?
Agreed, I simply dropped the "_write" part everywhere.
> +RETURNS SETOF record
> +AS 'MODULE_PATHNAME', 'pg_stat_statements_1_4'
> +LANGUAGE C STRICT VOLATILE;
>
> PARALLEL SAFE should be specified?
Indeed, fixed.
> +/* contrib/pg_stat_statements/pg_stat_statements--1.7--1.8.sql */
>
> ISTM it's good timing to have also pg_stat_statements--1.8.sql since
> the definition of pg_stat_statements() is changed. Thought?
As mentionned in other pgss thread, I think the general agreement is to never
provide full script anymore, so I didn't changed that.
> +-- CHECKPOINT before WAL tests to ensure test stability
> +CHECKPOINT;
>
> Is this true? I thought you added this because the number of FPI
> should be larger than zero in the subsequent test. No? But there
> seems no such test. I'm not excited about adding the test checking
> the number of FPI because it looks fragile, though...
It should ensure a FPW for each new block touch, but yes that's quite fragile.
Since I fixed the record / FPW record counters, I saw that this was actually
already broken as there was a mix of FPW and non-FPW, so I dropped the
checkpoint and just tested (wal_record + wal_fpw_record) instead.
> +UPDATE pgss_test SET b = '333' WHERE a = 3 \;
> +UPDATE pgss_test SET b = '444' WHERE a = 4 ;
>
> Could you tell me why several queries need to be run to test
> the WAL usage? Isn't running a few query enough for the test purpase?
As far as I can see it's used to test multiple scenario (single command /
multiple commands in or outside explicit transaction). It shouldn't add a lot
of overhead and since some commands are issues with "\;" it's also testing
proper query string isolation when multi-command query string is provided,
which doesn't seem like a bad idea. I didn't changed that but I'm not opposed
to remove some of the updates if needed.
Also, to answer Amit Kapila's comments about WAL records and parallel query, I
added support for both EXPLAIN and auto_explain (tab completion and
documentation are also updated), and using a simple table with an index, with
forced parallelism and no leader participation and concurrent update on the
same table, I could test that WAL usage is working as expected:
rjuju=# explain (analyze, wal, verbose) select * from t1;
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------
Gather (cost=0.00..8805.05 rows=100010 width=14) (actual time=8.695..47.592
rows=100010 loops=1)
Output: id, val
Workers Planned: 2
Workers Launched: 2
WAL: records=204 bytes=86198
-> Parallel Seq Scan on public.t1 (cost=0.00..8805.05 rows=50005 width=14)
(actual time=0.056..29.112 rows=50005 loops
Output: id, val
WAL: records=204 bytes=86198
Worker 0: actual time=0.060..28.995 rows=49593 loops=1
WAL: records=105 bytes=44222
Worker 1: actual time=0.052..29.230 rows=50417 loops=1
WAL: records=99 bytes=41976
Planning Time: 0.038 ms
Execution Time: 53.957 ms
(14 rows)
and the same query when nothing end up being modified:
rjuju=# explain (analyze, wal, verbose) select * from t1;
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------
Gather (cost=0.00..8805.05 rows=100010 width=14) (actual time=9.413..48.187
rows=100010 loops=1)
Output: id, val
Workers Planned: 2
Workers Launched: 2
-> Parallel Seq Scan on public.t1 (cost=0.00..8805.05 rows=50005 width=14)
(actual time=0.033..24.697 rows=50005 loops
Output: id, val
Worker 0: actual time=0.028..24.786 rows=50447 loops=1
Worker 1: actual time=0.038..24.609 rows=49563 loops=1
Planning Time: 0.282 ms
Execution Time: 55.643 ms
(10 rows)
So it seems to me that WAL usage infrastructure for parallel query is working
just fine. I added the EXPLAIN/auto_explain in a separate commit just in case.
[1]
https://www.postgresql.org/message-id/CAOBaU_aECK1Z7Nn+x=mhvewrjzk8wypsptwaafjqtzn1fyj...@mail.gmail.com
>From 9e90e1a41cb1cbeaa99528a9b7be75f9bf9294c4 Mon Sep 17 00:00:00 2001
From: Kirill Bychik <[email protected]>
Date: Tue, 17 Mar 2020 14:41:50 +0100
Subject: [PATCH v7 1/4] Add infrastructure to track WAL usage.
Author: Kirill Bychik, Julien Rouhaud
Reviewed-by: Fuji Masao
Discussion:
https://postgr.es/m/CAB-hujrP8ZfUkvL5OYETipQwA=e3n7oqHFU=4zlxws_cza3...@mail.gmail.com
---
src/backend/access/transam/xlog.c | 11 ++++++
src/backend/access/transam/xloginsert.c | 1 +
src/backend/executor/execParallel.c | 38 ++++++++++++++-----
src/backend/executor/instrument.c | 50 ++++++++++++++++++++++---
src/include/executor/execParallel.h | 1 +
src/include/executor/instrument.h | 19 +++++++++-
src/tools/pgindent/typedefs.list | 1 +
7 files changed, 103 insertions(+), 18 deletions(-)
diff --git a/src/backend/access/transam/xlog.c
b/src/backend/access/transam/xlog.c
index 8fe92962b0..f270b4a0e5 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -42,6 +42,7 @@
#include "commands/progress.h"
#include "commands/tablespace.h"
#include "common/controldata_utils.h"
+#include "executor/instrument.h"
#include "miscadmin.h"
#include "pg_trace.h"
#include "pgstat.h"
@@ -1249,6 +1250,16 @@ XLogInsertRecord(XLogRecData *rdata,
ProcLastRecPtr = StartPos;
XactLastRecEnd = EndPos;
+ /* Provide WAL update data to the instrumentation */
+ if (inserted)
+ {
+ pgWalUsage.wal_bytes += rechdr->xl_tot_len;
+ if (doPageWrites && fpw_lsn <= RedoRecPtr)
+ pgWalUsage.wal_fpw_records++;
+ else
+ pgWalUsage.wal_records++;
+ }
+
return EndPos;
}
diff --git a/src/backend/access/transam/xloginsert.c
b/src/backend/access/transam/xloginsert.c
index a618dec776..bb2290d636 100644
--- a/src/backend/access/transam/xloginsert.c
+++ b/src/backend/access/transam/xloginsert.c
@@ -25,6 +25,7 @@
#include "access/xloginsert.h"
#include "catalog/pg_control.h"
#include "common/pg_lzcompress.h"
+#include "executor/instrument.h"
#include "miscadmin.h"
#include "pg_trace.h"
#include "replication/origin.h"
diff --git a/src/backend/executor/execParallel.c
b/src/backend/executor/execParallel.c
index a753d6efa0..7d9ca66fc8 100644
--- a/src/backend/executor/execParallel.c
+++ b/src/backend/executor/execParallel.c
@@ -12,7 +12,7 @@
* workers and ensuring that their state generally matches that of the
* leader; see src/backend/access/transam/README.parallel for details.
* However, we must save and restore relevant executor state, such as
- * any ParamListInfo associated with the query, buffer usage info, and
+ * any ParamListInfo associated with the query, buffer/WAL usage info, and
* the actual plan to be passed down to the worker.
*
* IDENTIFICATION
@@ -62,6 +62,7 @@
#define PARALLEL_KEY_DSA
UINT64CONST(0xE000000000000007)
#define PARALLEL_KEY_QUERY_TEXT UINT64CONST(0xE000000000000008)
#define PARALLEL_KEY_JIT_INSTRUMENTATION UINT64CONST(0xE000000000000009)
+#define PARALLEL_KEY_WAL_USAGE UINT64CONST(0xE00000000000000A)
#define PARALLEL_TUPLE_QUEUE_SIZE 65536
@@ -573,6 +574,7 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
char *pstmt_space;
char *paramlistinfo_space;
BufferUsage *bufusage_space;
+ WalUsage *walusage_space;
SharedExecutorInstrumentation *instrumentation = NULL;
SharedJitInstrumentation *jit_instrumentation = NULL;
int pstmt_len;
@@ -646,6 +648,13 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
mul_size(sizeof(BufferUsage), pcxt->nworkers));
shm_toc_estimate_keys(&pcxt->estimator, 1);
+ /*
+ * Same thing for WalUsage.
+ */
+ shm_toc_estimate_chunk(&pcxt->estimator,
+ mul_size(sizeof(WalUsage),
pcxt->nworkers));
+ shm_toc_estimate_keys(&pcxt->estimator, 1);
+
/* Estimate space for tuple queues. */
shm_toc_estimate_chunk(&pcxt->estimator,
mul_size(PARALLEL_TUPLE_QUEUE_SIZE, pcxt->nworkers));
@@ -728,6 +737,12 @@ ExecInitParallelPlan(PlanState *planstate, EState *estate,
shm_toc_insert(pcxt->toc, PARALLEL_KEY_BUFFER_USAGE, bufusage_space);
pei->buffer_usage = bufusage_space;
+ /* Same for WalUsage. */
+ walusage_space = shm_toc_allocate(pcxt->toc,
+
mul_size(sizeof(WalUsage), pcxt->nworkers));
+ shm_toc_insert(pcxt->toc, PARALLEL_KEY_WAL_USAGE, walusage_space);
+ pei->wal_usage = walusage_space;
+
/* Set up the tuple queues that the workers will write into. */
pei->tqueue = ExecParallelSetupTupleQueues(pcxt, false);
@@ -1069,7 +1084,7 @@ ExecParallelRetrieveJitInstrumentation(PlanState
*planstate,
/*
* Finish parallel execution. We wait for parallel workers to finish, and
- * accumulate their buffer usage.
+ * accumulate their buffer/WAL usage.
*/
void
ExecParallelFinish(ParallelExecutorInfo *pei)
@@ -1109,11 +1124,11 @@ ExecParallelFinish(ParallelExecutorInfo *pei)
WaitForParallelWorkersToFinish(pei->pcxt);
/*
- * Next, accumulate buffer usage. (This must wait for the workers to
- * finish, or we might get incomplete data.)
+ * Next, accumulate buffer/WAL usage. (This must wait for the workers
+ * to finish, or we might get incomplete data.)
*/
for (i = 0; i < nworkers; i++)
- InstrAccumParallelQuery(&pei->buffer_usage[i]);
+ InstrAccumParallelQuery(&pei->buffer_usage[i],
&pei->wal_usage[i]);
pei->finished = true;
}
@@ -1333,6 +1348,7 @@ ParallelQueryMain(dsm_segment *seg, shm_toc *toc)
{
FixedParallelExecutorState *fpes;
BufferUsage *buffer_usage;
+ WalUsage *wal_usage;
DestReceiver *receiver;
QueryDesc *queryDesc;
SharedExecutorInstrumentation *instrumentation;
@@ -1386,11 +1402,11 @@ ParallelQueryMain(dsm_segment *seg, shm_toc *toc)
ExecSetTupleBound(fpes->tuples_needed, queryDesc->planstate);
/*
- * Prepare to track buffer usage during query execution.
+ * Prepare to track buffer/WAL usage during query execution.
*
* We do this after starting up the executor to match what happens in
the
- * leader, which also doesn't count buffer accesses that occur during
- * executor startup.
+ * leader, which also doesn't count buffer accesses and WAL activity
that
+ * occur during executor startup.
*/
InstrStartParallelQuery();
@@ -1406,9 +1422,11 @@ ParallelQueryMain(dsm_segment *seg, shm_toc *toc)
/* Shut down the executor */
ExecutorFinish(queryDesc);
- /* Report buffer usage during parallel execution. */
+ /* Report buffer/WAL usage during parallel execution. */
buffer_usage = shm_toc_lookup(toc, PARALLEL_KEY_BUFFER_USAGE, false);
- InstrEndParallelQuery(&buffer_usage[ParallelWorkerNumber]);
+ wal_usage = shm_toc_lookup (toc, PARALLEL_KEY_WAL_USAGE, false);
+ InstrEndParallelQuery(&buffer_usage[ParallelWorkerNumber],
+
&wal_usage[ParallelWorkerNumber]);
/* Report instrumentation data if any instrumentation options are set.
*/
if (instrumentation != NULL)
diff --git a/src/backend/executor/instrument.c
b/src/backend/executor/instrument.c
index bc1d42bf64..d2515b0a4c 100644
--- a/src/backend/executor/instrument.c
+++ b/src/backend/executor/instrument.c
@@ -24,6 +24,10 @@ static void BufferUsageAdd(BufferUsage *dst, const
BufferUsage *add);
static void BufferUsageAccumDiff(BufferUsage *dst,
const
BufferUsage *add, const BufferUsage *sub);
+WalUsage pgWalUsage;
+static WalUsage save_pgWalUsage;
+
+static void WalUsageAdd(WalUsage *dst, WalUsage *add);
/* Allocate new instrumentation structure(s) */
Instrumentation *
@@ -33,15 +37,17 @@ InstrAlloc(int n, int instrument_options)
/* initialize all fields to zeroes, then modify as needed */
instr = palloc0(n * sizeof(Instrumentation));
- if (instrument_options & (INSTRUMENT_BUFFERS | INSTRUMENT_TIMER))
+ if (instrument_options & (INSTRUMENT_BUFFERS | INSTRUMENT_TIMER |
INSTRUMENT_WAL))
{
bool need_buffers = (instrument_options &
INSTRUMENT_BUFFERS) != 0;
+ bool need_wal = (instrument_options &
INSTRUMENT_WAL) != 0;
bool need_timer = (instrument_options &
INSTRUMENT_TIMER) != 0;
int i;
for (i = 0; i < n; i++)
{
instr[i].need_bufusage = need_buffers;
+ instr[i].need_walusage = need_wal;
instr[i].need_timer = need_timer;
}
}
@@ -55,6 +61,7 @@ InstrInit(Instrumentation *instr, int instrument_options)
{
memset(instr, 0, sizeof(Instrumentation));
instr->need_bufusage = (instrument_options & INSTRUMENT_BUFFERS) != 0;
+ instr->need_walusage = (instrument_options & INSTRUMENT_WAL) != 0;
instr->need_timer = (instrument_options & INSTRUMENT_TIMER) != 0;
}
@@ -69,6 +76,9 @@ InstrStartNode(Instrumentation *instr)
/* save buffer usage totals at node entry, if needed */
if (instr->need_bufusage)
instr->bufusage_start = pgBufferUsage;
+
+ if (instr->need_walusage)
+ instr->walusage_start = pgWalUsage;
}
/* Exit from a plan node */
@@ -97,6 +107,10 @@ InstrStopNode(Instrumentation *instr, double nTuples)
BufferUsageAccumDiff(&instr->bufusage,
&pgBufferUsage,
&instr->bufusage_start);
+ if (instr->need_walusage)
+ WalUsageAccumDiff(&instr->walusage,
+ &pgWalUsage,
&instr->walusage_start);
+
/* Is this the first tuple of this cycle? */
if (!instr->running)
{
@@ -160,6 +174,9 @@ InstrAggNode(Instrumentation *dst, Instrumentation *add)
/* Add delta of buffer usage since entry to node's totals */
if (dst->need_bufusage)
BufferUsageAdd(&dst->bufusage, &add->bufusage);
+
+ if (dst->need_walusage)
+ WalUsageAdd(&dst->walusage, &add->walusage);
}
/* note current values during parallel executor startup */
@@ -167,21 +184,25 @@ void
InstrStartParallelQuery(void)
{
save_pgBufferUsage = pgBufferUsage;
+ save_pgWalUsage = pgWalUsage;
}
/* report usage after parallel executor shutdown */
void
-InstrEndParallelQuery(BufferUsage *result)
+InstrEndParallelQuery(BufferUsage *bufusage, WalUsage *walusage)
{
- memset(result, 0, sizeof(BufferUsage));
- BufferUsageAccumDiff(result, &pgBufferUsage, &save_pgBufferUsage);
+ memset(bufusage, 0, sizeof(BufferUsage));
+ memset(walusage, 0, sizeof(WalUsage));
+ BufferUsageAccumDiff(bufusage, &pgBufferUsage, &save_pgBufferUsage);
+ WalUsageAccumDiff(walusage, &pgWalUsage, &save_pgWalUsage);
}
/* accumulate work done by workers in leader's stats */
void
-InstrAccumParallelQuery(BufferUsage *result)
+InstrAccumParallelQuery(BufferUsage *bufusage, WalUsage *walusage)
{
- BufferUsageAdd(&pgBufferUsage, result);
+ BufferUsageAdd(&pgBufferUsage, bufusage);
+ WalUsageAdd(&pgWalUsage, walusage);
}
/* dst += add */
@@ -223,3 +244,20 @@ BufferUsageAccumDiff(BufferUsage *dst,
INSTR_TIME_ACCUM_DIFF(dst->blk_write_time,
add->blk_write_time,
sub->blk_write_time);
}
+
+/* helper functions for WAL usage accumulation */
+static void
+WalUsageAdd(WalUsage *dst, WalUsage *add)
+{
+ dst->wal_bytes += add->wal_bytes;
+ dst->wal_records += add->wal_records;
+ dst->wal_fpw_records += add->wal_fpw_records;
+}
+
+void
+WalUsageAccumDiff(WalUsage *dst, const WalUsage *add, const WalUsage *sub)
+{
+ dst->wal_bytes += add->wal_bytes - sub->wal_bytes;
+ dst->wal_records += add->wal_records - sub->wal_records;
+ dst->wal_fpw_records += add->wal_fpw_records - sub->wal_fpw_records;
+}
diff --git a/src/include/executor/execParallel.h
b/src/include/executor/execParallel.h
index 17d07cf020..1cc5b524fd 100644
--- a/src/include/executor/execParallel.h
+++ b/src/include/executor/execParallel.h
@@ -26,6 +26,7 @@ typedef struct ParallelExecutorInfo
PlanState *planstate; /* plan subtree we're running in
parallel */
ParallelContext *pcxt; /* parallel context we're using */
BufferUsage *buffer_usage; /* points to bufusage area in DSM */
+ WalUsage *wal_usage; /* walusage area in DSM */
SharedExecutorInstrumentation *instrumentation; /* optional */
struct SharedJitInstrumentation *jit_instrumentation; /* optional */
dsa_area *area; /* points to DSA area in DSM */
diff --git a/src/include/executor/instrument.h
b/src/include/executor/instrument.h
index f48d46aede..23264dd396 100644
--- a/src/include/executor/instrument.h
+++ b/src/include/executor/instrument.h
@@ -32,12 +32,21 @@ typedef struct BufferUsage
instr_time blk_write_time; /* time spent writing */
} BufferUsage;
+typedef struct WalUsage
+{
+ long wal_records; /* # of WAL records produced */
+ long wal_fpw_records; /* # of full page write WAL
records
+ *
produced */
+ uint64 wal_bytes; /* size of WAL records produced
*/
+} WalUsage;
+
/* Flag bits included in InstrAlloc's instrument_options bitmask */
typedef enum InstrumentOption
{
INSTRUMENT_TIMER = 1 << 0, /* needs timer (and row counts) */
INSTRUMENT_BUFFERS = 1 << 1, /* needs buffer usage */
INSTRUMENT_ROWS = 1 << 2, /* needs row count */
+ INSTRUMENT_WAL = 1 << 3, /* needs WAL usage */
INSTRUMENT_ALL = PG_INT32_MAX
} InstrumentOption;
@@ -46,6 +55,7 @@ typedef struct Instrumentation
/* Parameters set at node creation: */
bool need_timer; /* true if we need timer data */
bool need_bufusage; /* true if we need buffer usage data */
+ bool need_walusage; /* true if we need WAL usage data */
/* Info about current plan cycle: */
bool running; /* true if we've completed
first tuple */
instr_time starttime; /* start time of current
iteration of node */
@@ -53,6 +63,7 @@ typedef struct Instrumentation
double firsttuple; /* time for first tuple of this
cycle */
double tuplecount; /* # of tuples emitted so far
this cycle */
BufferUsage bufusage_start; /* buffer usage at start */
+ WalUsage walusage_start; /* WAL usage at start */
/* Accumulated statistics across all completed cycles: */
double startup; /* total startup time (in
seconds) */
double total; /* total time (in seconds) */
@@ -62,6 +73,7 @@ typedef struct Instrumentation
double nfiltered1; /* # of tuples removed by
scanqual or joinqual */
double nfiltered2; /* # of tuples removed by
"other" quals */
BufferUsage bufusage; /* total buffer usage */
+ WalUsage walusage; /* total WAL usage */
} Instrumentation;
typedef struct WorkerInstrumentation
@@ -71,6 +83,7 @@ typedef struct WorkerInstrumentation
} WorkerInstrumentation;
extern PGDLLIMPORT BufferUsage pgBufferUsage;
+extern PGDLLIMPORT WalUsage pgWalUsage;
extern Instrumentation *InstrAlloc(int n, int instrument_options);
extern void InstrInit(Instrumentation *instr, int instrument_options);
@@ -79,7 +92,9 @@ extern void InstrStopNode(Instrumentation *instr, double
nTuples);
extern void InstrEndLoop(Instrumentation *instr);
extern void InstrAggNode(Instrumentation *dst, Instrumentation *add);
extern void InstrStartParallelQuery(void);
-extern void InstrEndParallelQuery(BufferUsage *result);
-extern void InstrAccumParallelQuery(BufferUsage *result);
+extern void InstrEndParallelQuery(BufferUsage *bufusage, WalUsage *walusage);
+extern void InstrAccumParallelQuery(BufferUsage *bufusage, WalUsage *walusage);
+extern void WalUsageAccumDiff(WalUsage *dst, const WalUsage *add,
+ const WalUsage *sub);
#endif /* INSTRUMENT_H */
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index ca2d9ec8fb..1eef542f06 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2635,6 +2635,7 @@ WalSndCtlData
WalSndSendDataCallback
WalSndState
WalTimeSample
+WalUsage
WalWriteMethod
Walfile
WindowAgg
--
2.20.1
>From 20be7cc55954f1e5095fca7143a61ae6b036c41e Mon Sep 17 00:00:00 2001
From: Julien Rouhaud <[email protected]>
Date: Sun, 29 Mar 2020 12:38:14 +0200
Subject: [PATCH v7 2/4] Add option to report WAL usage in EXPLAIN and
auto_explain.
Author: Julien Rouhaud
Reviewed-by:
Discussion:
https://postgr.es/m/CAB-hujrP8ZfUkvL5OYETipQwA=e3n7oqHFU=4zlxws_cza3...@mail.gmail.com
---
contrib/auto_explain/auto_explain.c | 15 ++++++
doc/src/sgml/auto-explain.sgml | 20 ++++++++
doc/src/sgml/ref/explain.sgml | 14 ++++++
src/backend/commands/explain.c | 74 +++++++++++++++++++++++++++--
src/bin/psql/tab-complete.c | 4 +-
src/include/commands/explain.h | 3 ++
6 files changed, 124 insertions(+), 6 deletions(-)
diff --git a/contrib/auto_explain/auto_explain.c
b/contrib/auto_explain/auto_explain.c
index f69dde876c..56c549d84c 100644
--- a/contrib/auto_explain/auto_explain.c
+++ b/contrib/auto_explain/auto_explain.c
@@ -27,6 +27,7 @@ static int auto_explain_log_min_duration = -1; /* msec or
-1 */
static bool auto_explain_log_analyze = false;
static bool auto_explain_log_verbose = false;
static bool auto_explain_log_buffers = false;
+static bool auto_explain_log_wal = false;
static bool auto_explain_log_triggers = false;
static bool auto_explain_log_timing = true;
static bool auto_explain_log_settings = false;
@@ -148,6 +149,17 @@ _PG_init(void)
NULL,
NULL);
+ DefineCustomBoolVariable("auto_explain.log_wal",
+ "Log WAL usage.",
+ NULL,
+ &auto_explain_log_wal,
+ false,
+ PGC_SUSET,
+ 0,
+ NULL,
+ NULL,
+ NULL);
+
DefineCustomBoolVariable("auto_explain.log_triggers",
"Include trigger
statistics in plans.",
"This has no effect
unless log_analyze is also set.",
@@ -280,6 +292,8 @@ explain_ExecutorStart(QueryDesc *queryDesc, int eflags)
queryDesc->instrument_options |=
INSTRUMENT_ROWS;
if (auto_explain_log_buffers)
queryDesc->instrument_options |=
INSTRUMENT_BUFFERS;
+ if (auto_explain_log_wal)
+ queryDesc->instrument_options |= INSTRUMENT_WAL;
}
}
@@ -374,6 +388,7 @@ explain_ExecutorEnd(QueryDesc *queryDesc)
es->analyze = (queryDesc->instrument_options &&
auto_explain_log_analyze);
es->verbose = auto_explain_log_verbose;
es->buffers = (es->analyze && auto_explain_log_buffers);
+ es->wal = (es->analyze && auto_explain_log_wal);
es->timing = (es->analyze && auto_explain_log_timing);
es->summary = es->analyze;
es->format = auto_explain_log_format;
diff --git a/doc/src/sgml/auto-explain.sgml b/doc/src/sgml/auto-explain.sgml
index 3d619d4a3d..d4d29c4a64 100644
--- a/doc/src/sgml/auto-explain.sgml
+++ b/doc/src/sgml/auto-explain.sgml
@@ -109,6 +109,26 @@ LOAD 'auto_explain';
</listitem>
</varlistentry>
+ <varlistentry>
+ <term>
+ <varname>auto_explain.log_wal</varname> (<type>boolean</type>)
+ <indexterm>
+ <primary><varname>auto_explain.log_wal</varname> configuration
parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ <varname>auto_explain.log_wal</varname> controls whether WAL
+ usage statistics are printed when an execution plan is logged; it's
+ equivalent to the <literal>WAL</literal> option of
<command>EXPLAIN</command>.
+ This parameter has no effect
+ unless <varname>auto_explain.log_analyze</varname> is enabled.
+ This parameter is off by default.
+ Only superusers can change this setting.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry>
<term>
<varname>auto_explain.log_timing</varname> (<type>boolean</type>)
diff --git a/doc/src/sgml/ref/explain.sgml b/doc/src/sgml/ref/explain.sgml
index 385d10411f..e4661232b2 100644
--- a/doc/src/sgml/ref/explain.sgml
+++ b/doc/src/sgml/ref/explain.sgml
@@ -41,6 +41,7 @@ EXPLAIN [ ANALYZE ] [ VERBOSE ] <replaceable
class="parameter">statement</replac
COSTS [ <replaceable class="parameter">boolean</replaceable> ]
SETTINGS [ <replaceable class="parameter">boolean</replaceable> ]
BUFFERS [ <replaceable class="parameter">boolean</replaceable> ]
+ WAL [ <replaceable class="parameter">boolean</replaceable> ]
TIMING [ <replaceable class="parameter">boolean</replaceable> ]
SUMMARY [ <replaceable class="parameter">boolean</replaceable> ]
FORMAT { TEXT | XML | JSON | YAML }
@@ -192,6 +193,19 @@ ROLLBACK;
</listitem>
</varlistentry>
+ <varlistentry>
+ <term><literal>WAL</literal></term>
+ <listitem>
+ <para>
+ Include information on WAL record generation. Specifically, include the
+ number of records, full page records and bytes generated. In text
+ format, only non-zero values are printed. This parameter may only be
+ used when <literal>ANALYZE</literal> is also enabled. It defaults to
+ <literal>FALSE</literal>.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry>
<term><literal>TIMING</literal></term>
<listitem>
diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index ff2f45cfb2..f9d250a932 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -113,6 +113,7 @@ static void show_foreignscan_info(ForeignScanState
*fsstate, ExplainState *es);
static void show_eval_params(Bitmapset *bms_params, ExplainState *es);
static const char *explain_get_index_name(Oid indexId);
static void show_buffer_usage(ExplainState *es, const BufferUsage *usage);
+static void show_wal_usage(ExplainState *es, const WalUsage *usage);
static void ExplainIndexScanDetails(Oid indexid, ScanDirection indexorderdir,
ExplainState *es);
static void ExplainScanTarget(Scan *plan, ExplainState *es);
@@ -175,6 +176,8 @@ ExplainQuery(ParseState *pstate, ExplainStmt *stmt,
es->costs = defGetBoolean(opt);
else if (strcmp(opt->defname, "buffers") == 0)
es->buffers = defGetBoolean(opt);
+ else if (strcmp(opt->defname, "wal") == 0)
+ es->wal = defGetBoolean(opt);
else if (strcmp(opt->defname, "settings") == 0)
es->settings = defGetBoolean(opt);
else if (strcmp(opt->defname, "timing") == 0)
@@ -219,6 +222,11 @@ ExplainQuery(ParseState *pstate, ExplainStmt *stmt,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("EXPLAIN option BUFFERS requires
ANALYZE")));
+ if (es->wal && !es->analyze)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("EXPLAIN option WAL requires
ANALYZE")));
+
/* if the timing was not set explicitly, set default value */
es->timing = (timing_set) ? es->timing : es->analyze;
@@ -494,6 +502,8 @@ ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into,
ExplainState *es,
if (es->buffers)
instrument_option |= INSTRUMENT_BUFFERS;
+ if (es->wal)
+ instrument_option |= INSTRUMENT_WAL;
/*
* We always collect timing for the entire statement, even when
node-level
@@ -1942,12 +1952,14 @@ ExplainNode(PlanState *planstate, List *ancestors,
}
}
- /* Show buffer usage */
+ /* Show buffer/WAL usage */
if (es->buffers && planstate->instrument)
show_buffer_usage(es, &planstate->instrument->bufusage);
+ if (es->wal && planstate->instrument)
+ show_wal_usage(es, &planstate->instrument->walusage);
- /* Prepare per-worker buffer usage */
- if (es->workers_state && es->buffers && es->verbose)
+ /* Prepare per-worker buffer/WAL usage */
+ if (es->workers_state && (es->buffers || es->wal) && es->verbose)
{
WorkerInstrumentation *w = planstate->worker_instrument;
@@ -1960,7 +1972,10 @@ ExplainNode(PlanState *planstate, List *ancestors,
continue;
ExplainOpenWorker(n, es);
- show_buffer_usage(es, &instrument->bufusage);
+ if (es->buffers)
+ show_buffer_usage(es, &instrument->bufusage);
+ if (es->wal)
+ show_wal_usage(es, &instrument->walusage);
ExplainCloseWorker(n, es);
}
}
@@ -3059,6 +3074,44 @@ show_buffer_usage(ExplainState *es, const BufferUsage
*usage)
}
}
+/*
+ * Show WAL usage details.
+ */
+static void
+show_wal_usage(ExplainState *es, const WalUsage *usage)
+{
+ if (es->format == EXPLAIN_FORMAT_TEXT)
+ {
+ /* Show only positive counter values. */
+ if ((usage->wal_records > 0) || (usage->wal_fpw_records > 0) ||
+ (usage->wal_bytes > 0))
+ {
+ ExplainIndentText(es);
+ appendStringInfoString(es->str, "WAL:");
+
+ if (usage->wal_records > 0)
+ appendStringInfo(es->str, " records=%ld",
+
usage->wal_records);
+ if (usage->wal_fpw_records > 0)
+ appendStringInfo(es->str, " full page
records=%ld",
+
usage->wal_fpw_records);
+ if (usage->wal_bytes > 0)
+ appendStringInfo(es->str, " bytes="
UINT64_FORMAT,
+
usage->wal_bytes);
+ appendStringInfoChar(es->str, '\n');
+ }
+ }
+ else
+ {
+ ExplainPropertyInteger("WAL records", NULL,
+ usage->wal_records,
es);
+ ExplainPropertyInteger("WAL full page records", NULL,
+
usage->wal_fpw_records, es);
+ ExplainPropertyUInteger("WAL bytes", NULL,
+ usage->wal_bytes,
es);
+ }
+}
+
/*
* Add some additional details about an IndexScan or IndexOnlyScan
*/
@@ -3843,6 +3896,19 @@ ExplainPropertyInteger(const char *qlabel, const char
*unit, int64 value,
ExplainProperty(qlabel, unit, buf, true, es);
}
+/*
+ * Explain an unsigned integer-valued property.
+ */
+void
+ExplainPropertyUInteger(const char *qlabel, const char *unit, uint64 value,
+ ExplainState *es)
+{
+ char buf[32];
+
+ snprintf(buf, sizeof(buf), UINT64_FORMAT, value);
+ ExplainProperty(qlabel, unit, buf, true, es);
+}
+
/*
* Explain a float-valued property, using the specified number of
* fractional digits.
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index ca8f0d75a6..fa61284248 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -3045,8 +3045,8 @@ psql_completion(const char *text, int start, int end)
*/
if (ends_with(prev_wd, '(') || ends_with(prev_wd, ','))
COMPLETE_WITH("ANALYZE", "VERBOSE", "COSTS", "SETTINGS",
- "BUFFERS", "TIMING",
"SUMMARY", "FORMAT");
- else if
(TailMatches("ANALYZE|VERBOSE|COSTS|SETTINGS|BUFFERS|TIMING|SUMMARY"))
+ "BUFFERS", "WAL", "TIMING",
"SUMMARY", "FORMAT");
+ else if
(TailMatches("ANALYZE|VERBOSE|COSTS|SETTINGS|BUFFERS|WAL|TIMING|SUMMARY"))
COMPLETE_WITH("ON", "OFF");
else if (TailMatches("FORMAT"))
COMPLETE_WITH("TEXT", "XML", "JSON", "YAML");
diff --git a/src/include/commands/explain.h b/src/include/commands/explain.h
index 54f6240e5e..7b0b0a94a6 100644
--- a/src/include/commands/explain.h
+++ b/src/include/commands/explain.h
@@ -42,6 +42,7 @@ typedef struct ExplainState
bool analyze; /* print actual times */
bool costs; /* print estimated costs */
bool buffers; /* print buffer usage */
+ bool wal; /* print WAL usage */
bool timing; /* print detailed node timing */
bool summary; /* print total planning and
execution timing */
bool settings; /* print modified settings */
@@ -110,6 +111,8 @@ extern void ExplainPropertyText(const char *qlabel, const
char *value,
ExplainState
*es);
extern void ExplainPropertyInteger(const char *qlabel, const char *unit,
int64 value,
ExplainState *es);
+extern void ExplainPropertyUInteger(const char *qlabel, const char *unit,
+ uint64
value, ExplainState *es);
extern void ExplainPropertyFloat(const char *qlabel, const char *unit,
double value,
int ndigits, ExplainState *es);
extern void ExplainPropertyBool(const char *qlabel, bool value,
--
2.20.1
>From 8dd26b0e0bb485f2b8f1168716903e556aa23bf4 Mon Sep 17 00:00:00 2001
From: Kirill Bychik <[email protected]>
Date: Tue, 17 Mar 2020 14:42:02 +0100
Subject: [PATCH v7 3/4] Keep track of WAL usage in pg_stat_statements.
Author: Kirill Bychik
Reviewed-by: Julien Rouhaud, Fuji Masao
Discussion:
https://postgr.es/m/CAB-hujrP8ZfUkvL5OYETipQwA=e3n7oqHFU=4zlxws_cza3...@mail.gmail.com
---
contrib/pg_stat_statements/Makefile | 3 +-
.../expected/pg_stat_statements.out | 144 +++++++++++++++---
.../pg_stat_statements--1.7--1.8.sql | 50 ++++++
.../pg_stat_statements/pg_stat_statements.c | 58 ++++++-
.../pg_stat_statements.control | 2 +-
.../sql/pg_stat_statements.sql | 64 +++++++-
doc/src/sgml/pgstatstatements.sgml | 27 ++++
7 files changed, 322 insertions(+), 26 deletions(-)
create mode 100644 contrib/pg_stat_statements/pg_stat_statements--1.7--1.8.sql
diff --git a/contrib/pg_stat_statements/Makefile
b/contrib/pg_stat_statements/Makefile
index 80042a0905..081f997d70 100644
--- a/contrib/pg_stat_statements/Makefile
+++ b/contrib/pg_stat_statements/Makefile
@@ -6,7 +6,8 @@ OBJS = \
pg_stat_statements.o
EXTENSION = pg_stat_statements
-DATA = pg_stat_statements--1.4.sql pg_stat_statements--1.6--1.7.sql \
+DATA = pg_stat_statements--1.4.sql \
+ pg_stat_statements--1.7--1.8.sql pg_stat_statements--1.6--1.7.sql \
pg_stat_statements--1.5--1.6.sql pg_stat_statements--1.4--1.5.sql \
pg_stat_statements--1.3--1.4.sql pg_stat_statements--1.2--1.3.sql \
pg_stat_statements--1.1--1.2.sql pg_stat_statements--1.0--1.1.sql
diff --git a/contrib/pg_stat_statements/expected/pg_stat_statements.out
b/contrib/pg_stat_statements/expected/pg_stat_statements.out
index 6787ec1efd..7bfbeffa21 100644
--- a/contrib/pg_stat_statements/expected/pg_stat_statements.out
+++ b/contrib/pg_stat_statements/expected/pg_stat_statements.out
@@ -195,20 +195,126 @@ SELECT * FROM test WHERE a IN (1, 2, 3, 4, 5);
3 | c
(8 rows)
-SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C";
- query | calls | rows
--------------------------------------------------------------+-------+------
- DELETE FROM test WHERE a > $1 | 1 | 1
- INSERT INTO test (a, b) VALUES ($1, $2), ($3, $4), ($5, $6) | 1 | 3
- INSERT INTO test VALUES(generate_series($1, $2), $3) | 1 | 10
- SELECT * FROM test ORDER BY a | 1 | 12
- SELECT * FROM test WHERE a > $1 ORDER BY a | 2 | 4
- SELECT * FROM test WHERE a IN ($1, $2, $3, $4, $5) | 1 | 8
- SELECT pg_stat_statements_reset() | 1 | 1
- UPDATE test SET b = $1 WHERE a = $2 | 6 | 6
- UPDATE test SET b = $1 WHERE a > $2 | 1 | 3
+-- SELECT check WAL usage stats to confirm temp tables do not get stored in WAL
+SELECT query, calls, rows, wal_bytes, wal_records, wal_fpw_records
+FROM pg_stat_statements ORDER BY query COLLATE "C";
+ query | calls | rows |
wal_bytes | wal_records | wal_fpw_records
+-------------------------------------------------------------+-------+------+-----------+-------------+-----------------
+ DELETE FROM test WHERE a > $1 | 1 | 1 |
0 | 0 | 0
+ INSERT INTO test (a, b) VALUES ($1, $2), ($3, $4), ($5, $6) | 1 | 3 |
0 | 0 | 0
+ INSERT INTO test VALUES(generate_series($1, $2), $3) | 1 | 10 |
0 | 0 | 0
+ SELECT * FROM test ORDER BY a | 1 | 12 |
0 | 0 | 0
+ SELECT * FROM test WHERE a > $1 ORDER BY a | 2 | 4 |
0 | 0 | 0
+ SELECT * FROM test WHERE a IN ($1, $2, $3, $4, $5) | 1 | 8 |
0 | 0 | 0
+ SELECT pg_stat_statements_reset() | 1 | 1 |
0 | 0 | 0
+ UPDATE test SET b = $1 WHERE a = $2 | 6 | 6 |
0 | 0 | 0
+ UPDATE test SET b = $1 WHERE a > $2 | 1 | 3 |
0 | 0 | 0
(9 rows)
+--
+-- CRUD: INSERT SELECT UPDATE DELETE on test non-temp table to validate WAL
generation metrics
+--
+SELECT pg_stat_statements_reset();
+ pg_stat_statements_reset
+--------------------------
+
+(1 row)
+
+-- utility "create table" should not be shown
+CREATE TABLE pgss_test (a int, b char(20));
+INSERT INTO pgss_test VALUES(generate_series(1, 10), 'aaa');
+UPDATE pgss_test SET b = 'bbb' WHERE a > 7;
+DELETE FROM pgss_test WHERE a > 9;
+-- explicit transaction
+BEGIN;
+UPDATE pgss_test SET b = '111' WHERE a = 1 ;
+COMMIT;
+BEGIN \;
+UPDATE pgss_test SET b = '222' WHERE a = 2 \;
+COMMIT ;
+UPDATE pgss_test SET b = '333' WHERE a = 3 \;
+UPDATE pgss_test SET b = '444' WHERE a = 4 ;
+BEGIN \;
+UPDATE pgss_test SET b = '555' WHERE a = 5 \;
+UPDATE pgss_test SET b = '666' WHERE a = 6 \;
+COMMIT ;
+-- many INSERT values
+INSERT INTO pgss_test (a, b) VALUES (1, 'a'), (2, 'b'), (3, 'c');
+-- SELECT with constants
+SELECT * FROM pgss_test WHERE a > 5 ORDER BY a ;
+ a | b
+---+----------------------
+ 6 | 666
+ 7 | aaa
+ 8 | bbb
+ 9 | bbb
+(4 rows)
+
+SELECT *
+ FROM pgss_test
+ WHERE a > 9
+ ORDER BY a ;
+ a | b
+---+---
+(0 rows)
+
+-- SELECT without constants
+SELECT * FROM pgss_test ORDER BY a;
+ a | b
+---+----------------------
+ 1 | a
+ 1 | 111
+ 2 | b
+ 2 | 222
+ 3 | c
+ 3 | 333
+ 4 | 444
+ 5 | 555
+ 6 | 666
+ 7 | aaa
+ 8 | bbb
+ 9 | bbb
+(12 rows)
+
+-- SELECT with IN clause
+SELECT * FROM pgss_test WHERE a IN (1, 2, 3, 4, 5);
+ a | b
+---+----------------------
+ 1 | 111
+ 2 | 222
+ 3 | 333
+ 4 | 444
+ 5 | 555
+ 1 | a
+ 2 | b
+ 3 | c
+(8 rows)
+
+-- DROP test table
+SET pg_stat_statements.track_utility = TRUE;
+DROP TABLE pgss_test;
+SET pg_stat_statements.track_utility = FALSE;
+-- SELECT usage data, check WAL usage is reported, wal_records equal rows
count for INSERT/UPDATE/DELETE
+SELECT query, calls, rows,
+wal_bytes > 0 as wal_bytes_generated,
+(wal_records + wal_fpw_records) > 0 as wal_records_generated,
+(wal_records + wal_fpw_records) = rows as wal_records_as_rows
+FROM pg_stat_statements ORDER BY query COLLATE "C";
+ query | calls |
rows | wal_bytes_generated | wal_records_generated | wal_records_as_rows
+------------------------------------------------------------------+-------+------+---------------------+-----------------------+---------------------
+ DELETE FROM pgss_test WHERE a > $1 | 1 |
1 | t | t | t
+ DROP TABLE pgss_test | 1 |
0 | t | t | f
+ INSERT INTO pgss_test (a, b) VALUES ($1, $2), ($3, $4), ($5, $6) | 1 |
3 | t | t | t
+ INSERT INTO pgss_test VALUES(generate_series($1, $2), $3) | 1 |
10 | t | t | t
+ SELECT * FROM pgss_test ORDER BY a | 1 |
12 | f | f | f
+ SELECT * FROM pgss_test WHERE a > $1 ORDER BY a | 2 |
4 | f | f | f
+ SELECT * FROM pgss_test WHERE a IN ($1, $2, $3, $4, $5) | 1 |
8 | f | f | f
+ SELECT pg_stat_statements_reset() | 1 |
1 | f | f | f
+ SET pg_stat_statements.track_utility = FALSE | 1 |
0 | f | f | t
+ UPDATE pgss_test SET b = $1 WHERE a = $2 | 6 |
6 | t | t | t
+ UPDATE pgss_test SET b = $1 WHERE a > $2 | 1 |
3 | t | t | t
+(11 rows)
+
--
-- pg_stat_statements.track = none
--
@@ -287,13 +393,13 @@ SELECT PLUS_ONE(10);
11
(1 row)
-SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C";
- query | calls | rows
------------------------------------+-------+------
- SELECT $1::TEXT | 1 | 1
- SELECT PLUS_ONE($1) | 2 | 2
- SELECT PLUS_TWO($1) | 2 | 2
- SELECT pg_stat_statements_reset() | 1 | 1
+SELECT query, calls, rows, wal_bytes, wal_records FROM pg_stat_statements
ORDER BY query COLLATE "C";
+ query | calls | rows | wal_bytes | wal_records
+-----------------------------------+-------+------+-----------+-------------
+ SELECT $1::TEXT | 1 | 1 | 0 | 0
+ SELECT PLUS_ONE($1) | 2 | 2 | 0 | 0
+ SELECT PLUS_TWO($1) | 2 | 2 | 0 | 0
+ SELECT pg_stat_statements_reset() | 1 | 1 | 0 | 0
(4 rows)
--
diff --git a/contrib/pg_stat_statements/pg_stat_statements--1.7--1.8.sql
b/contrib/pg_stat_statements/pg_stat_statements--1.7--1.8.sql
new file mode 100644
index 0000000000..27ac80cde0
--- /dev/null
+++ b/contrib/pg_stat_statements/pg_stat_statements--1.7--1.8.sql
@@ -0,0 +1,50 @@
+/* contrib/pg_stat_statements/pg_stat_statements--1.7--1.8.sql */
+
+-- complain if script is sourced in psql, rather than via ALTER EXTENSION
+\echo Use "ALTER EXTENSION pg_stat_statements UPDATE TO '1.8'" to load this
file. \quit
+
+/* First we have to remove them from the extension */
+ALTER EXTENSION pg_stat_statements DROP VIEW pg_stat_statements;
+ALTER EXTENSION pg_stat_statements DROP FUNCTION pg_stat_statements(boolean);
+
+/* Then we can drop them */
+DROP VIEW pg_stat_statements;
+DROP FUNCTION pg_stat_statements(boolean);
+
+/* Now redefine */
+CREATE FUNCTION pg_stat_statements(IN showtext boolean,
+ OUT userid oid,
+ OUT dbid oid,
+ OUT queryid bigint,
+ OUT query text,
+ OUT calls int8,
+ OUT total_time float8,
+ OUT min_time float8,
+ OUT max_time float8,
+ OUT mean_time float8,
+ OUT stddev_time float8,
+ OUT rows int8,
+ OUT shared_blks_hit int8,
+ OUT shared_blks_read int8,
+ OUT shared_blks_dirtied int8,
+ OUT shared_blks_written int8,
+ OUT local_blks_hit int8,
+ OUT local_blks_read int8,
+ OUT local_blks_dirtied int8,
+ OUT local_blks_written int8,
+ OUT temp_blks_read int8,
+ OUT temp_blks_written int8,
+ OUT blk_read_time float8,
+ OUT blk_write_time float8,
+ OUT wal_bytes numeric,
+ OUT wal_records int8,
+ OUT wal_fpw_records int8
+)
+RETURNS SETOF record
+AS 'MODULE_PATHNAME', 'pg_stat_statements_1_4'
+LANGUAGE C STRICT VOLATILE PARALLEL SAFE;
+
+CREATE VIEW pg_stat_statements AS
+ SELECT * FROM pg_stat_statements(true);
+
+GRANT SELECT ON pg_stat_statements TO PUBLIC;
diff --git a/contrib/pg_stat_statements/pg_stat_statements.c
b/contrib/pg_stat_statements/pg_stat_statements.c
index 20dc8c605b..584e7a54ec 100644
--- a/contrib/pg_stat_statements/pg_stat_statements.c
+++ b/contrib/pg_stat_statements/pg_stat_statements.c
@@ -120,7 +120,8 @@ typedef enum pgssVersion
PGSS_V1_0 = 0,
PGSS_V1_1,
PGSS_V1_2,
- PGSS_V1_3
+ PGSS_V1_3,
+ PGSS_V1_4
} pgssVersion;
/*
@@ -161,6 +162,9 @@ typedef struct Counters
int64 local_blks_written; /* # of local disk blocks written */
int64 temp_blks_read; /* # of temp blocks read */
int64 temp_blks_written; /* # of temp blocks written */
+ uint64 wal_bytes; /* total amount of wal bytes
written */
+ int64 wal_records; /* # of wal records written */
+ int64 wal_fpw_records; /* # of full page wal records written
*/
double blk_read_time; /* time spent reading, in msec */
double blk_write_time; /* time spent writing, in msec */
double usage; /* usage factor */
@@ -293,6 +297,7 @@ PG_FUNCTION_INFO_V1(pg_stat_statements_reset);
PG_FUNCTION_INFO_V1(pg_stat_statements_reset_1_7);
PG_FUNCTION_INFO_V1(pg_stat_statements_1_2);
PG_FUNCTION_INFO_V1(pg_stat_statements_1_3);
+PG_FUNCTION_INFO_V1(pg_stat_statements_1_4);
PG_FUNCTION_INFO_V1(pg_stat_statements);
static void pgss_shmem_startup(void);
@@ -313,6 +318,7 @@ static void pgss_store(const char *query, uint64 queryId,
int query_location, int query_len,
double total_time, uint64 rows,
const BufferUsage *bufusage,
+ const WalUsage* walusage,
pgssJumbleState *jstate);
static void pg_stat_statements_internal(FunctionCallInfo fcinfo,
pgssVersion api_version,
@@ -841,6 +847,7 @@ pgss_post_parse_analyze(ParseState *pstate, Query *query)
0,
0,
NULL,
+ NULL,
&jstate);
}
@@ -944,6 +951,7 @@ pgss_ExecutorEnd(QueryDesc *queryDesc)
queryDesc->totaltime->total * 1000.0,
/* convert to msec */
queryDesc->estate->es_processed,
&queryDesc->totaltime->bufusage,
+ &queryDesc->totaltime->walusage,
NULL);
}
@@ -989,7 +997,11 @@ pgss_ProcessUtility(PlannedStmt *pstmt, const char
*queryString,
BufferUsage bufusage_start,
bufusage;
+ WalUsage walusage_start,
+ walusage;
+
bufusage_start = pgBufferUsage;
+ walusage_start = pgWalUsage;
INSTR_TIME_SET_CURRENT(start);
nested_level++;
@@ -1041,6 +1053,9 @@ pgss_ProcessUtility(PlannedStmt *pstmt, const char
*queryString,
bufusage.blk_write_time = pgBufferUsage.blk_write_time;
INSTR_TIME_SUBTRACT(bufusage.blk_write_time,
bufusage_start.blk_write_time);
+ memset(&walusage, 0, sizeof(WalUsage));
+ WalUsageAccumDiff(&walusage, &pgWalUsage, &walusage_start);
+
pgss_store(queryString,
0, /* signal that it's a
utility stmt */
pstmt->stmt_location,
@@ -1048,6 +1063,7 @@ pgss_ProcessUtility(PlannedStmt *pstmt, const char
*queryString,
INSTR_TIME_GET_MILLISEC(duration),
rows,
&bufusage,
+ &walusage,
NULL);
}
else
@@ -1083,13 +1099,14 @@ pgss_hash_string(const char *str, int len)
*
* If jstate is not NULL then we're trying to create an entry for which
* we have no statistics as yet; we just want to record the normalized
- * query string. total_time, rows, bufusage are ignored in this case.
+ * query string. Time and usage are ignored in this case.
*/
static void
pgss_store(const char *query, uint64 queryId,
int query_location, int query_len,
double total_time, uint64 rows,
const BufferUsage *bufusage,
+ const WalUsage* walusage,
pgssJumbleState *jstate)
{
pgssHashKey key;
@@ -1281,6 +1298,9 @@ pgss_store(const char *query, uint64 queryId,
e->counters.blk_read_time +=
INSTR_TIME_GET_MILLISEC(bufusage->blk_read_time);
e->counters.blk_write_time +=
INSTR_TIME_GET_MILLISEC(bufusage->blk_write_time);
e->counters.usage += USAGE_EXEC(total_time);
+ e->counters.wal_bytes += walusage->wal_bytes;
+ e->counters.wal_records += walusage->wal_records;
+ e->counters.wal_fpw_records += walusage->wal_fpw_records;
SpinLockRelease(&e->mutex);
}
@@ -1328,7 +1348,8 @@ pg_stat_statements_reset(PG_FUNCTION_ARGS)
#define PG_STAT_STATEMENTS_COLS_V1_1 18
#define PG_STAT_STATEMENTS_COLS_V1_2 19
#define PG_STAT_STATEMENTS_COLS_V1_3 23
-#define PG_STAT_STATEMENTS_COLS 23 /* maximum of
above */
+#define PG_STAT_STATEMENTS_COLS_V1_4 26
+#define PG_STAT_STATEMENTS_COLS 26 /* maximum of
above */
/*
* Retrieve statement statistics.
@@ -1340,6 +1361,15 @@ pg_stat_statements_reset(PG_FUNCTION_ARGS)
* expected API version is identified by embedding it in the C name of the
* function. Unfortunately we weren't bright enough to do that for 1.1.
*/
+Datum pg_stat_statements_1_4(PG_FUNCTION_ARGS)
+{
+ bool showtext = PG_GETARG_BOOL(0);
+
+ pg_stat_statements_internal(fcinfo, PGSS_V1_4, showtext);
+
+ return (Datum)0;
+}
+
Datum
pg_stat_statements_1_3(PG_FUNCTION_ARGS)
{
@@ -1445,6 +1475,10 @@ pg_stat_statements_internal(FunctionCallInfo fcinfo,
if (api_version != PGSS_V1_3)
elog(ERROR, "incorrect number of output
arguments");
break;
+ case PG_STAT_STATEMENTS_COLS_V1_4:
+ if (api_version != PGSS_V1_4)
+ elog(ERROR, "incorrect number of output
arguments");
+ break;
default:
elog(ERROR, "incorrect number of output arguments");
}
@@ -1641,11 +1675,29 @@ pg_stat_statements_internal(FunctionCallInfo fcinfo,
values[i++] = Float8GetDatumFast(tmp.blk_read_time);
values[i++] = Float8GetDatumFast(tmp.blk_write_time);
}
+ if (api_version >= PGSS_V1_4)
+ {
+ char buf[256];
+ Datum wal_bytes;
+
+ snprintf(buf, sizeof buf, UINT64_FORMAT, tmp.wal_bytes);
+
+ /* Convert to numeric. */
+ wal_bytes = DirectFunctionCall3(numeric_in,
+
CStringGetDatum(buf),
+
ObjectIdGetDatum(0),
+
Int32GetDatum(-1));
+
+ values[i++] = wal_bytes;
+ values[i++] = Int64GetDatumFast(tmp.wal_records);
+ values[i++] = UInt64GetDatum(tmp.wal_fpw_records);
+ }
Assert(i == (api_version == PGSS_V1_0 ?
PG_STAT_STATEMENTS_COLS_V1_0 :
api_version == PGSS_V1_1 ?
PG_STAT_STATEMENTS_COLS_V1_1 :
api_version == PGSS_V1_2 ?
PG_STAT_STATEMENTS_COLS_V1_2 :
api_version == PGSS_V1_3 ?
PG_STAT_STATEMENTS_COLS_V1_3 :
+ api_version == PGSS_V1_4 ?
PG_STAT_STATEMENTS_COLS_V1_4 :
-1 /* fail if you forget to update
this assert */ ));
tuplestore_putvalues(tupstore, tupdesc, values, nulls);
diff --git a/contrib/pg_stat_statements/pg_stat_statements.control
b/contrib/pg_stat_statements/pg_stat_statements.control
index 14cb422354..7fb20df886 100644
--- a/contrib/pg_stat_statements/pg_stat_statements.control
+++ b/contrib/pg_stat_statements/pg_stat_statements.control
@@ -1,5 +1,5 @@
# pg_stat_statements extension
comment = 'track execution statistics of all SQL statements executed'
-default_version = '1.7'
+default_version = '1.8'
module_pathname = '$libdir/pg_stat_statements'
relocatable = true
diff --git a/contrib/pg_stat_statements/sql/pg_stat_statements.sql
b/contrib/pg_stat_statements/sql/pg_stat_statements.sql
index 8b527070d4..7e4ac4f77d 100644
--- a/contrib/pg_stat_statements/sql/pg_stat_statements.sql
+++ b/contrib/pg_stat_statements/sql/pg_stat_statements.sql
@@ -99,7 +99,67 @@ SELECT * FROM test ORDER BY a;
-- SELECT with IN clause
SELECT * FROM test WHERE a IN (1, 2, 3, 4, 5);
-SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C";
+-- SELECT check WAL usage stats to confirm temp tables do not get stored in WAL
+SELECT query, calls, rows, wal_bytes, wal_records, wal_fpw_records
+FROM pg_stat_statements ORDER BY query COLLATE "C";
+
+--
+-- CRUD: INSERT SELECT UPDATE DELETE on test non-temp table to validate WAL
generation metrics
+--
+SELECT pg_stat_statements_reset();
+
+-- utility "create table" should not be shown
+CREATE TABLE pgss_test (a int, b char(20));
+
+INSERT INTO pgss_test VALUES(generate_series(1, 10), 'aaa');
+UPDATE pgss_test SET b = 'bbb' WHERE a > 7;
+DELETE FROM pgss_test WHERE a > 9;
+
+-- explicit transaction
+BEGIN;
+UPDATE pgss_test SET b = '111' WHERE a = 1 ;
+COMMIT;
+
+BEGIN \;
+UPDATE pgss_test SET b = '222' WHERE a = 2 \;
+COMMIT ;
+
+UPDATE pgss_test SET b = '333' WHERE a = 3 \;
+UPDATE pgss_test SET b = '444' WHERE a = 4 ;
+
+BEGIN \;
+UPDATE pgss_test SET b = '555' WHERE a = 5 \;
+UPDATE pgss_test SET b = '666' WHERE a = 6 \;
+COMMIT ;
+
+-- many INSERT values
+INSERT INTO pgss_test (a, b) VALUES (1, 'a'), (2, 'b'), (3, 'c');
+
+-- SELECT with constants
+SELECT * FROM pgss_test WHERE a > 5 ORDER BY a ;
+
+SELECT *
+ FROM pgss_test
+ WHERE a > 9
+ ORDER BY a ;
+
+-- SELECT without constants
+SELECT * FROM pgss_test ORDER BY a;
+
+-- SELECT with IN clause
+SELECT * FROM pgss_test WHERE a IN (1, 2, 3, 4, 5);
+
+-- DROP test table
+SET pg_stat_statements.track_utility = TRUE;
+DROP TABLE pgss_test;
+SET pg_stat_statements.track_utility = FALSE;
+
+-- SELECT usage data, check WAL usage is reported, wal_records equal rows
count for INSERT/UPDATE/DELETE
+SELECT query, calls, rows,
+wal_bytes > 0 as wal_bytes_generated,
+(wal_records + wal_fpw_records) > 0 as wal_records_generated,
+(wal_records + wal_fpw_records) = rows as wal_records_as_rows
+FROM pg_stat_statements ORDER BY query COLLATE "C";
--
-- pg_stat_statements.track = none
@@ -144,7 +204,7 @@ $$ SELECT (i + 1.0)::INTEGER LIMIT 1 $$ LANGUAGE SQL;
SELECT PLUS_ONE(8);
SELECT PLUS_ONE(10);
-SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C";
+SELECT query, calls, rows, wal_bytes, wal_records FROM pg_stat_statements
ORDER BY query COLLATE "C";
--
-- pg_stat_statements.track = all
diff --git a/doc/src/sgml/pgstatstatements.sgml
b/doc/src/sgml/pgstatstatements.sgml
index 26bb82da4a..80ad03b3da 100644
--- a/doc/src/sgml/pgstatstatements.sgml
+++ b/doc/src/sgml/pgstatstatements.sgml
@@ -221,6 +221,33 @@
</entry>
</row>
+ <row>
+ <entry><structfield>wal_bytes</structfield></entry>
+ <entry><type>numeric</type></entry>
+ <entry></entry>
+ <entry>
+ Total amount of WAL bytes generated by the statement
+ </entry>
+ </row>
+
+ <row>
+ <entry><structfield>wal_records</structfield></entry>
+ <entry><type>bigint</type></entry>
+ <entry></entry>
+ <entry>
+ Total count of WAL records generated by the statement
+ </entry>
+ </row>
+
+ <row>
+ <entry><structfield>wal_fp_records</structfield></entry>
+ <entry><type>bigint</type></entry>
+ <entry></entry>
+ <entry>
+ Total count of WAL full page images generated by the statement
+ </entry>
+ </row>
+
</tbody>
</tgroup>
</table>
--
2.20.1
>From 455787d3ebf76cbe756292a3eac045c9761c47fc Mon Sep 17 00:00:00 2001
From: Julien Rouhaud <[email protected]>
Date: Thu, 19 Mar 2020 16:08:47 +0100
Subject: [PATCH v7 4/4] Expose WAL usage counters in verbose (auto)vacuum
output.
Author: Julien Rouhaud
Reviewed-by: Fuji Masao
Discussion:
https://postgr.es/m/CAB-hujrP8ZfUkvL5OYETipQwA=e3n7oqHFU=4zlxws_cza3...@mail.gmail.com
---
src/backend/access/heap/vacuumlazy.c | 25 ++++++++++++++++++++++++-
1 file changed, 24 insertions(+), 1 deletion(-)
diff --git a/src/backend/access/heap/vacuumlazy.c
b/src/backend/access/heap/vacuumlazy.c
index 03c43efc32..ca4f03f551 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -65,6 +65,7 @@
#include "commands/dbcommands.h"
#include "commands/progress.h"
#include "commands/vacuum.h"
+#include "executor/instrument.h"
#include "miscadmin.h"
#include "optimizer/paths.h"
#include "pgstat.h"
@@ -381,6 +382,8 @@ heap_vacuum_rel(Relation onerel, VacuumParams *params,
int nindexes;
PGRUsage ru0;
TimestampTz starttime = 0;
+ WalUsage walusage_start = pgWalUsage;
+ WalUsage walusage = {0, 0, 0};
long secs;
int usecs;
double read_rate,
@@ -569,6 +572,9 @@ heap_vacuum_rel(Relation onerel, VacuumParams *params,
TimestampDifference(starttime, endtime, &secs, &usecs);
+ memset(&walusage, 0, sizeof(WalUsage));
+ WalUsageAccumDiff(&walusage, &pgWalUsage,
&walusage_start);
+
read_rate = 0;
write_rate = 0;
if ((secs > 0) || (usecs > 0))
@@ -620,7 +626,13 @@ heap_vacuum_rel(Relation onerel, VacuumParams *params,
(long long)
VacuumPageDirty);
appendStringInfo(&buf, _("avg read rate: %.3f MB/s, avg
write rate: %.3f MB/s\n"),
read_rate, write_rate);
- appendStringInfo(&buf, _("system usage: %s"),
pg_rusage_show(&ru0));
+ appendStringInfo(&buf, _("system usage: %s\n"),
pg_rusage_show(&ru0));
+ appendStringInfo(&buf,
+ _("WAL usage: %ld
records, %ld full page records, "
+ UINT64_FORMAT "
bytes"),
+ walusage.wal_records,
+
walusage.wal_fpw_records,
+ walusage.wal_bytes);
ereport(LOG,
(errmsg_internal("%s", buf.data)));
@@ -713,6 +725,8 @@ lazy_scan_heap(Relation onerel, VacuumParams *params,
LVRelStats *vacrelstats,
IndexBulkDeleteResult **indstats;
int i;
PGRUsage ru0;
+ WalUsage walusage_start = pgWalUsage;
+ WalUsage walusage = {0, 0, 0};
Buffer vmbuffer = InvalidBuffer;
BlockNumber next_unskippable_block;
bool skipping_blocks;
@@ -1690,6 +1704,15 @@ lazy_scan_heap(Relation onerel, VacuumParams *params,
LVRelStats *vacrelstats,
"%u
pages are entirely empty.\n",
empty_pages),
empty_pages);
+
+ memset(&walusage, 0, sizeof(WalUsage));
+ WalUsageAccumDiff(&walusage, &pgWalUsage, &walusage_start);
+ appendStringInfo(&buf, _("%ld WAL records, %ld WAL full page records, "
+ UINT64_FORMAT " WAL
bytes\n"),
+ walusage.wal_records,
+
walusage.wal_fpw_records,
+ walusage.wal_bytes);
+
appendStringInfo(&buf, _("%s."), pg_rusage_show(&ru0));
ereport(elevel,
--
2.20.1