On 12/21/21 03:49, Tomas Vondra wrote:
On 12/21/21 02:01, Tom Lane wrote:
Tomas Vondra <tomas.von...@enterprisedb.com> writes:
OK, I did a quick test with two very simple benchmarks - simple select
from a sequence, and 'pgbench -N' on scale 1. Benchmark was on current
master, patched means SEQ_LOG_VALS was set to 1.
But ... pgbench -N doesn't use sequences at all, does it?
Probably inserts into a table with a serial column would constitute a
plausible real-world case.
D'oh! For some reason I thought pgbench has a sequence on the history
table, but clearly I was mistaken. There's another thinko, because after
inspecting pg_waldump output I realized "SEQ_LOG_VALS 1" actually logs
only every 2nd increment. So it should be "SEQ_LOG_VALS 0".
So I repeated the test fixing SEQ_LOG_VALS, and doing the pgbench with a
table like this:
create table test (a serial, b int);
and a script doing
insert into test (b) values (1);
The results look like this:
1) select nextval('s');
clients 1 4
------------------------------
master 39533 124998
patched 3748 9114
------------------------------
diff -91% -93%
2) insert into test (b) values (1);
clients 1 4
------------------------------
master 3718 9188
patched 3698 9209
------------------------------
diff 0% 0%
So the nextval() results are a bit worse, due to not caching 1/2 the
nextval calls. The -90% is roughly expected, due to generating about 32x
more WAL (and having to wait for commit).
But results for the more realistic insert workload are about the same as
before (i.e. no measurable difference). Also kinda expected, because
those transactions have to wait for WAL anyway.
Attached is a patch tweaking WAL logging - in wal_level=minimal we do
the same thing as now, in higher levels we log every sequence fetch.
After thinking about this a bit more, I think even the nextval workload
is not such a big issue, because we can set cache for the sequences.
Until now this had fairly limited impact, but it can significantly
reduce the performance drop caused by WAL-logging every sequence fetch.
I've repeated the nextval test on a different machine (the one I used
before is busy with something else), and the results look like this:
1) 1 client
cache 1 32 128
--------------------------------------
master 13975 14425 19886
patched 886 7900 18397
--------------------------------------
diff -94% -45% -7%
4) 4 clients
cache 1 32 128
-----------------------------------------
master 8338 12849 18248
patched 331 8124 18983
-----------------------------------------
diff -96% -37% 4%
So I think this makes it acceptable / manageable. Of course, this means
the values are much less monotonous (across backends), but I don't think
we really promised that. And I doubt anyone is really using sequences
like this (just nextval) in performance critical use cases.
regards
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
From eeaa7cb36c69af048f0321e4883864ebe2542429 Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas.von...@postgresql.org>
Date: Wed, 22 Dec 2021 03:18:46 +0100
Subject: [PATCH 1/6] WAL-log individual sequence fetches
---
src/backend/commands/sequence.c | 18 ++++++++++++++++--
1 file changed, 16 insertions(+), 2 deletions(-)
diff --git a/src/backend/commands/sequence.c b/src/backend/commands/sequence.c
index 72bfdc07a4..0f309d0a4e 100644
--- a/src/backend/commands/sequence.c
+++ b/src/backend/commands/sequence.c
@@ -52,6 +52,9 @@
* We don't want to log each fetching of a value from a sequence,
* so we pre-log a few fetches in advance. In the event of
* crash we can lose (skip over) as many values as we pre-logged.
+ *
+ * We only pre-log fetches in wal_level=minimal. For higher levels we
+ * WAL-log every individual sequence increment, as if this was 0.
*/
#define SEQ_LOG_VALS 32
@@ -666,11 +669,18 @@ nextval_internal(Oid relid, bool check_permissions)
* WAL record to be written anyway, else replay starting from the
* checkpoint would fail to advance the sequence past the logged values.
* In this case we may as well fetch extra values.
+ *
+ * We only pre-log fetches in wal_level=minimal. For higher levels we
+ * WAL-log every individual sequence increment.
*/
if (log < fetch || !seq->is_called)
{
/* forced log to satisfy local demand for values */
- fetch = log = fetch + SEQ_LOG_VALS;
+ if (XLogIsNeeded())
+ fetch = log = fetch;
+ else
+ fetch = log = fetch + SEQ_LOG_VALS;
+
logit = true;
}
else
@@ -680,7 +690,11 @@ nextval_internal(Oid relid, bool check_permissions)
if (PageGetLSN(page) <= redoptr)
{
/* last update of seq was before checkpoint */
- fetch = log = fetch + SEQ_LOG_VALS;
+ if (XLogIsNeeded())
+ fetch = log = fetch;
+ else
+ fetch = log = fetch + SEQ_LOG_VALS;
+
logit = true;
}
}
--
2.31.1