Hi,
On Mon, Mar 03, 2025 at 10:51:19AM +0900, Michael Paquier wrote:
> On Fri, Feb 28, 2025 at 10:39:31AM +0000, Bertrand Drouvot wrote:
> > That sounds a good idea to measure the impact of those extra calls and see
> > if we'd need to mitigate the impacts. I'll collect some data.
So I did some tests using only one walsender (given that the extra lock you
mentioned above is "only" taken for this particular backend).
=== Test with pg_receivewal
I was using one pg_receivewal process and did some tests that way:
pgbench -n -c8 -j8 -T60 -f <(echo "SELECT pg_logical_emit_message(true, 'test',
repeat('0', 1));";)
I did not measure any noticeable extra lag (I measured the time it took for
pg_size_pretty(sent_lsn - write_lsn), as reported by pg_stat_replication, to get
back to zero).
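For reference, the check described above boils down to watching a query like
this until it reports "0 bytes" again:

```sql
-- Lag between WAL sent by the walsender and WAL written to disk by the
-- client; "0 bytes" once the receiver has caught up.
SELECT pg_size_pretty(sent_lsn - write_lsn) FROM pg_stat_replication;
```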
During the pgbench run a "perf record --call-graph fp -p <walsender_pid>" would
report (perf report -n):
1. pgstat_flush_backend() appears at about 3%
2. pg_memory_is_all_zeros() at about 2.8%
3. pgstat_flush_io() at about 0.4%
So it does not look like what we're adding here can be seen as a primary
bottleneck.
That said, it looks like there is room for improvement in pgstat_flush_backend():
relying on a "have_iostats"-like variable would be cheaper than those
pg_memory_is_all_zeros() calls.
That's done in 0002 attached; with it, pgstat_flush_backend() now appears at
about 0.2%.
=== Test with pg_recvlogical
Now, it does not look like pg_receivewal gives the walsender a lot of IO stats
to report (judging by the pg_stat_get_backend_io() output for the walsender).
Doing the same test with "pg_recvlogical -d postgres -S logical_slot -f
/dev/null --start" generates much more IO stats.
What I observe without the "have_iostats" optimization is:
1. I did not measure any noticeable extra lag
2. pgstat_flush_io() at about 5.5% (pgstat_io_flush_cb() at about 5.3%)
3. pgstat_flush_backend() at about 4.8%
and with the "have_iostats" optimization I now see pgstat_flush_backend() at
about 2.51%.
So it does not look like what we're adding here can be seen as a primary
bottleneck, but it is probably worth implementing the "have_iostats"
optimization attached.
Also, while I did not measure any noticeable extra lag, given that
pgstat_flush_io() shows up at about 5.5% and pgstat_flush_backend() at about
2.5%, it could still make sense to reduce the frequency of the flush calls.
Thoughts?
Regards,
--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
From 226f767f90e82ee128319f43226d4d0056756fa9 Mon Sep 17 00:00:00 2001
From: Bertrand Drouvot <[email protected]>
Date: Tue, 25 Feb 2025 10:18:05 +0000
Subject: [PATCH v3 1/2] Flush the IO statistics of active walsenders
The walsender does not flush its IO statistics until it exits.
The issue has been there since pg_stat_io was introduced in a9c70b46dbe.
This commit:
1. ensures the walsender does not wait until exit to flush its IO statistics
2. adds a test for a physical walsender (a logical walsender had the same issue,
but the fix is in the same code path)
---
src/backend/replication/walsender.c | 7 +++++++
src/test/recovery/t/001_stream_rep.pl | 14 ++++++++++++++
2 files changed, 21 insertions(+)
26.8% src/backend/replication/
73.1% src/test/recovery/t/
diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index 446d10c1a7d..9ddf111af25 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -90,6 +90,7 @@
#include "utils/guc.h"
#include "utils/memutils.h"
#include "utils/pg_lsn.h"
+#include "utils/pgstat_internal.h"
#include "utils/ps_status.h"
#include "utils/timeout.h"
#include "utils/timestamp.h"
@@ -2829,6 +2830,12 @@ WalSndLoop(WalSndSendDataCallback send_data)
/* Send keepalive if the time has come */
WalSndKeepaliveIfNecessary();
+ /*
+ * Report IO statistics
+ */
+ pgstat_flush_io(false);
+ (void) pgstat_flush_backend(false, PGSTAT_BACKEND_FLUSH_IO);
+
/*
* Block if we have unsent data. XXX For logical replication, let
* WalSndWaitForWal() handle any other blocking; idle receivers need
diff --git a/src/test/recovery/t/001_stream_rep.pl b/src/test/recovery/t/001_stream_rep.pl
index ee57d234c86..aea32f68b79 100644
--- a/src/test/recovery/t/001_stream_rep.pl
+++ b/src/test/recovery/t/001_stream_rep.pl
@@ -42,6 +42,9 @@ $node_standby_2->init_from_backup($node_standby_1, $backup_name,
has_streaming => 1);
$node_standby_2->start;
+# To check that an active walsender updates its IO statistics below.
+$node_primary->safe_psql('postgres', "SELECT pg_stat_reset_shared('io')");
+
# Create some content on primary and check its presence in standby nodes
$node_primary->safe_psql('postgres',
"CREATE TABLE tab_int AS SELECT generate_series(1,1002) AS a");
@@ -69,6 +72,17 @@ ALTER EVENT TRIGGER on_login_trigger ENABLE ALWAYS;
$node_primary->wait_for_replay_catchup($node_standby_1);
$node_standby_1->wait_for_replay_catchup($node_standby_2, $node_primary);
+# Ensure an active walsender updates its IO statistics.
+is( $node_primary->safe_psql(
+ 'postgres',
+ qq(SELECT sum(reads) > 0
+ FROM pg_catalog.pg_stat_io
+ WHERE backend_type = 'walsender'
+ AND object = 'wal')
+ ),
+ qq(t),
+ "Check that the walsender updates its IO statistics");
+
my $result =
$node_standby_1->safe_psql('postgres', "SELECT count(*) FROM tab_int");
print "standby 1: $result\n";
--
2.34.1
From 42cce18fd4565118c430bdd422f509361c8f86e5 Mon Sep 17 00:00:00 2001
From: Bertrand Drouvot <[email protected]>
Date: Mon, 3 Mar 2025 10:02:40 +0000
Subject: [PATCH v3 2/2] Add a new backend_has_iostats global variable
It behaves like the existing have_iostats variable and replaces the
pg_memory_is_all_zeros() calls in the backend stats flushing functions. Indeed,
some perf measurements report that those calls are responsible for a large part
of the time spent in those functions.
---
src/backend/utils/activity/pgstat_backend.c | 14 ++++++++------
1 file changed, 8 insertions(+), 6 deletions(-)
100.0% src/backend/utils/activity/
diff --git a/src/backend/utils/activity/pgstat_backend.c b/src/backend/utils/activity/pgstat_backend.c
index a9343b7b59e..0ae3a1d4d23 100644
--- a/src/backend/utils/activity/pgstat_backend.c
+++ b/src/backend/utils/activity/pgstat_backend.c
@@ -37,6 +37,7 @@
* memory allocation.
*/
static PgStat_BackendPending PendingBackendStats;
+static bool backend_has_iostats = false;
/*
* Utility routines to report I/O stats for backends, kept here to avoid
@@ -68,6 +69,8 @@ pgstat_count_backend_io_op(IOObject io_object, IOContext io_context,
PendingBackendStats.pending_io.counts[io_object][io_context][io_op] += cnt;
PendingBackendStats.pending_io.bytes[io_object][io_context][io_op] += bytes;
+
+ backend_has_iostats = true;
}
/*
@@ -150,8 +153,7 @@ pgstat_flush_backend_entry_io(PgStat_EntryRef *entry_ref)
* statistics. In this case, avoid unnecessarily modifying the stats
* entry.
*/
- if (pg_memory_is_all_zeros(&PendingBackendStats.pending_io,
- sizeof(struct PgStat_PendingIO)))
+ if (!backend_has_iostats)
return;
shbackendent = (PgStatShared_Backend *) entry_ref->shared_stats;
@@ -182,6 +184,8 @@ pgstat_flush_backend_entry_io(PgStat_EntryRef *entry_ref)
* Clear out the statistics buffer, so it can be re-used.
*/
MemSet(&PendingBackendStats.pending_io, 0, sizeof(PgStat_PendingIO));
+
+ backend_has_iostats = false;
}
/*
@@ -198,8 +202,7 @@ pgstat_flush_backend(bool nowait, bits32 flags)
if (!pgstat_tracks_backend_bktype(MyBackendType))
return false;
- if (pg_memory_is_all_zeros(&PendingBackendStats,
- sizeof(struct PgStat_BackendPending)))
+ if (!pgstat_backend_have_pending_cb())
return false;
entry_ref = pgstat_get_entry_ref_locked(PGSTAT_KIND_BACKEND, InvalidOid,
@@ -222,8 +225,7 @@ pgstat_flush_backend(bool nowait, bits32 flags)
bool
pgstat_backend_have_pending_cb(void)
{
- return (!pg_memory_is_all_zeros(&PendingBackendStats,
- sizeof(struct PgStat_BackendPending)));
+ return backend_has_iostats;
}
/*
--
2.34.1