Hi,
On Mon, Mar 03, 2025 at 10:51:19AM +0900, Michael Paquier wrote:
> On Fri, Feb 28, 2025 at 10:39:31AM +0000, Bertrand Drouvot wrote:
> > That sounds a good idea to measure the impact of those extra calls and see
> > if we'd need to mitigate the impacts. I'll collect some data.
So I did some tests using only one walsender (given that the extra lock you
mentioned above is "only" taken for this particular backend).
=== Test with pg_receivewal
I was using one pg_receivewal process and did some tests that way:
pgbench -n -c8 -j8 -T60 -f <(echo "SELECT pg_logical_emit_message(true, 'test',
repeat('0', 1));";)
I did not measure any noticeable extra lag (I measured the time it took for
pg_size_pretty(sent_lsn - write_lsn), as reported by pg_stat_replication, to get
back to zero).
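For reference, the check described above boils down to watching a query like
this until it reports "0 bytes" again:

```sql
-- Lag between WAL sent by the walsender and WAL written to disk by the
-- client; "0 bytes" once the receiver has caught up.
SELECT pg_size_pretty(sent_lsn - write_lsn) FROM pg_stat_replication;
```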
During the pgbench run a "perf record --call-graph fp -p <walsender_pid>" would
report (perf report -n):
1. pgstat_flush_backend() appears at about 3%
2. pg_memory_is_all_zeros() at about 2.8%
3. pgstat_flush_io() at about 0.4%
So it does not look like what we're adding here can be seen as a primary
bottleneck.
That said, it looks like there is room for improvement in pgstat_flush_backend():
relying on a "have_iostats"-like variable would be cheaper than those
pg_memory_is_all_zeros() calls.
That's done in 0002 attached; with it, pgstat_flush_backend() now appears at
about 0.2%.
=== Test with pg_recvlogical
Now, it does not look like pg_receivewal gives the walsender a lot of IO stats
to report (judging by the pg_stat_get_backend_io() output for the walsender).
Doing the same test with "pg_recvlogical -d postgres -S logical_slot -f
/dev/null --start" generates much more IO stats.
What I observe without the "have_iostats" optimization is:
1. I did not measure any noticeable extra lag
2. pgstat_flush_io() at about 5.5% (pgstat_io_flush_cb() at about 5.3%)
3. pgstat_flush_backend() at about 4.8%
and with the "have_iostats" optimization I now see pgstat_flush_backend() at
about 2.51%.
So it does not look like what we're adding here can be seen as a primary
bottleneck, but it is probably worth implementing the "have_iostats"
optimization attached.
Also, while I did not measure any noticeable extra lag, given that
pgstat_flush_io() shows up at about 5.5% and pgstat_flush_backend() at about
2.5%, it could still make sense to reduce the frequency of the flush calls.
Thoughts?
Regards,
--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
From 226f767f90e82ee128319f43226d4d0056756fa9 Mon Sep 17 00:00:00 2001
From: Bertrand Drouvot <[email protected]>
Date: Tue, 25 Feb 2025 10:18:05 +0000
Subject: [PATCH v3 1/2] Flush the IO statistics of active walsenders
The walsender does not flush its IO statistics until it exits.
The issue has been there since pg_stat_io was introduced in a9c70b46dbe.
This commit:
1. ensures the walsender does not wait until exit to flush its IO statistics
2. adds a test for a physical walsender (a logical walsender had the same issue,
but the fix is in the same code path)
---
src/backend/replication/walsender.c | 7 +++++++
src/test/recovery/t/001_stream_rep.pl | 14 ++++++++++++++
2 files changed, 21 insertions(+)
26.8% src/backend/replication/
73.1% src/test/recovery/t/
diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index 446d10c1a7d..9ddf111af25 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -90,6 +90,7 @@
#include "utils/guc.h"
#include "utils/memutils.h"
#include "utils/pg_lsn.h"
+#include "utils/pgstat_internal.h"
#include "utils/ps_status.h"
#include "utils/timeout.h"
#include "utils/timestamp.h"
@@ -2829,6 +2830,12 @@ WalSndLoop(WalSndSendDataCallback send_data)
/* Send keepalive if the time has come */
WalSndKeepaliveIfNecessary();
+ /*
+ * Report IO statistics
+ */
+ pgstat_flush_io(false);
+ (void) pgstat_flush_backend(false, PGSTAT_BACKEND_FLUSH_IO);
+
/*
* Block if we have unsent data. XXX For logical replication, let
* WalSndWaitForWal() handle any other blocking; idle receivers need
diff --git a/src/test/recovery/t/001_stream_rep.pl b/src/test/recovery/t/001_stream_rep.pl
index ee57d234c86..aea32f68b79 100644
--- a/src/test/recovery/t/001_stream_rep.pl
+++ b/src/test/recovery/t/001_stream_rep.pl
@@ -42,6 +42,9 @@ $node_standby_2->init_from_backup($node_standby_1, $backup_name,
has_streaming => 1);
$node_standby_2->start;
+# To check that an active walsender updates its IO statistics below.
+$node_primary->safe_psql('postgres', "SELECT pg_stat_reset_shared('io')");
+
# Create some content on primary and check its presence in standby nodes
$node_primary->safe_psql('postgres',
"CREATE TABLE tab_int AS SELECT generate_series(1,1002) AS a");
@@ -69,6 +72,17 @@ ALTER EVENT TRIGGER on_login_trigger ENABLE ALWAYS;
$node_primary->wait_for_replay_catchup($node_standby_1);
$node_standby_1->wait_for_replay_catchup($node_standby_2, $node_primary);
+# Ensure an active walsender updates its IO statistics.
+is( $node_primary->safe_psql(
+ 'postgres',
+ qq(SELECT sum(reads) > 0
+ FROM pg_catalog.pg_stat_io
+ WHERE backend_type = 'walsender'
+ AND object = 'wal')
+ ),
+ qq(t),
+ "Check that the walsender updates its IO statistics");
+
my $result =
$node_standby_1->safe_psql('postgres', "SELECT count(*) FROM tab_int");
print "standby 1: $result\n";
--
2.34.1
From 42cce18fd4565118c430bdd422f509361c8f86e5 Mon Sep 17 00:00:00 2001
From: Bertrand Drouvot <[email protected]>
Date: Mon, 3 Mar 2025 10:02:40 +0000
Subject: [PATCH v3 2/2] Add a new backend_has_iostats global variable
It behaves like the existing have_iostats variable and replaces the
pg_memory_is_all_zeros() calls in the backend stats flushing functions. Indeed,
some perf measurements report that those calls are responsible for a large part
of the time spent in those functions.
---
src/backend/utils/activity/pgstat_backend.c | 14 ++++++++------
1 file changed, 8 insertions(+), 6 deletions(-)
100.0% src/backend/utils/activity/
diff --git a/src/backend/utils/activity/pgstat_backend.c b/src/backend/utils/activity/pgstat_backend.c
index a9343b7b59e..0ae3a1d4d23 100644
--- a/src/backend/utils/activity/pgstat_backend.c
+++ b/src/backend/utils/activity/pgstat_backend.c
@@ -37,6 +37,7 @@
* memory allocation.
*/
static PgStat_BackendPending PendingBackendStats;
+static bool backend_has_iostats = false;
/*
* Utility routines to report I/O stats for backends, kept here to avoid
@@ -68,6 +69,8 @@ pgstat_count_backend_io_op(IOObject io_object, IOContext io_context,
PendingBackendStats.pending_io.counts[io_object][io_context][io_op] += cnt;
PendingBackendStats.pending_io.bytes[io_object][io_context][io_op] += bytes;
+
+ backend_has_iostats = true;
}
/*
@@ -150,8 +153,7 @@ pgstat_flush_backend_entry_io(PgStat_EntryRef *entry_ref)
* statistics. In this case, avoid unnecessarily modifying the stats
* entry.
*/
- if (pg_memory_is_all_zeros(&PendingBackendStats.pending_io,
- sizeof(struct PgStat_PendingIO)))
+ if (!backend_has_iostats)
return;
shbackendent = (PgStatShared_Backend *) entry_ref->shared_stats;
@@ -182,6 +184,8 @@ pgstat_flush_backend_entry_io(PgStat_EntryRef *entry_ref)
* Clear out the statistics buffer, so it can be re-used.
*/
MemSet(&PendingBackendStats.pending_io, 0, sizeof(PgStat_PendingIO));
+
+ backend_has_iostats = false;
}
/*
@@ -198,8 +202,7 @@ pgstat_flush_backend(bool nowait, bits32 flags)
if (!pgstat_tracks_backend_bktype(MyBackendType))
return false;
- if (pg_memory_is_all_zeros(&PendingBackendStats,
- sizeof(struct PgStat_BackendPending)))
+ if (!pgstat_backend_have_pending_cb())
return false;
entry_ref = pgstat_get_entry_ref_locked(PGSTAT_KIND_BACKEND, InvalidOid,
@@ -222,8 +225,7 @@ pgstat_flush_backend(bool nowait, bits32 flags)
bool
pgstat_backend_have_pending_cb(void)
{
- return (!pg_memory_is_all_zeros(&PendingBackendStats,
- sizeof(struct PgStat_BackendPending)));
+ return backend_has_iostats;
}
/*
--
2.34.1