Hi,

On Mon, Mar 03, 2025 at 10:51:19AM +0900, Michael Paquier wrote:
> On Fri, Feb 28, 2025 at 10:39:31AM +0000, Bertrand Drouvot wrote:
> > That sounds a good idea to measure the impact of those extra calls and see
> > if we'd need to mitigate the impacts. I'll collect some data.
So I did some tests using only one walsender (given the fact that the extra
lock you mentioned above is "only" for this particular backend).

=== Test with pg_receivewal

I was using one pg_receivewal process and did some tests that way:

pgbench -n -c8 -j8 -T60 -f <(echo "SELECT pg_logical_emit_message(true, 'test', repeat('0', 1));";)

I did not measure any noticeable extra lag (I did measure the time it took for
pg_size_pretty(sent_lsn - write_lsn) from pg_stat_replication to be back to
zero).

During the pgbench run a "perf record --call-graph fp -p <walsender_pid>"
would report (perf report -n):

1. pgstat_flush_backend() appears at about 3%
2. pg_memory_is_all_zeros() at about 2.8%
3. pgstat_flush_io() at about 0.4%

So it does not look like what we're adding here can be seen as a primary
bottleneck. That said, it looks like there is room for improvement in
pgstat_flush_backend(), and that relying on a "have_iostats"-like variable
would be better than those pg_memory_is_all_zeros() calls. That's done in
0002 attached; by doing so, pgstat_flush_backend() now appears at about 0.2%.

=== Test with pg_recvlogical

Now, it does not look like pg_receivewal had a lot of IO stats to report
(looking at pg_stat_get_backend_io() output for the walsender). Doing the
same test with "pg_recvlogical -d postgres -S logical_slot -f /dev/null
--start" reports much more IO stats.

What I observe without the "have_iostats" optimization is:

1. I did not measure any noticeable extra lag
2. pgstat_flush_io() at about 5.5% (pgstat_io_flush_cb() at about 5.3%)
3. pgstat_flush_backend() at about 4.8%

and with the "have_iostats" optimization I now see pgstat_flush_backend() at
about 2.51%.

So it does not look like what we're adding here can be seen as a primary
bottleneck, but it is probably worth implementing the "have_iostats"
optimization attached.

Also, while I did not measure any noticeable extra lag, given that
pgstat_flush_io() shows up at about 5.5% and pgstat_flush_backend() at about
2.5%, it could still make sense to reduce the frequency of the flush calls,
thoughts? (A rough sketch of one possible approach is appended below, before
the patches.)

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
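PS: to make the "reduce the frequency of the flush calls" idea concrete, here
is a rough, untested sketch. The helper name WalSndFlushStatsIfNeeded() and
the 1-second interval are made up for illustration and are not part of the
attached patches; only the flush calls and the pgstat_internal.h include come
from 0001.

#include "postgres.h"

#include "utils/pgstat_internal.h"      /* include added by 0001; flush routines */
#include "utils/timestamp.h"            /* GetCurrentTimestamp(), TimestampDifferenceExceeds() */

/* Hypothetical interval, for illustration only */
#define WALSND_STATS_FLUSH_INTERVAL_MS  1000

/*
 * Hypothetical helper that WalSndLoop() could call instead of flushing at
 * every loop iteration: it flushes at most once per interval.
 */
static void
WalSndFlushStatsIfNeeded(void)
{
        static TimestampTz last_flush = 0;
        TimestampTz now = GetCurrentTimestamp();

        if (!TimestampDifferenceExceeds(last_flush, now,
                                        WALSND_STATS_FLUSH_INTERVAL_MS))
                return;

        /* Same flush calls as in 0001, just rate-limited */
        pgstat_flush_io(false);
        (void) pgstat_flush_backend(false, PGSTAT_BACKEND_FLUSH_IO);

        last_flush = now;
}

The obvious trade-off is that the walsender's IO stats would then become
visible with up to that interval of delay instead of at every loop iteration.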
>From 226f767f90e82ee128319f43226d4d0056756fa9 Mon Sep 17 00:00:00 2001
From: Bertrand Drouvot <bertranddrouvot...@gmail.com>
Date: Tue, 25 Feb 2025 10:18:05 +0000
Subject: [PATCH v3 1/2] Flush the IO statistics of active walsenders

The walsender does not flush its IO statistics until it exits. The issue has
been there since pg_stat_io was introduced in a9c70b46dbe.

This commit:

1. ensures the walsender does not wait until it exits to flush its IO
   statistics
2. adds a test for a physical walsender (a logical walsender had the same
   issue, but the fix is in the same code path)
---
 src/backend/replication/walsender.c   |  7 +++++++
 src/test/recovery/t/001_stream_rep.pl | 14 ++++++++++++++
 2 files changed, 21 insertions(+)
  26.8% src/backend/replication/
  73.1% src/test/recovery/t/

diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index 446d10c1a7d..9ddf111af25 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -90,6 +90,7 @@
 #include "utils/guc.h"
 #include "utils/memutils.h"
 #include "utils/pg_lsn.h"
+#include "utils/pgstat_internal.h"
 #include "utils/ps_status.h"
 #include "utils/timeout.h"
 #include "utils/timestamp.h"
@@ -2829,6 +2830,12 @@ WalSndLoop(WalSndSendDataCallback send_data)
 		/* Send keepalive if the time has come */
 		WalSndKeepaliveIfNecessary();
 
+		/*
+		 * Report IO statistics
+		 */
+		pgstat_flush_io(false);
+		(void) pgstat_flush_backend(false, PGSTAT_BACKEND_FLUSH_IO);
+
 		/*
 		 * Block if we have unsent data. XXX For logical replication, let
 		 * WalSndWaitForWal() handle any other blocking; idle receivers need
diff --git a/src/test/recovery/t/001_stream_rep.pl b/src/test/recovery/t/001_stream_rep.pl
index ee57d234c86..aea32f68b79 100644
--- a/src/test/recovery/t/001_stream_rep.pl
+++ b/src/test/recovery/t/001_stream_rep.pl
@@ -42,6 +42,9 @@ $node_standby_2->init_from_backup($node_standby_1, $backup_name,
 	has_streaming => 1);
 $node_standby_2->start;
 
+# To check that an active walsender updates its IO statistics below.
+$node_primary->safe_psql('postgres', "SELECT pg_stat_reset_shared('io')");
+
 # Create some content on primary and check its presence in standby nodes
 $node_primary->safe_psql('postgres',
 	"CREATE TABLE tab_int AS SELECT generate_series(1,1002) AS a");
@@ -69,6 +72,17 @@ ALTER EVENT TRIGGER on_login_trigger ENABLE ALWAYS;
 $node_primary->wait_for_replay_catchup($node_standby_1);
 $node_standby_1->wait_for_replay_catchup($node_standby_2, $node_primary);
 
+# Ensure an active walsender updates its IO statistics.
+is( $node_primary->safe_psql(
+		'postgres',
+		qq(SELECT sum(reads) > 0
+       FROM pg_catalog.pg_stat_io
+       WHERE backend_type = 'walsender'
+       AND object = 'wal')
+	),
+	qq(t),
+	"Check that the walsender updates its IO statistics");
+
 my $result =
   $node_standby_1->safe_psql('postgres', "SELECT count(*) FROM tab_int");
 print "standby 1: $result\n";
-- 
2.34.1
>From 42cce18fd4565118c430bdd422f509361c8f86e5 Mon Sep 17 00:00:00 2001
From: Bertrand Drouvot <bertranddrouvot...@gmail.com>
Date: Mon, 3 Mar 2025 10:02:40 +0000
Subject: [PATCH v3 2/2] Add a new backend_has_iostats global variable

It behaves like the existing have_iostats variable and replaces the existing
pg_memory_is_all_zeros() calls in the functions that flush backend stats.
Indeed, some perf measurements report that those calls are responsible for a
large part of the time spent in those functions.
---
 src/backend/utils/activity/pgstat_backend.c | 14 ++++++++------
 1 file changed, 8 insertions(+), 6 deletions(-)
 100.0% src/backend/utils/activity/

diff --git a/src/backend/utils/activity/pgstat_backend.c b/src/backend/utils/activity/pgstat_backend.c
index a9343b7b59e..0ae3a1d4d23 100644
--- a/src/backend/utils/activity/pgstat_backend.c
+++ b/src/backend/utils/activity/pgstat_backend.c
@@ -37,6 +37,7 @@
  * memory allocation.
  */
 static PgStat_BackendPending PendingBackendStats;
+static bool backend_has_iostats = false;
 
 /*
  * Utility routines to report I/O stats for backends, kept here to avoid
@@ -68,6 +69,8 @@ pgstat_count_backend_io_op(IOObject io_object, IOContext io_context,
 
 	PendingBackendStats.pending_io.counts[io_object][io_context][io_op] += cnt;
 	PendingBackendStats.pending_io.bytes[io_object][io_context][io_op] += bytes;
+
+	backend_has_iostats = true;
 }
 
 /*
@@ -150,8 +153,7 @@ pgstat_flush_backend_entry_io(PgStat_EntryRef *entry_ref)
 	 * statistics. In this case, avoid unnecessarily modifying the stats
 	 * entry.
 	 */
-	if (pg_memory_is_all_zeros(&PendingBackendStats.pending_io,
-							   sizeof(struct PgStat_PendingIO)))
+	if (!backend_has_iostats)
 		return;
 
 	shbackendent = (PgStatShared_Backend *) entry_ref->shared_stats;
@@ -182,6 +184,8 @@
 	 * Clear out the statistics buffer, so it can be re-used.
 	 */
 	MemSet(&PendingBackendStats.pending_io, 0, sizeof(PgStat_PendingIO));
+
+	backend_has_iostats = false;
 }
 
 /*
@@ -198,8 +202,7 @@ pgstat_flush_backend(bool nowait, bits32 flags)
 	if (!pgstat_tracks_backend_bktype(MyBackendType))
 		return false;
 
-	if (pg_memory_is_all_zeros(&PendingBackendStats,
-							   sizeof(struct PgStat_BackendPending)))
+	if (!pgstat_backend_have_pending_cb())
 		return false;
 
 	entry_ref = pgstat_get_entry_ref_locked(PGSTAT_KIND_BACKEND, InvalidOid,
@@ -222,8 +225,7 @@
 bool
 pgstat_backend_have_pending_cb(void)
 {
-	return (!pg_memory_is_all_zeros(&PendingBackendStats,
-									sizeof(struct PgStat_BackendPending)));
+	return backend_has_iostats;
 }
 
 /*
-- 
2.34.1