On 2021/02/11 21:55, Michael Paquier wrote:
Hi Fujii-san,
On Tue, Feb 09, 2021 at 11:17:04PM +0900, Fujii Masao wrote:
ISTM that the commit 2c8dd05d6c caused this issue. The commit changed
pg_stat_get_wal_receiver() so that it reads "writtenUpto" by using
pg_atomic_read_u64(). But since "writtenUpto" is initialized only when
walreceiver starts up, reading "writtenUpto" before the startup of
walreceiver can cause the error.
Indeed, that's a problem. We should at least move that out of the
spin lock area.
Yes, so what about the attached patch?
We didn't notice this issue long time because no regression test checks
pg_stat_wal_receiver. So I included such test in the patch.
Regards,
--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION
diff --git a/src/backend/replication/walreceiver.c
b/src/backend/replication/walreceiver.c
index 723f513d8b..9b32941a76 100644
--- a/src/backend/replication/walreceiver.c
+++ b/src/backend/replication/walreceiver.c
@@ -1325,7 +1325,6 @@ pg_stat_get_wal_receiver(PG_FUNCTION_ARGS)
state = WalRcv->walRcvState;
receive_start_lsn = WalRcv->receiveStart;
receive_start_tli = WalRcv->receiveStartTLI;
- written_lsn = pg_atomic_read_u64(&WalRcv->writtenUpto);
flushed_lsn = WalRcv->flushedUpto;
received_tli = WalRcv->receivedTLI;
last_send_time = WalRcv->lastMsgSendTime;
@@ -1345,6 +1344,19 @@ pg_stat_get_wal_receiver(PG_FUNCTION_ARGS)
if (pid == 0 || !ready_to_display)
PG_RETURN_NULL();
+ /*
+ * We reach here if WAL receiver is ready to display (i.e.,
+ * ready_to_display == true). In this case we can ensure that the atomic
+ * variable "writtenUpto" has already been initialized by WAL receiver,
so
+ * we can safely read it here.
+ *
+ * Read "writtenUpto" without holding a spinlock. So it may not be
+ * consistent with other WAL receiver's shared variables protected by a
+ * spinlock. This is OK because that variable is used only for
+ * informational purpose and should not be used for data integrity
checks.
+ */
+ written_lsn = pg_atomic_read_u64(&WalRcv->writtenUpto);
+
/* determine result type */
if (get_call_result_type(fcinfo, NULL, &tupdesc) != TYPEFUNC_COMPOSITE)
elog(ERROR, "return type must be a row type");
diff --git a/src/test/regress/expected/sysviews.out
b/src/test/regress/expected/sysviews.out
index 81bdacf59d..6d048e309c 100644
--- a/src/test/regress/expected/sysviews.out
+++ b/src/test/regress/expected/sysviews.out
@@ -83,6 +83,13 @@ select count(*) = 1 as ok from pg_stat_wal;
t
(1 row)
+-- We expect no walreceiver running in this test
+select count(*) = 0 as ok from pg_stat_wal_receiver;
+ ok
+----
+ t
+(1 row)
+
-- This is to record the prevailing planner enable_foo settings during
-- a regression test run.
select name, setting from pg_settings where name like 'enable%';
diff --git a/src/test/regress/sql/sysviews.sql
b/src/test/regress/sql/sysviews.sql
index b9b875bc6a..dc8c9a3ac2 100644
--- a/src/test/regress/sql/sysviews.sql
+++ b/src/test/regress/sql/sysviews.sql
@@ -40,6 +40,9 @@ select count(*) >= 0 as ok from pg_prepared_xacts;
-- There must be only one record
select count(*) = 1 as ok from pg_stat_wal;
+-- We expect no walreceiver running in this test
+select count(*) = 0 as ok from pg_stat_wal_receiver;
+
-- This is to record the prevailing planner enable_foo settings during
-- a regression test run.
select name, setting from pg_settings where name like 'enable%';