On 2021/02/16 15:50, Michael Paquier wrote:
On Tue, Feb 16, 2021 at 12:43:42PM +0900, Fujii Masao wrote:
On 2021/02/16 6:28, Andres Freund wrote:
So what? It's just about free to initialize a spinlock, whether it's
using the fallback implementation or not. Initializing upon walsender
startup adds a lot of complications, because e.g. somebody could already
hold the spinlock because the previous walsender just disconnected, and
they were looking at the stats.
Okay.
Even if we initialize "writtenUpto" in WalRcvShmemInit(), WalReceiverMain()
still needs to initialize (reset to 0) by using pg_atomic_write_u64().
Yes, you have to do that.
Basically we should not acquire new spinlock while holding another spinlock,
to shorten the spinlock duration. Right? If yes, we need to move
pg_atomic_read_u64() of "writtenUpto" after the release of spinlock in
pg_stat_get_wal_receiver.
It would not matter much as a NULL tuple is returned as long as the
WAL receiver information is not ready to be displayed. The only
reason why all the fields are read before checking for
ready_to_display is that we can be sure that everything is consistent
with the PID. So reading writtenUpto before or after does not really
matter logically. I would just move it after the check, as you did
previously.
OK.
+ /*
+ * Read "writtenUpto" without holding a spinlock. So it may not be
+ * consistent with other WAL receiver's shared variables protected by a
+ * spinlock. This is OK because that variable is used only for
+ * informational purpose and should not be used for data integrity checks.
+ */
What about the following?
"Read "writtenUpto" without holding a spinlock. Note that it may not
be consistent with the other shared variables of the WAL receiver
protected by a spinlock, but this should not be used for data
integrity checks."
Sounds good. Attached is the updated version of the patch.
I agree that what has been done with MyProc->waitStart in 46d6e5f is
not safe, and that initialization should happen once at postmaster
startup, with a write(0) when starting the backend. There are two of
them in proc.c, one in twophase.c. Do you mind if I add an open item
for this one?
Yeah, please feel free to do that! BTW, I already posted the patch
addressing that issue, at [1].
[1] https://postgr.es/m/1df88660-6f08-cc6e-b7e2-f85296a2b...@oss.nttdata.com
Regards,
--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION
diff --git a/src/backend/replication/walreceiver.c
b/src/backend/replication/walreceiver.c
index 723f513d8b..9ec71238c4 100644
--- a/src/backend/replication/walreceiver.c
+++ b/src/backend/replication/walreceiver.c
@@ -249,7 +249,7 @@ WalReceiverMain(void)
SpinLockRelease(&walrcv->mutex);
- pg_atomic_init_u64(&WalRcv->writtenUpto, 0);
+ pg_atomic_write_u64(&WalRcv->writtenUpto, 0);
/* Arrange to clean up at walreceiver exit */
on_shmem_exit(WalRcvDie, 0);
@@ -1325,7 +1325,6 @@ pg_stat_get_wal_receiver(PG_FUNCTION_ARGS)
state = WalRcv->walRcvState;
receive_start_lsn = WalRcv->receiveStart;
receive_start_tli = WalRcv->receiveStartTLI;
- written_lsn = pg_atomic_read_u64(&WalRcv->writtenUpto);
flushed_lsn = WalRcv->flushedUpto;
received_tli = WalRcv->receivedTLI;
last_send_time = WalRcv->lastMsgSendTime;
@@ -1345,6 +1344,14 @@ pg_stat_get_wal_receiver(PG_FUNCTION_ARGS)
if (pid == 0 || !ready_to_display)
PG_RETURN_NULL();
+ /*
+ * Read "writtenUpto" without holding a spinlock. Note that it may not
be
+ * consistent with the other shared variables of the WAL receiver
+ * protected by a spinlock, but this should not be used for data
integrity
+ * checks.
+ */
+ written_lsn = pg_atomic_read_u64(&WalRcv->writtenUpto);
+
/* determine result type */
if (get_call_result_type(fcinfo, NULL, &tupdesc) != TYPEFUNC_COMPOSITE)
elog(ERROR, "return type must be a row type");
diff --git a/src/backend/replication/walreceiverfuncs.c
b/src/backend/replication/walreceiverfuncs.c
index 69b91a7dab..63e60478ea 100644
--- a/src/backend/replication/walreceiverfuncs.c
+++ b/src/backend/replication/walreceiverfuncs.c
@@ -63,6 +63,7 @@ WalRcvShmemInit(void)
MemSet(WalRcv, 0, WalRcvShmemSize());
WalRcv->walRcvState = WALRCV_STOPPED;
SpinLockInit(&WalRcv->mutex);
+ pg_atomic_init_u64(&WalRcv->writtenUpto, 0);
WalRcv->latch = NULL;
}
}
diff --git a/src/test/regress/expected/sysviews.out
b/src/test/regress/expected/sysviews.out
index 81bdacf59d..6d048e309c 100644
--- a/src/test/regress/expected/sysviews.out
+++ b/src/test/regress/expected/sysviews.out
@@ -83,6 +83,13 @@ select count(*) = 1 as ok from pg_stat_wal;
t
(1 row)
+-- We expect no walreceiver running in this test
+select count(*) = 0 as ok from pg_stat_wal_receiver;
+ ok
+----
+ t
+(1 row)
+
-- This is to record the prevailing planner enable_foo settings during
-- a regression test run.
select name, setting from pg_settings where name like 'enable%';
diff --git a/src/test/regress/sql/sysviews.sql
b/src/test/regress/sql/sysviews.sql
index b9b875bc6a..dc8c9a3ac2 100644
--- a/src/test/regress/sql/sysviews.sql
+++ b/src/test/regress/sql/sysviews.sql
@@ -40,6 +40,9 @@ select count(*) >= 0 as ok from pg_prepared_xacts;
-- There must be only one record
select count(*) = 1 as ok from pg_stat_wal;
+-- We expect no walreceiver running in this test
+select count(*) = 0 as ok from pg_stat_wal_receiver;
+
-- This is to record the prevailing planner enable_foo settings during
-- a regression test run.
select name, setting from pg_settings where name like 'enable%';