On 2021/03/22 16:50, Fujii Masao wrote:
>
>
> On 2021/03/22 9:50, ikedamsh wrote:
>> Agreed. I separated the patches.
>>
>> If only the former is committed, my trivial concern is that there may be
>> a disadvantage, but no advantage for the standby server. It may lead to
>> performance degradation to the wal receiver by calling
>> INSTR_TIME_SET_CURRENT(), but the stats can't visible for users until the
>> latter patch is committed.
>
> Your concern is valid, so let's polish and commit also the 0003 patch to v14.
> I'm still thinking that it's better to separate wal_xxx columns into
> walreceiver's and the others. But if we count even walreceiver activity on
> the existing columns, regarding 0003 patch, we need to update the document?
> For example, "Number of times WAL buffers were written out to disk via
> XLogWrite request." should be "Number of times WAL buffers were written
> out to disk via XLogWrite request and by WAL receiver process."? Maybe
> we need to append some descriptions about this into "WAL configuration"
> section?
Agreed. Users can know whether the stats is for walreceiver or not. The
pg_stat_wal view in standby server shows for the walreceiver, and in primary
server it shows for the others. So, I updated the document.
(v20-0003-Makes-the-wal-receiver-report-WAL-statistics.patch)
>> I followed the argument of pg_pwrite().
>> But, I think "char *" is better, so fixed it.
>
> Thanks for updating the patch!
>
> +extern int XLogWriteFile(int fd, char *buf,
> + size_t nbyte, off_t offset,
> + TimeLineID timelineid, XLogSegNo segno,
> + bool write_all);
>
> write_all seems not to be necessary. You added this flag for walreceiver,
> I guess. But even without the argument, walreceiver seems to work expectedly.
> So, what about the attached patch? I applied some cosmetic changes to the
> patch.
Thanks a lot. Yes, "write_all" is unnecessary.
Your patch is looks good to me.
Regards,
--
Masahiro Ikeda
NTT DATA CORPORATION
diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index db4b4e460c..281b13b9fa 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -3493,7 +3493,8 @@ SELECT pid, wait_event_type, wait_event FROM pg_stat_activity WHERE wait_event i
</para>
<para>
Number of times WAL buffers were written out to disk via
- <function>XLogWrite</function> request.
+ <function>XLogWrite</function> request and WAL data were written
+ out to disk by the WAL receiver process.
See <xref linkend="wal-configuration"/> for more information about
the internal WAL function <function>XLogWrite</function>.
</para></entry>
@@ -3521,7 +3522,8 @@ SELECT pid, wait_event_type, wait_event FROM pg_stat_activity WHERE wait_event i
</para>
<para>
Total amount of time spent writing WAL buffers to disk via
- <function>XLogWrite</function> request, in milliseconds
+ <function>XLogWrite</function> request and WAL data to disk
+ by the WAL receiver process, in milliseconds
(if <xref linkend="guc-track-wal-io-timing"/> is enabled,
otherwise zero). This includes the sync time when
<varname>wal_sync_method</varname> is either
diff --git a/doc/src/sgml/wal.sgml b/doc/src/sgml/wal.sgml
index ae4a3c1293..39e7028c96 100644
--- a/doc/src/sgml/wal.sgml
+++ b/doc/src/sgml/wal.sgml
@@ -769,7 +769,7 @@
</para>
<para>
- There are two internal functions to write WAL data to disk:
+ There are two internal functions to write generated WAL data to disk:
<function>XLogWrite</function> and <function>issue_xlog_fsync</function>.
When <xref linkend="guc-track-wal-io-timing"/> is enabled, the total
amounts of time <function>XLogWrite</function> writes and
@@ -795,7 +795,19 @@
<function>issue_xlog_fsync</function> syncs WAL data to disk are also
counted as <literal>wal_write</literal> and <literal>wal_sync</literal>
in <structname>pg_stat_wal</structname>, respectively.
- </para>
+ To write replicated WAL data to disk by the WAL receiver is almost the same
+ as above except for some points. First, there is a dedicated code path for the
+ WAL receiver to write data although <function>issue_xlog_fsync</function> is
+ the same for syncing data.
+ Second, the WAL receiver writes replicated WAL data per bytes from the local
+ memory although the generated WAL data is written per WAL buffer pages.
+ The counters of <literal>wal_write</literal>, <literal>wal_sync</literal>,
+ <literal>wal_write_time</literal>, and <literal>wal_sync_time</literal> are
+ common statistics for writing/syncing both generated and replicated WAL data.
+ But, you can distinguish them because the generated WAL data is written/synced
+ in the primary server and the replicated WAL data is written/synced in
+ the standby server.
+ </para>
</sect1>
<sect1 id="wal-internals">
diff --git a/src/backend/replication/walreceiver.c b/src/backend/replication/walreceiver.c
index a7a94d2a83..df028c5039 100644
--- a/src/backend/replication/walreceiver.c
+++ b/src/backend/replication/walreceiver.c
@@ -771,6 +771,9 @@ WalRcvDie(int code, Datum arg)
/* Ensure that all WAL records received are flushed to disk */
XLogWalRcvFlush(true);
+ /* Send WAL statistics to the stats collector before terminating */
+ pgstat_send_wal(true);
+
/* Mark ourselves inactive in shared memory */
SpinLockAcquire(&walrcv->mutex);
Assert(walrcv->walRcvState == WALRCV_STREAMING ||
@@ -910,6 +913,12 @@ XLogWalRcvWrite(char *buf, Size nbytes, XLogRecPtr recptr)
XLogArchiveForceDone(xlogfname);
else
XLogArchiveNotify(xlogfname);
+
+ /*
+ * Send WAL statistics to the stats collector when finishing
+ * the current WAL segment file to avoid overloading it.
+ */
+ pgstat_send_wal(false);
}
recvFile = -1;