On Thu, Mar 4, 2021 at 12:14 PM Masahiro Ikeda <ikeda...@oss.nttdata.com> wrote:
> On 2021-03-03 20:27, Masahiro Ikeda wrote: > > On 2021-03-03 16:30, Fujii Masao wrote: > >> On 2021/03/03 14:33, Masahiro Ikeda wrote: > >>> On 2021-02-24 16:14, Fujii Masao wrote: > >>>> On 2021/02/15 11:59, Masahiro Ikeda wrote: > >>>>> On 2021-02-10 00:51, David G. Johnston wrote: > >>>>>> On Thu, Feb 4, 2021 at 4:45 PM Masahiro Ikeda > >>>>>> <ikeda...@oss.nttdata.com> wrote: > >>>>>> > >>>>>>> I pgindented the patches. > >>>>>> > >>>>>> ... <function>XLogWrite</function>, which is invoked during an > >>>>>> <function>XLogFlush</function> request (see ...). This is also > >>>>>> incremented by the WAL receiver during replication. > >>>>>> > >>>>>> ("which normally called" should be "which is normally called" or > >>>>>> "which normally is called" if you want to keep true to the > >>>>>> original) > >>>>>> You missed the adding the space before an opening parenthesis here > >>>>>> and > >>>>>> elsewhere (probably copy-paste) > >>>>>> > >>>>>> is ether -> is either > >>>>>> "This parameter is off by default as it will repeatedly query the > >>>>>> operating system..." > >>>>>> ", because" -> "as" > >>>>> > >>>>> Thanks, I fixed them. > >>>>> > >>>>>> wal_write_time and the sync items also need the note: "This is > >>>>>> also > >>>>>> incremented by the WAL receiver during replication." > >>>>> > >>>>> I skipped changing it since I separated the stats for the WAL > >>>>> receiver > >>>>> in pg_stat_wal_receiver. > >>>>> > >>>>>> "The number of times it happened..." -> " (the tally of this event > >>>>>> is > >>>>>> reported in wal_buffers_full in....) This is undesirable because > >>>>>> ..." > >>>>> > >>>>> Thanks, I fixed it. > >>>>> > >>>>>> I notice that the patch for WAL receiver doesn't require > >>>>>> explicitly > >>>>>> computing the sync statistics but does require computing the write > >>>>>> statistics. This is because of the presence of issue_xlog_fsync > >>>>>> but > >>>>>> absence of an equivalent pg_xlog_pwrite. Additionally, I observe > >>>>>> that > >>>>>> the XLogWrite code path calls pgstat_report_wait_*() while the WAL > >>>>>> receiver path does not. It seems technically straight-forward to > >>>>>> refactor here to avoid the almost-duplicated logic in the two > >>>>>> places, > >>>>>> though I suspect there may be a trade-off for not adding another > >>>>>> function call to the stack given the importance of WAL processing > >>>>>> (though that seems marginalized compared to the cost of actually > >>>>>> writing the WAL). Or, as Fujii noted, go the other way and don't > >>>>>> have > >>>>>> any shared code between the two but instead implement the WAL > >>>>>> receiver > >>>>>> one to use pg_stat_wal_receiver instead. In either case, this > >>>>>> half-and-half implementation seems undesirable. > >>>>> > >>>>> OK, as Fujii-san mentioned, I separated the WAL receiver stats. > >>>>> (v10-0002-Makes-the-wal-receiver-report-WAL-statistics.patch) > >>>> > >>>> Thanks for updating the patches! > >>>> > >>>> > >>>>> I added the infrastructure code to communicate the WAL receiver > >>>>> stats messages between the WAL receiver and the stats collector, > >>>>> and > >>>>> the stats for WAL receiver is counted in pg_stat_wal_receiver. > >>>>> What do you think? > >>>> > >>>> On second thought, this idea seems not good. Because those stats are > >>>> collected between multiple walreceivers, but other values in > >>>> pg_stat_wal_receiver is only related to the walreceiver process > >>>> running > >>>> at that moment. IOW, it seems strange that some values show dynamic > >>>> stats and the others show collected stats, even though they are in > >>>> the same view pg_stat_wal_receiver. Thought? > >>> > >>> OK, I fixed it. > >>> The stats collected in the WAL receiver is exposed in pg_stat_wal > >>> view in v11 patch. > >> > >> Thanks for updating the patches! I'm now reading 001 patch. > >> > >> + /* Check whether the WAL file was synced to disk right now */ > >> + if (enableFsync && > >> + (sync_method == SYNC_METHOD_FSYNC || > >> + sync_method == SYNC_METHOD_FSYNC_WRITETHROUGH || > >> + sync_method == SYNC_METHOD_FDATASYNC)) > >> + { > >> > >> Isn't it better to make issue_xlog_fsync() return immediately > >> if enableFsync is off, sync_method is open_sync or open_data_sync, > >> to simplify the code more? > > > > Thanks for the comments. > > I added the above code in v12 patch. > > > >> > >> + /* > >> + * Send WAL statistics only if WalWriterDelay has elapsed > to > >> minimize > >> + * the overhead in WAL-writing. > >> + */ > >> + if (rc & WL_TIMEOUT) > >> + pgstat_send_wal(); > >> > >> On second thought, this change means that it always takes > >> wal_writer_delay > >> before walwriter's WAL stats is sent after XLogBackgroundFlush() is > >> called. > >> For example, if wal_writer_delay is set to several seconds, some > >> values in > >> pg_stat_wal would be not up-to-date meaninglessly for those seconds. > >> So I'm thinking to withdraw my previous comment and it's ok to send > >> the stats every after XLogBackgroundFlush() is called. Thought? > > > > Thanks, I didn't notice that. > > > > Although PGSTAT_STAT_INTERVAL is 500msec, wal_writer_delay's > > default value is 200msec and it may be set shorter time. > > > > Why don't to make another way to check the timestamp? > > > > + /* > > + * Don't send a message unless it's been at least > > PGSTAT_STAT_INTERVAL > > + * msec since we last sent one > > + */ > > + now = GetCurrentTimestamp(); > > + if (TimestampDifferenceExceeds(last_report, now, > > PGSTAT_STAT_INTERVAL)) > > + { > > + pgstat_send_wal(); > > + last_report = now; > > + } > > + > > > > Although I worried that it's better to add the check code in > > pgstat_send_wal(), > > I didn't do so because to avoid to double check PGSTAT_STAT_INTERVAL. > > pgstat_send_wal() is invoked pg_report_stat() and it already checks the > > PGSTAT_STAT_INTERVAL. > > I forgot to remove an unused variable. > The attached v13 patch is fixed. > > Regards > -- > Masahiro Ikeda > NTT DATA CORPORATION This patch set no longer applies http://cfbot.cputube.org/patch_32_2859.log Can we get a rebase? I am marking the patch "Waiting on Author" -- Ibrar Ahmed