Hi, On Tue, Mar 24, 2026 at 1:53 AM Jakub Wartak <[email protected]> wrote:
> On Tue, Mar 24, 2026 at 9:28 AM Lakshmi N <[email protected]> wrote: > > > > Hi all, > > > > Thank you for your feedback. Please see the attached patch. > > > > On Mon, Mar 23, 2026 at 4:30 AM Jakub Wartak < > [email protected]> wrote: > >> > >> On Sun, Mar 22, 2026 at 1:43 AM Bharath Rupireddy > >> <[email protected]> wrote: > >> > >> Hi, > >> > >> > On Sat, Mar 21, 2026 at 1:16 AM SATYANARAYANA NARLAPURAM > >> > <[email protected]> wrote: > >> > > > >> > > > While investigating a long recovery, I noticed that XLogPrefetch > stats were not logged at the end of recovery. This log message will be > useful to understand how effective XLogPrefetch was during recovery. Adding > a patch to address this. > >> > > > >> > > Applied this patch and validated the log message. This log message > appears to be useful to me, particularly while doing fleet wide analysis. > >> > > > >> > > 2026-03-20 23:33:13.756 PDT [2265441] LOG: XLogPrefetcher stats: > prefetch=14, hit=6, skip_init=5, skip_new=28, skip_fpw=18, skip_rep=996 > >> > > >> > This looks useful to understand how the prefetch helped during long > recoveries. > >> > > >> > > I am wondering if we can periodically log this in standby mode as > well, not just before promoting? > >> > > >> > Timer-based startup progress messaging allows logging such things > >> > (ereport_startup_progress API). There was an attempt to enable "redo > >> > in progress" for standbys, but that seemed to flood the standby logs > >> > even at the default progress interval of 10 sec. > >> > > >> > Having said that, the prefetcher stats could be added to the existing > >> > ereport_startup_progress("redo in progress xxx") message that works > >> > for crash recoveries—however, I don't prefer doing a bunch of atomic > >> > reads every progress interval of 10 sec. > >> > >> > Therefore, logging at the end of recovery looks good to me. > >> > >> +1 from me too to only of logging at the end of recovery (so -1 to > logging > >> every now and then). If someone is interested in current state (or > progress > >> over time) I think he can query pg_stat_recovery_prefetch view already, > even > >> today, right? > >> > >> > I reviewed the patch. I have the following comment: > >> > > >> > + elog(LOG, "XLogPrefetcher stats: prefetch=%lu, hit=%lu, > >> > skip_init=%lu, skip_new=%lu, skip_fpw=%lu, skip_rep=%lu", > >> > > >> > XLogPrefetcher is an internal data structure name, how about "redo > >> > prefetch stats: xxxx" to be consistent with other redo log messages? > >> > >> +1 > > > > > > please find the attached patch addressing this. > > Hi Lakshmi, > > The pg_stat_get_recovery_prefetch view has 3 additional columns, but they > are > showing current situation, maybe this $patch could be enhanced so that it > also > calculates averages for some of them? I mean perhaps not just hit ratio > would be > nice, but also the e.g. __average__ wal distance or average look ahead > (block distance). > I'm not sure maybe please also wait for input from others if they find > it a good idea > or not. > > Also if we are adding such pretty cryptic fields as "skip_fpw" to log, > we would need some > place in documentaiton to explain their meaining probably. > monitoring.sgml already > explains them, so maybe just link to that. I mean if you look what we > have today (below), > they are little less crypting and have different format, not > "skip_fpw=123" but more > like "17 removed" so maybe "17 skipped due to FPW". Maybe > > redo done at 0/75C7B290 system usage: CPU: user: 1.02 s, system: 0.51 > s, elapsed: 1.53 s > checkpoint complete: end-of-recovery fast wait: wrote 16289 buffers > (99.4%), wrote 3 SLRU buffers; 0 WAL file(s) added, 17 removed, 0 > recycled; write=0.049 s, sync=0.191 s, total=0.264 s; sync files=10, > longest=0.187 s, average=0.020 s; distance=287186 kB, estimate=287186 > kB; lsn=0/75C7B2C8, redo lsn=0/75C7B2C8 > > so instead of like: > redo prefetch stats: prefetch=%lu, hit=%lu, skip_init=%lu, > skip_new=%lu, skip_fpw=%lu, skip_rep=%lu" > > something like below ones: > redo prefetch stats: done %lu prefetches, %lu hit, %lu zero-initated, .. > redo prefetch stats: done %lu prefetches, (%d% hit ratio), %lu > zero-initated, .. or something like that > Please find the attached patch with the suggested changes. I referenced [1] to log the message as suggested. 2026-03-24 04:53:15.251 PDT [18898] LOG: redo prefetch stats: prefetched 27 blocks, skipped 22 blocks because they were already in the buffer pool, skipped 17 blocks because they would be zero-initialized, skipped 0 blocks because they didn't exist yet, skipped 28 blocks because a full page image was included in the WAL, skipped 155 blocks because they were already recently prefetched. [1] https://www.postgresql.org/docs/devel/monitoring-stats.html#MONITORING-PG-STAT-RECOVERY-PREFETCH Regards, Lakshmi
0001-log-prefetch-stats-at-end-of-recovery.patch
Description: Binary data
