Re: log XLogPrefetch stats at end of recovery

Jakub Wartak Tue, 24 Mar 2026 01:54:14 -0700

On Tue, Mar 24, 2026 at 9:28 AM Lakshmi N <[email protected]> wrote:
>
> Hi all,
>
> Thank you for your feedback. Please see the attached patch.
>
> On Mon, Mar 23, 2026 at 4:30 AM Jakub Wartak <[email protected]> 
> wrote:
>>
>> On Sun, Mar 22, 2026 at 1:43 AM Bharath Rupireddy
>> <[email protected]> wrote:
>>
>> Hi,
>>
>> > On Sat, Mar 21, 2026 at 1:16 AM SATYANARAYANA NARLAPURAM
>> > <[email protected]> wrote:
>> > >
>> > > > While investigating a long recovery, I noticed that XLogPrefetch stats 
>> > > > were not logged at the end of recovery. This log message will be 
>> > > > useful to understand how effective XLogPrefetch was during recovery. 
>> > > > Adding a patch to address this.
>> > >
>> > > Applied this patch and validated the log message. This log message 
>> > > appears to be useful to me, particularly while doing fleet wide analysis.
>> > >
>> > > 2026-03-20 23:33:13.756 PDT [2265441] LOG:  XLogPrefetcher stats: 
>> > > prefetch=14, hit=6, skip_init=5, skip_new=28, skip_fpw=18, skip_rep=996
>> >
>> > This looks useful to understand how the prefetch helped during long 
>> > recoveries.
>> >
>> > > I am wondering if we can periodically log this in standby mode as well, 
>> > > not just before promoting?
>> >
>> > Timer-based startup progress messaging allows logging such things
>> > (ereport_startup_progress API). There was an attempt to enable "redo
>> > in progress" for standbys, but that seemed to flood the standby logs
>> > even at the default progress interval of 10 sec.
>> >
>> > Having said that, the prefetcher stats could be added to the existing
>> > ereport_startup_progress("redo in progress xxx") message that works
>> > for crash recoveries—however, I don't prefer doing a bunch of atomic
>> > reads every progress interval of 10 sec.
>>
>> > Therefore, logging at the end of recovery looks good to me.
>>
>> +1 from me too to only of logging at the end of recovery (so -1 to logging
>> every now and then). If someone is interested in current state (or progress
>> over time) I think he can query pg_stat_recovery_prefetch view already, even
>> today, right?
>>
>> > I reviewed the patch. I have the following comment:
>> >
>> > + elog(LOG, "XLogPrefetcher stats: prefetch=%lu, hit=%lu,
>> > skip_init=%lu, skip_new=%lu, skip_fpw=%lu, skip_rep=%lu",
>> >
>> > XLogPrefetcher is an internal data structure name, how about "redo
>> > prefetch stats: xxxx" to be consistent with other redo log messages?
>>
>> +1
>
>
> please find the attached patch addressing this.


Hi Lakshmi,

The pg_stat_get_recovery_prefetch view has 3 additional columns, but they are
showing current situation, maybe this $patch could be enhanced so that it also
calculates averages for some of them? I mean perhaps not just hit ratio would be
nice, but also the e.g. __average__ wal distance or average look ahead
(block distance).
I'm not sure maybe please also wait for input from others if they find
it a good idea
or not.

Also if we are adding such pretty cryptic fields as "skip_fpw" to log,
we would need some
place in documentaiton to explain their meaining probably.
monitoring.sgml already
explains them, so maybe just link to that. I mean if you look what we
have today (below),
they are little less crypting and have different format, not
"skip_fpw=123" but more
like "17 removed" so maybe "17 skipped due to FPW". Maybe

redo done at 0/75C7B290 system usage: CPU: user: 1.02 s, system: 0.51
s, elapsed: 1.53 s
checkpoint complete: end-of-recovery fast wait: wrote 16289 buffers
(99.4%), wrote 3 SLRU buffers; 0 WAL file(s) added, 17 removed, 0
recycled; write=0.049 s, sync=0.191 s, total=0.264 s; sync files=10,
longest=0.187 s, average=0.020 s; distance=287186 kB, estimate=287186
kB; lsn=0/75C7B2C8, redo lsn=0/75C7B2C8

so instead of like:
redo prefetch stats: prefetch=%lu, hit=%lu, skip_init=%lu,
skip_new=%lu, skip_fpw=%lu, skip_rep=%lu"

something like below ones:
redo prefetch stats: done %lu prefetches, %lu hit, %lu zero-initated, ..
redo prefetch stats: done %lu prefetches, (%d% hit ratio), %lu
zero-initated, .. or something like that

-J.

Re: log XLogPrefetch stats at end of recovery

Reply via email to