On 8/29/25 01:57, Peter Geoghegan wrote:
> On Thu, Aug 28, 2025 at 7:52 PM Tomas Vondra <to...@vondra.me> wrote:
>> Use this branch:
>>
>> https://github.com/tvondra/postgres/commits/index-prefetch-master/
>>
>> and then Thomas' patch that increases the prefetch distance:
>>
>>
>> https://www.postgresql.org/message-id/CA%2BhUKGL2PhFyDoqrHefqasOnaXhSg48t1phs3VM8BAdrZqKZkw%40mail.gmail.com
>>
>> (IIRC there's a trivial conflict in read_stream_reset.).
>
> I found it quite hard to apply Thomas' patch. There's actually 3
> patches, with 2 earlier patches needed for earlier in the thread. And,
> there were significant merge conflicts to work around.
>

I don't think the 2 earlier patches are needed; I only ever applied the
one in the linked message. But you're right that there were more merge
conflicts, I forgot about that. Here's a patch that should apply on top
of the prefetch branch.

> I'm not sure that Thomas'/your patch to ameliorate the problem on the
> read stream side is essential here. Perhaps Andres can just take a
> look at the test case + feature branch, without the extra patches.
> That way he'll be able to see whatever the immediate problem is, which
> might be all we need.
>

AFAICS Andres was interested in reproducing the regression with an
increased distance. Or maybe I got it wrong.


regards

-- 
Tomas Vondra

From 04d2cb5149c2e7e211b8efb0cdd1b3d2a67e97b9 Mon Sep 17 00:00:00 2001
From: Tomas Vondra <to...@vondra.me>
Date: Fri, 29 Aug 2025 02:32:54 +0200
Subject: [PATCH vmunro] aio: Improve read_stream.c look-ahead heuristics C

Previously we would reduce the look-ahead distance by one every time we
got a cache hit, which sometimes performed poorly with mixed hit/miss
patterns, especially if it was trapped at one.

Instead, sustain the current distance until we've seen evidence that
there is no window big enough to span the gap between rare IOs. In
other words, we now use information from a much larger window to
estimate the utility of looking far ahead.

XXX Highly experimental!
---
 src/backend/storage/aio/read_stream.c | 35 ++++++++++++++++++---------
 1 file changed, 23 insertions(+), 12 deletions(-)

diff --git a/src/backend/storage/aio/read_stream.c b/src/backend/storage/aio/read_stream.c
index 8051745c232..7b009d65f8a 100644
--- a/src/backend/storage/aio/read_stream.c
+++ b/src/backend/storage/aio/read_stream.c
@@ -99,6 +99,7 @@ struct ReadStream
 	int16		forwarded_buffers;
 	int16		pinned_buffers;
 	int16		distance;
+	int16		distance_sustain;
 	int16		distance_old;
 	int16		initialized_buffers;
 	int			read_buffers_flags;
@@ -398,22 +399,36 @@ read_stream_start_pending_read(ReadStream *stream)
 	/* Remember whether we need to wait before returning this buffer. */
 	if (!need_wait)
 	{
-		/* Look-ahead distance decays, no I/O necessary. */
-		if (stream->distance > 1)
+		/*
+		 * Look-ahead distance decays if we haven't had any cache misses in a
+		 * hypothetical window of recent accesses.
+		 */
+		if (stream->distance_sustain > 0)
+			stream->distance_sustain--;
+		else if (stream->distance > 1)
 			stream->distance--;
 	}
 	else
 	{
-		/*
-		 * Remember to call WaitReadBuffers() before returning head buffer.
-		 * Look-ahead distance will be adjusted after waiting.
-		 */
+		/* Remember to call WaitReadBuffers() before returning head buffer. */
 		stream->ios[io_index].buffer_index = buffer_index;
 		if (++stream->next_io_index == stream->max_ios)
 			stream->next_io_index = 0;
 		Assert(stream->ios_in_progress < stream->max_ios);
 		stream->ios_in_progress++;
 		stream->seq_blocknum = stream->pending_read_blocknum + nblocks;
+
+		/* Look-ahead distance doubles. */
+		if (stream->distance > stream->max_pinned_buffers - stream->distance)
+			stream->distance = stream->max_pinned_buffers;
+		else
+			stream->distance += stream->distance;
+
+		/*
+		 * Don't let the distance begin to decay until we've seen no IOs over
+		 * a hypothetical window of the maximum possible size.
+		 */
+		stream->distance_sustain = stream->max_pinned_buffers;
 	}
 
 	/*
@@ -963,7 +978,6 @@ read_stream_next_buffer(ReadStream *stream, void **per_buffer_data)
 		stream->ios[stream->oldest_io_index].buffer_index == oldest_buffer_index)
 	{
 		int16		io_index = stream->oldest_io_index;
-		int32		distance;	/* wider temporary value, clamped below */
 
 		/* Sanity check that we still agree on the buffers. */
 		Assert(stream->ios[io_index].op.buffers ==
@@ -976,11 +990,6 @@ read_stream_next_buffer(ReadStream *stream, void **per_buffer_data)
 		if (++stream->oldest_io_index == stream->max_ios)
 			stream->oldest_io_index = 0;
 
-		/* Look-ahead distance ramps up rapidly after we do I/O. */
-		distance = stream->distance * 2;
-		distance = Min(distance, stream->max_pinned_buffers);
-		stream->distance = distance;
-
 		/*
 		 * If we've reached the first block of a sequential region we're
 		 * issuing advice for, cancel that until the next jump. The kernel
@@ -1138,6 +1147,8 @@ read_stream_reset(ReadStream *stream)
 	stream->distance = Max(1, stream->distance_old);
 	stream->distance_old = 0;
 
+	stream->distance_sustain = 0;
+
 	/* track the number of resets */
 	stream->reset_count += 1;
 }
-- 
2.51.0
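
For anyone reading the patch without the branch checked out, here is a
rough standalone model of the heuristic. It is not code from the tree:
the struct SketchStream and the helpers sketch_init(), sketch_access()
and sketch_run_periodic() are made-up names for illustration, and the
model ignores pinning, I/O combining and everything else read_stream.c
does. It also applies the old doubling at miss time rather than after
waiting for the I/O, which is a simplification. The point is only to
compare the old decay-on-every-hit rule with the sustain rule on a
stream that has one miss every N accesses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the three ReadStream fields the patch uses. */
typedef struct SketchStream
{
	int16_t		distance;
	int16_t		distance_sustain;
	int16_t		max_pinned_buffers;
} SketchStream;

void
sketch_init(SketchStream *s, int16_t max_pinned_buffers)
{
	s->distance = 1;
	s->distance_sustain = 0;
	s->max_pinned_buffers = max_pinned_buffers;
}

/*
 * Apply one block access.  need_io = true models a cache miss.
 * use_sustain = false models the old rule (decay by one on every hit).
 */
void
sketch_access(SketchStream *s, bool need_io, bool use_sustain)
{
	if (!need_io)
	{
		/* Cache hit: burn the sustain budget first, then decay. */
		if (use_sustain && s->distance_sustain > 0)
			s->distance_sustain--;
		else if (s->distance > 1)
			s->distance--;
	}
	else
	{
		/* Cache miss: double the distance, clamped to the maximum. */
		if (s->distance > s->max_pinned_buffers - s->distance)
			s->distance = s->max_pinned_buffers;
		else
			s->distance += s->distance;

		/* Refill the sustain budget to a full window. */
		if (use_sustain)
			s->distance_sustain = s->max_pinned_buffers;
	}
}

/* Run naccesses accesses with a miss whenever i % miss_period == 0. */
int16_t
sketch_run_periodic(int16_t max_pinned_buffers, int naccesses,
					int miss_period, bool use_sustain)
{
	SketchStream s;

	sketch_init(&s, max_pinned_buffers);
	for (int i = 0; i < naccesses; i++)
		sketch_access(&s, i % miss_period == 0, use_sustain);
	return s.distance;
}
```

Under these assumptions, with max_pinned_buffers = 16 and one miss
every 8 accesses, sketch_run_periodic(16, 64, 8, true) ends with the
distance at 16, while the old rule (use_sustain = false) ends with it
back at 1, which is the "trapped at one" case from the commit message.
With a single miss followed only by hits, the sustain variant still
decays to 1 once the window has passed.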