Some quick comments:
On 12/03/2024 15:02, Thomas Munro wrote:
src/backend/storage/aio/streaming_read.c
src/include/storage/streaming_read.h
Standard file header comments missing.
It would be nice to have a comment at the top of streaming_read.c,
explaining at a high level how the circular buffer, lookahead and all
that works. Maybe even some diagrams.
For example, what is head and what is tail? Before reading the code, I
assumed that 'head' was the next block range to return in
pg_streaming_read_buffer_get_next(). But I think it's actually the other
way round?
/*
* Create a new streaming read object that can be used to perform the
* equivalent of a series of ReadBuffer() calls for one fork of one relation.
* Internally, it generates larger vectored reads where possible by looking
* ahead.
*/
PgStreamingRead *
pg_streaming_read_buffer_alloc(int flags,
void *pgsr_private,
size_t
per_buffer_data_size,
BufferAccessStrategy
strategy,
BufferManagerRelation bmr,
ForkNumber forknum,
PgStreamingReadBufferCB next_block_cb)
I'm not a fan of the name, especially the 'alloc' part. Yeah, most of
the work it does is memory allocation. But I'd suggest something like
'pg_streaming_read_begin' instead.
Do we really need the pg_ prefix in these?
Buffer
pg_streaming_read_buffer_get_next(PgStreamingRead *pgsr, void **per_buffer_data)
Maybe 'pg_streaming_read_next_buffer' or just 'pg_streaming_read_next',
for a shorter name.
/*
* pgsr->ranges is a circular buffer. When it is empty, head == tail.
* When it is full, there is an empty element between head and tail.
Head
* can also be empty (nblocks == 0), therefore we need two extra
elements
* for non-occupied ranges, on top of max_pinned_buffers to allow for
the
* maxmimum possible number of occupied ranges of the smallest possible
* size of one.
*/
size = max_pinned_buffers + 2;
I didn't understand this explanation for why it's + 2.
/*
* Skip the initial ramp-up phase if the caller says we're going to be
* reading the whole relation. This way we start out doing full-sized
* reads.
*/
if (flags & PGSR_FLAG_FULL)
pgsr->distance = Min(MAX_BUFFERS_PER_TRANSFER,
pgsr->max_pinned_buffers);
else
pgsr->distance = 1;
Should this be "Max(MAX_BUFFERS_PER_TRANSFER,
pgsr->max_pinned_buffers)"? max_pinned_buffers cannot be smaller than
MAX_BUFFERS_PER_TRANSFER though, given how it's initialized earlier. So
perhaps just 'pgsr->distance = pgsr->max_pinned_buffers' ?
--
Heikki Linnakangas
Neon (https://neon.tech)