On Thu, Mar 19, 2026 at 6:22 PM Andres Freund <[email protected]> wrote:
>
> Thinking about it more, I also got worried about the duplicating of
> logic. It's perhaps acceptable with the patches as-is, but we'll soon need
> something very similar for AIO writes. Then we'd end up with like 5 variants,
> because we'd still need the existing StartBufferIO() for some cases where we
> do want to wait (e.g. the edge case in ExtendBufferedRelShared()).
>
> In the attached prototype I replaced your patch introducing
> PrepareHeadBufferReadIO()/ PrepareAdditionalBufferReadIO() with one that
> instead revises StartBufferIO() to have the enum return value you introduced
> and a PgAioWaitRef * argument that callers that would like to asynchronously
> wait for the IO to complete (others pass NULL). There are some other cleanups
> in it too, see the commit message for more details.
I've come around to this. The aspect I like least is that io_wref is
used both as an output parameter _and_ a decision input parameter.
Some callers never want to wait on in-progress IO (and don't have to),
while others must eventually wait but can defer that waiting as long
as they have a wait reference. If they can't get a wait reference,
they have no way to wait later, so they must wait now. The presence of
io_wref indicates this difference.
I think it's important to express that less mechanically than in your
current header comment and comment in the else block of
StartSharedBufferIO() where we do the waiting. Explaining first—before
detailing argument combinations—why a caller would want to pass
io_wref might help.
However, I do think you need to enumerate the different combinations
of wait and io_wref (as you've done) to make it clear what they are.
I, for example, find it very confusing what wait == false and io_wref
not NULL would mean. If IO is in progress on the buffer and the
io_wref is not valid yet, the caller would get the expected
BUFFER_IO_IN_PROGRESS return value but io_wref won't be set. I could
see callers easily misinterpreting the API and passing this
combination when what they want is wait == true and io_wref not NULL
-- because they don't want to synchronously wait.
I don't have any good suggestions despite thinking about it, though.
Two other things about 0007:
for (int i = nblocks_done + 1; i < operation->nblocks; i++)
{
/* Must be consecutive block numbers. */
Assert(BufferGetBlockNumber(buffers[i - 1]) ==
BufferGetBlockNumber(buffers[i]) - 1);
status = StartBufferIO(buffers[nblocks_done], true, false, NULL);
Copy-paste bug above, should be StartBufferIO(buffers[i],...
I would mention that currently BUFFER_IO_IN_PROGRESS is not used in
the first StartBufferIO() case, so that is dead code as of this commit
> I also updated "Restructure AsyncReadBuffers()" to move
> pgstat_prepare_report_checksum_failure() and the computation of flags to
> before the ReadBuffersCanStartIO(). And added a comment explaining why little
> should be added between the ReadBuffersCanStartIO() calls.
>
> Thoughts?
Yea, definitely think the comment is important.
- Melanie