Hi Andres Thanks for the AIO patch update. I gave it a try and ran into a FATAL in bgwriter when executing a benchmark.
2024-09-12 01:38:00.851 PDT [2780939] PANIC: no more bbs 2024-09-12 01:38:00.854 PDT [2780473] LOG: background writer process (PID 2780939) was terminated by signal 6: Aborted 2024-09-12 01:38:00.854 PDT [2780473] LOG: terminating any other active server processes I debugged a bit and found that BgBufferSync() is not capping the batch size under io_bounce_buffers like BufferSync() for checkpoint. Here is a small patch to fix it. Best regards Robert On Fri, Sep 6, 2024 at 12:47 PM Andres Freund <and...@anarazel.de> wrote: > > Hi, > > On 2024-09-05 01:37:34 +0800, 陈宗志 wrote: > > I hope there can be a high-level design document that includes a > > description, high-level architecture, and low-level design. > > This way, others can also participate in reviewing the code. > > Yep, that was already on my todo list. The version I just posted includes > that. > > > > For example, which paths were modified in the AIO module? > > Is it the path for writing WAL logs, or the path for flushing pages, etc.? > > I don't think it's good to document this in a design document - that's just > bound to get out of date. > > For now the patchset causes AIO to be used for > > 1) all users of read_stream.h, e.g. sequential scans > > 2) bgwriter / checkpointer, mainly to have way to exercise the write path. As > mentioned in my email upthread, the code for that is in a somewhat rough > shape as Thomas Munro is working on a more general abstraction for some of > this. > > The earlier patchset added a lot more AIO uses because I needed to know all > the design constraints. It e.g. added AIO use in WAL. While that allowed me to > learn a lot, it's not something that makes sense to continue working on for > now, as it requires a lot of work that's independent of AIO. Thus I am > focusing on the above users for now. > > > > Also, I recommend keeping this patch as small as possible. > > Yep. That's my goal (as mentioned upthread). > > > > For example, the first step could be to introduce libaio only, without > > considering io_uring, as that would make it too complex. > > Currently the patchset doesn't contain libaio support and I am not planning to > work on using libaio. Nor do I think it makes sense for anybody else to do so > - libaio doesn't work for buffered IO, making it imo not particularly useful > for us. > > The io_uring specific code isn't particularly complex / large compared to the > main AIO infrastructure. > > Greetings, > > Andres Freund > >
From bd04bd18ce62cf3f88568d3578503d4efeeb6603 Mon Sep 17 00:00:00 2001 From: Robert Pang <robertp...@google.com> Date: Thu, 12 Sep 2024 14:36:16 -0700 Subject: [PATCH] Fix BgBufferSync to limit batch size under io_bounce_buffers for bgwriter. --- src/backend/storage/buffer/bufmgr.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/src/backend/storage/buffer/bufmgr.c b/src/backend/storage/buffer/bufmgr.c index 44b1b6fb31..4cd959b295 100644 --- a/src/backend/storage/buffer/bufmgr.c +++ b/src/backend/storage/buffer/bufmgr.c @@ -3396,6 +3396,7 @@ BgBufferSync(IOQueue *ioq, WritebackContext *wb_context) uint32 new_recent_alloc; BuffersToWrite to_write; + int max_combine; /* * Find out where the freelist clock sweep currently is, and how many @@ -3417,6 +3418,8 @@ BgBufferSync(IOQueue *ioq, WritebackContext *wb_context) return true; } + max_combine = Min(io_bounce_buffers, io_combine_limit); + /* * Compute strategy_delta = how many buffers have been scanned by the * clock sweep since last time. If first time through, assume none. Then @@ -3604,7 +3607,7 @@ BgBufferSync(IOQueue *ioq, WritebackContext *wb_context) { Assert(sync_state & BUF_REUSABLE); - if (to_write.nbuffers == io_combine_limit) + if (to_write.nbuffers == max_combine) { WriteBuffers(&to_write, ioq, wb_context); } -- 2.46.0.662.g92d0881bb0-goog