Hi Andres

Thanks for the AIO patch update. I gave it a try and ran into a FATAL
in bgwriter when executing a benchmark.

2024-09-12 01:38:00.851 PDT [2780939] PANIC:  no more bbs
2024-09-12 01:38:00.854 PDT [2780473] LOG:  background writer process
(PID 2780939) was terminated by signal 6: Aborted
2024-09-12 01:38:00.854 PDT [2780473] LOG:  terminating any other
active server processes

I debugged a bit and found that BgBufferSync() is not capping the
batch size under io_bounce_buffers like BufferSync() for checkpoint.
Here is a small patch to fix it.

Best regards
Robert




On Fri, Sep 6, 2024 at 12:47 PM Andres Freund <and...@anarazel.de> wrote:
>
> Hi,
>
> On 2024-09-05 01:37:34 +0800, 陈宗志 wrote:
> > I hope there can be a high-level design document that includes a
> > description, high-level architecture, and low-level design.
> > This way, others can also participate in reviewing the code.
>
> Yep, that was already on my todo list. The version I just posted includes
> that.
>
>
> > For example, which paths were modified in the AIO module?
> > Is it the path for writing WAL logs, or the path for flushing pages, etc.?
>
> I don't think it's good to document this in a design document - that's just
> bound to get out of date.
>
> For now the patchset causes AIO to be used for
>
> 1) all users of read_stream.h, e.g. sequential scans
>
> 2) bgwriter / checkpointer, mainly to have way to exercise the write path. As
>    mentioned in my email upthread, the code for that is in a somewhat rough
>    shape as Thomas Munro is working on a more general abstraction for some of
>    this.
>
> The earlier patchset added a lot more AIO uses because I needed to know all
> the design constraints. It e.g. added AIO use in WAL. While that allowed me to
> learn a lot, it's not something that makes sense to continue working on for
> now, as it requires a lot of work that's independent of AIO.  Thus I am
> focusing on the above users for now.
>
>
> > Also, I recommend keeping this patch as small as possible.
>
> Yep. That's my goal (as mentioned upthread).
>
>
> > For example, the first step could be to introduce libaio only, without
> > considering io_uring, as that would make it too complex.
>
> Currently the patchset doesn't contain libaio support and I am not planning to
> work on using libaio. Nor do I think it makes sense for anybody else to do so
> - libaio doesn't work for buffered IO, making it imo not particularly useful
> for us.
>
> The io_uring specific code isn't particularly complex / large compared to the
> main AIO infrastructure.
>
> Greetings,
>
> Andres Freund
>
>
From bd04bd18ce62cf3f88568d3578503d4efeeb6603 Mon Sep 17 00:00:00 2001
From: Robert Pang <robertp...@google.com>
Date: Thu, 12 Sep 2024 14:36:16 -0700
Subject: [PATCH] Fix BgBufferSync to limit batch size under io_bounce_buffers
 for bgwriter.

---
 src/backend/storage/buffer/bufmgr.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/src/backend/storage/buffer/bufmgr.c b/src/backend/storage/buffer/bufmgr.c
index 44b1b6fb31..4cd959b295 100644
--- a/src/backend/storage/buffer/bufmgr.c
+++ b/src/backend/storage/buffer/bufmgr.c
@@ -3396,6 +3396,7 @@ BgBufferSync(IOQueue *ioq, WritebackContext *wb_context)
 	uint32		new_recent_alloc;
 
 	BuffersToWrite to_write;
+	int			max_combine;
 
 	/*
 	 * Find out where the freelist clock sweep currently is, and how many
@@ -3417,6 +3418,8 @@ BgBufferSync(IOQueue *ioq, WritebackContext *wb_context)
 		return true;
 	}
 
+	max_combine = Min(io_bounce_buffers, io_combine_limit);
+
 	/*
 	 * Compute strategy_delta = how many buffers have been scanned by the
 	 * clock sweep since last time.  If first time through, assume none. Then
@@ -3604,7 +3607,7 @@ BgBufferSync(IOQueue *ioq, WritebackContext *wb_context)
 		{
 			Assert(sync_state & BUF_REUSABLE);
 
-			if (to_write.nbuffers == io_combine_limit)
+			if (to_write.nbuffers == max_combine)
 			{
 				WriteBuffers(&to_write, ioq, wb_context);
 			}
-- 
2.46.0.662.g92d0881bb0-goog

Reply via email to