Re: BM_IO_ERROR flag is lost in TerminateBufferIO due to order of operations in UnlockBufHdrExt

2026-06-02 Thread Andres Freund
Hi,

On 2026-06-02 12:15:40 -0400, Melanie Plageman wrote:
> On Mon, Apr 6, 2026 at 4:21 AM Michael Paquier  wrote:
> >
> > On Mon, Mar 30, 2026 at 11:25:55AM +0300, Yura Sokolov wrote:
> > > Have you any recommendations for patch?
> > >
> > > Should I create CF item for patch?
> >
> > The business with unset_flag_bits is new as of HEAD, so this could
> > qualify as an open item, to be tracked here:
> > https://wiki.postgresql.org/wiki/PostgreSQL_19_Open_Items
> 
> There hasn't been recent activity on this thread and it is one of the
> oldest open items. Even if we don't have time to review the test right
> now, it is probably worth pushing the fix. Though we have just missed
> beta1...

Ugh. I had lost track of this.  I'll get back to it as soon as I have squared
the CI stuff away.

Greetings,

Andres Freund




Re: BM_IO_ERROR flag is lost in TerminateBufferIO due to order of operations in UnlockBufHdrExt

2026-06-02 Thread Melanie Plageman
On Mon, Apr 6, 2026 at 4:21 AM Michael Paquier  wrote:
>
> On Mon, Mar 30, 2026 at 11:25:55AM +0300, Yura Sokolov wrote:
> > Have you any recommendations for patch?
> >
> > Should I create CF item for patch?
>
> The business with unset_flag_bits is new as of HEAD, so this could
> qualify as an open item, to be tracked here:
> https://wiki.postgresql.org/wiki/PostgreSQL_19_Open_Items

There hasn't been recent activity on this thread and it is one of the
oldest open items. Even if we don't have time to review the test right
now, it is probably worth pushing the fix. Though we have just missed
beta1...

- Melanie




Re: BM_IO_ERROR flag is lost in TerminateBufferIO due to order of operations in UnlockBufHdrExt

2026-04-06 Thread Michael Paquier
On Mon, Mar 30, 2026 at 11:25:55AM +0300, Yura Sokolov wrote:
> Have you any recommendations for patch?
> 
> Should I create CF item for patch?

The business with unset_flag_bits is new as of HEAD, so this could
qualify as an open item, to be tracked here:
https://wiki.postgresql.org/wiki/PostgreSQL_19_Open_Items
--
Michael


signature.asc
Description: PGP signature


Re: BM_IO_ERROR flag is lost in TerminateBufferIO due to order of operations in UnlockBufHdrExt

2026-03-31 Thread Yura Sokolov
27.03.2026 21:30, Yura Sokolov пишет:
> Good day,
> 
> 27.03.2026 12:04, Yura Sokolov wrote:
>> 26.03.2026 23:29, Andres Freund wrote:
>>> Hi,
>>>
>>> On 2026-03-25 17:51:30 +0300, Yura Sokolov wrote:
 UnlockBufHdrExt does:

buf_state |= set_bits;
buf_state &= ~unset_bits;
buf_state &= ~BM_LOCKED;

 TerminateBufferIO unconditionally does:

unset_flag_bits |= BM_IO_ERROR;

 Due to this, AbortBufferIO and buffer_readv_complete_one are failed
 to set BM_IO_ERROR with call to TerminateBufferIO.

 It was found with proprietary code that was triggered on
 PGAIO_RS_ERROR and made assertion on BM_IO_ERROR presence.
>>>
>>> That's clearly not right.  Care to write a patch?  I think we should add a
>>> test for this in src/test/modules/test_aio too. As we don't rely on things
>>> like BM_IO_ERROR this is quite easy to not notice.
>> I thought, fix is too small to go the way of patches.
>> I don't mind if you just push it.
>>
>> But if I can save your time on writing test, I'll try.
>>
>> ...
>>
>> I believe, proper way is to add assertion in UnlockBufHdrExt:
>>
>> Assert(!(set_bits & unset_bits));
>>
>> And make TerminateBufferIO to exclude BM_IO_ERROR if it is among
>> set_flag_bits.
>>
>> Is it ok?
> 
> Here patchset is.
> 
> I've tried to modify 001_aio.pl to test presence BM_IO_ERROR after read
> failure.
> 
> I've also added FlushBuffer failure test in second patch (v1-002) using
> injection point. I don't know if 001-aio.pl is a good place for since write
> is not async yet.
> 
> v1-003 just mades DebugPrintBufferRefcount prettier.

rebased.

-- 
regards
Yura Sokolov aka funny-falconFrom 5cbaa035027d4f6c57d561ca4f9c5161f7a49471 Mon Sep 17 00:00:00 2001
From: Yura Sokolov 
Date: Fri, 27 Mar 2026 20:11:06 +0300
Subject: [PATCH v2 1/3] bufmgr: Fix possibility to set BM_IO_ERROR

Previously it couldn't be set because TerminateBufferIO added BM_IO_ERROR
to unset_flag_bits unconditionally, and UnlockBufHdrExt applied unset_bits
after set_bits.

Fix by not setting BM_IO_ERROR into unset_flag_bits if it is present in
set_flag_bits.

Also protect from possible similar errors by adding assertion to
UnlockBufHdrExt unset_bits and set_bits have no bits in common.

Modify src/test/modules/test_aio/t/001_aio.pl test_io_error to check
presence of BM_IO_ERROR.
---
 src/backend/storage/buffer/bufmgr.c |  4 +--
 src/include/storage/buf_internals.h |  1 +
 src/test/modules/test_aio/t/001_aio.pl  | 20 ++
 src/test/modules/test_aio/test_aio--1.0.sql |  4 +++
 src/test/modules/test_aio/test_aio.c| 30 +
 5 files changed, 57 insertions(+), 2 deletions(-)

diff --git a/src/backend/storage/buffer/bufmgr.c b/src/backend/storage/buffer/bufmgr.c
index cd21ae3fc36..a81949aca7c 100644
--- a/src/backend/storage/buffer/bufmgr.c
+++ b/src/backend/storage/buffer/bufmgr.c
@@ -7347,8 +7347,8 @@ TerminateBufferIO(BufferDesc *buf, bool clear_dirty, uint64 set_flag_bits,
 	Assert(buf_state & BM_IO_IN_PROGRESS);
 	unset_flag_bits |= BM_IO_IN_PROGRESS;
 
-	/* Clear earlier errors, if this IO failed, it'll be marked again */
-	unset_flag_bits |= BM_IO_ERROR;
+	/* Clear earlier errors, unless this IO failed as well */
+	unset_flag_bits |= BM_IO_ERROR & ~set_flag_bits;
 
 	if (clear_dirty)
 		unset_flag_bits |= BM_DIRTY | BM_CHECKPOINT_NEEDED;
diff --git a/src/include/storage/buf_internals.h b/src/include/storage/buf_internals.h
index ad1b7b2216a..93887cea46d 100644
--- a/src/include/storage/buf_internals.h
+++ b/src/include/storage/buf_internals.h
@@ -475,6 +475,7 @@ UnlockBufHdrExt(BufferDesc *desc, uint64 old_buf_state,
 uint64 set_bits, uint64 unset_bits,
 int refcount_change)
 {
+	Assert(!(set_bits & unset_bits));
 	for (;;)
 	{
 		uint64		buf_state = old_buf_state;
diff --git a/src/test/modules/test_aio/t/001_aio.pl b/src/test/modules/test_aio/t/001_aio.pl
index 63cadd64c15..a48b0ddfba8 100644
--- a/src/test/modules/test_aio/t/001_aio.pl
+++ b/src/test/modules/test_aio/t/001_aio.pl
@@ -344,6 +344,8 @@ SELECT modify_rel_block('tmp_corr', 1, corrupt_header=>true);
 		  $tblname eq 'tbl_corr'
 		  ? qr/invalid page in block 1 of relation "base\/\d+\/\d+/
 		  : qr/invalid page in block 1 of relation "base\/\d+\/t\d+_\d+/;
+		# BM_IO_ERROR and BM_TAG_VALID should be set
+		my $debug_print_re = qr/blockNum=1, flags=0x.?a00, refcount=0/;
 
 		# verify the error is reported in custom C code
 		psql_like(
@@ -354,6 +356,12 @@ SELECT modify_rel_block('tmp_corr', 1, corrupt_header=>true);
 			qr/^$/,
 			$invalid_page_re);
 
+		psql_like(
+			$io_method, $psql,
+			"validate flags of $tblname page after read_rel_block_ll()",
+			qq(SELECT debug_print_rel_block('$tblname', 1)),
+			$debug_print_re, qr/^$/);
+
 		# verify the error is reported for bufmgr reads, seq scan
 		psql_like(
 			$io_method, $psql,
@@ -361,6 +369,12 @@ SELECT modify_rel_block('tmp_corr', 1, corrupt_header=>true);
 			qq(SELECT count(*) 

Re: BM_IO_ERROR flag is lost in TerminateBufferIO due to order of operations in UnlockBufHdrExt

2026-03-30 Thread Yura Sokolov
27.03.2026 21:30, Yura Sokolov wrote:
>> 26.03.2026 23:29, Andres Freund wrote:
>>>
>>> That's clearly not right.  Care to write a patch?
> 
> Here patchset is.
Good day, Andres.

Have you any recommendations for patch?

Should I create CF item for patch?

-- 
regards
Yura Sokolov aka funny-falcon




Re: BM_IO_ERROR flag is lost in TerminateBufferIO due to order of operations in UnlockBufHdrExt

2026-03-27 Thread Yura Sokolov
Good day,

27.03.2026 12:04, Yura Sokolov wrote:
> 26.03.2026 23:29, Andres Freund wrote:
>> Hi,
>>
>> On 2026-03-25 17:51:30 +0300, Yura Sokolov wrote:
>>> UnlockBufHdrExt does:
>>>
>>>buf_state |= set_bits;
>>>buf_state &= ~unset_bits;
>>>buf_state &= ~BM_LOCKED;
>>>
>>> TerminateBufferIO unconditionally does:
>>>
>>>unset_flag_bits |= BM_IO_ERROR;
>>>
>>> Due to this, AbortBufferIO and buffer_readv_complete_one are failed
>>> to set BM_IO_ERROR with call to TerminateBufferIO.
>>>
>>> It was found with proprietary code that was triggered on
>>> PGAIO_RS_ERROR and made assertion on BM_IO_ERROR presence.
>>
>> That's clearly not right.  Care to write a patch?  I think we should add a
>> test for this in src/test/modules/test_aio too. As we don't rely on things
>> like BM_IO_ERROR this is quite easy to not notice.
> I thought, fix is too small to go the way of patches.
> I don't mind if you just push it.
> 
> But if I can save your time on writing test, I'll try.
> 
> ...
> 
> I believe, proper way is to add assertion in UnlockBufHdrExt:
> 
> Assert(!(set_bits & unset_bits));
> 
> And make TerminateBufferIO to exclude BM_IO_ERROR if it is among
> set_flag_bits.
> 
> Is it ok?

Here patchset is.

I've tried to modify 001_aio.pl to test presence BM_IO_ERROR after read
failure.

I've also added FlushBuffer failure test in second patch (v1-002) using
injection point. I don't know if 001-aio.pl is a good place for since write
is not async yet.

v1-003 just mades DebugPrintBufferRefcount prettier.

-- 
regards
Yura Sokolov aka funny-falconFrom ef62af742c30f7c3e0c0d0dad0504d18a13596f9 Mon Sep 17 00:00:00 2001
From: Yura Sokolov 
Date: Fri, 27 Mar 2026 20:11:06 +0300
Subject: [PATCH v1 1/3] bufmgr: Fix possibility to set BM_IO_ERROR

Previously it couldn't be set because TerminateBufferIO added BM_IO_ERROR
to unset_flag_bits unconditionally, and UnlockBufHdrExt applied unset_bits
after set_bits.

Fix by not setting BM_IO_ERROR into unset_flag_bits if it is present in
set_flag_bits.

Also protect from possible similar errors by adding assertion to
UnlockBufHdrExt unset_bits and set_bits have no bits in common.

Modify src/test/modules/test_aio/t/001_aio.pl test_io_error to check
presence of BM_IO_ERROR.
---
 src/backend/storage/buffer/bufmgr.c |  4 +--
 src/include/storage/buf_internals.h |  1 +
 src/test/modules/test_aio/t/001_aio.pl  | 20 ++
 src/test/modules/test_aio/test_aio--1.0.sql |  4 +++
 src/test/modules/test_aio/test_aio.c| 30 +
 5 files changed, 57 insertions(+), 2 deletions(-)

diff --git a/src/backend/storage/buffer/bufmgr.c b/src/backend/storage/buffer/bufmgr.c
index e212f6110f2..16b043cd681 100644
--- a/src/backend/storage/buffer/bufmgr.c
+++ b/src/backend/storage/buffer/bufmgr.c
@@ -7172,8 +7172,8 @@ TerminateBufferIO(BufferDesc *buf, bool clear_dirty, uint64 set_flag_bits,
 	Assert(buf_state & BM_IO_IN_PROGRESS);
 	unset_flag_bits |= BM_IO_IN_PROGRESS;
 
-	/* Clear earlier errors, if this IO failed, it'll be marked again */
-	unset_flag_bits |= BM_IO_ERROR;
+	/* Clear earlier errors, unless this IO failed as well */
+	unset_flag_bits |= BM_IO_ERROR & ~set_flag_bits;
 
 	if (clear_dirty)
 		unset_flag_bits |= BM_DIRTY | BM_CHECKPOINT_NEEDED;
diff --git a/src/include/storage/buf_internals.h b/src/include/storage/buf_internals.h
index 8d1e16b5d51..14a36cfe624 100644
--- a/src/include/storage/buf_internals.h
+++ b/src/include/storage/buf_internals.h
@@ -475,6 +475,7 @@ UnlockBufHdrExt(BufferDesc *desc, uint64 old_buf_state,
 uint64 set_bits, uint64 unset_bits,
 int refcount_change)
 {
+	Assert(!(set_bits & unset_bits));
 	for (;;)
 	{
 		uint64		buf_state = old_buf_state;
diff --git a/src/test/modules/test_aio/t/001_aio.pl b/src/test/modules/test_aio/t/001_aio.pl
index e18b2a2b8ae..c352dab5a41 100644
--- a/src/test/modules/test_aio/t/001_aio.pl
+++ b/src/test/modules/test_aio/t/001_aio.pl
@@ -328,6 +328,8 @@ SELECT modify_rel_block('tmp_corr', 1, corrupt_header=>true);
 		  $tblname eq 'tbl_corr'
 		  ? qr/invalid page in block 1 of relation "base\/\d+\/\d+/
 		  : qr/invalid page in block 1 of relation "base\/\d+\/t\d+_\d+/;
+		# BM_IO_ERROR and BM_TAG_VALID should be set
+		my $debug_print_re = qr/blockNum=1, flags=0x.?a00, refcount=0/;
 
 		# verify the error is reported in custom C code
 		psql_like(
@@ -338,6 +340,12 @@ SELECT modify_rel_block('tmp_corr', 1, corrupt_header=>true);
 			qr/^$/,
 			$invalid_page_re);
 
+		psql_like(
+			$io_method, $psql,
+			"validate flags of $tblname page after read_rel_block_ll()",
+			qq(SELECT debug_print_rel_block('$tblname', 1)),
+			$debug_print_re, qr/^$/);
+
 		# verify the error is reported for bufmgr reads, seq scan
 		psql_like(
 			$io_method, $psql,
@@ -345,6 +353,12 @@ SELECT modify_rel_block('tmp_corr', 1, corrupt_header=>true);
 			qq(SELECT count(*) FROM $tblname),
 			qr/^$/, $invalid_page_re);
 
+		psql_like(
+			$io_method, $psql,
+			"validate flags of

Re: BM_IO_ERROR flag is lost in TerminateBufferIO due to order of operations in UnlockBufHdrExt

2026-03-27 Thread Yura Sokolov
26.03.2026 23:29, Andres Freund wrote:
> Hi,
> 
> On 2026-03-25 17:51:30 +0300, Yura Sokolov wrote:
>> UnlockBufHdrExt does:
>>
>>buf_state |= set_bits;
>>buf_state &= ~unset_bits;
>>buf_state &= ~BM_LOCKED;
>>
>> TerminateBufferIO unconditionally does:
>>
>>unset_flag_bits |= BM_IO_ERROR;
>>
>> Due to this, AbortBufferIO and buffer_readv_complete_one are failed
>> to set BM_IO_ERROR with call to TerminateBufferIO.
>>
>> It was found with proprietary code that was triggered on
>> PGAIO_RS_ERROR and made assertion on BM_IO_ERROR presence.
> 
> That's clearly not right.  Care to write a patch?  I think we should add a
> test for this in src/test/modules/test_aio too. As we don't rely on things
> like BM_IO_ERROR this is quite easy to not notice.
I thought, fix is too small to go the way of patches.
I don't mind if you just push it.

But if I can save your time on writing test, I'll try.

...

I believe, proper way is to add assertion in UnlockBufHdrExt:

Assert(!(set_bits & unset_bits));

And make TerminateBufferIO to exclude BM_IO_ERROR if it is among
set_flag_bits.

Is it ok?

-- 
regards
Yura Sokolov aka funny-falcon




Re: BM_IO_ERROR flag is lost in TerminateBufferIO due to order of operations in UnlockBufHdrExt

2026-03-26 Thread Andres Freund
Hi,

On 2026-03-25 17:51:30 +0300, Yura Sokolov wrote:
> UnlockBufHdrExt does:
> 
>buf_state |= set_bits;
>buf_state &= ~unset_bits;
>buf_state &= ~BM_LOCKED;
> 
> TerminateBufferIO unconditionally does:
> 
>unset_flag_bits |= BM_IO_ERROR;
> 
> Due to this, AbortBufferIO and buffer_readv_complete_one are failed
> to set BM_IO_ERROR with call to TerminateBufferIO.
> 
> It was found with proprietary code that was triggered on
> PGAIO_RS_ERROR and made assertion on BM_IO_ERROR presence.

That's clearly not right.  Care to write a patch?  I think we should add a
test for this in src/test/modules/test_aio too. As we don't rely on things
like BM_IO_ERROR this is quite easy to not notice.

Greetings,

Andres Freund