On 2021-May-13, Tom Lane wrote:
> BTW, another nasty thing I discovered while testing this is that
> the CHECK_FOR_INTERRUPTS() at line 2146 is useless, because
> we're holding a buffer lock there so InterruptHoldoffCount > 0.
> So once you get into this loop you can't even cancel the query.
> Seems like that needs a fix, too.

This comment made me remember a patch I've had for a while, which splits
the CHECK_FOR_INTERRUPTS() definition in two. One of the pieces is
INTERRUPTS_PENDING_CONDITION(), which lets us test the condition
separately; that allows the lock we hold to be released before
actually processing the interrupts.
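
To make the idea concrete, here is a toy, standalone sketch (not the
patch itself). The globals and ProcessInterrupts() are simplified
stand-ins for the backend's; toy_lock(), toy_unlock(),
interrupts_serviced, check_while_locked() and release_then_process()
are hypothetical names made up for the illustration. It assumes, as in
the backend, that taking a buffer lock bumps InterruptHoldoffCount and
that ProcessInterrupts() does nothing while that count is nonzero.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins for the backend globals; not the real definitions. */
static volatile bool InterruptPending = false;
static volatile unsigned InterruptHoldoffCount = 0;
static int interrupts_serviced = 0;	/* hypothetical counter for this demo */

/* Like the real function, this is a no-op while interrupts are held off. */
void
ProcessInterrupts(void)
{
	if (InterruptHoldoffCount != 0)
		return;
	InterruptPending = false;
	interrupts_serviced++;		/* the real code would ereport(ERROR) */
}

#define INTERRUPTS_PENDING_CONDITION() (InterruptPending)

#define CHECK_FOR_INTERRUPTS() \
do { \
	if (INTERRUPTS_PENDING_CONDITION()) \
		ProcessInterrupts(); \
} while (0)

/* Hypothetical lock helpers: holding the lock holds off interrupts. */
static void
toy_lock(void)
{
	InterruptHoldoffCount++;
}

static void
toy_unlock(void)
{
	assert(InterruptHoldoffCount > 0);
	InterruptHoldoffCount--;
}

/* Old pattern: check while the lock is held.  Returns true if serviced. */
bool
check_while_locked(void)
{
	int			before = interrupts_serviced;

	toy_lock();
	CHECK_FOR_INTERRUPTS();		/* useless here: holdoff count is > 0 */
	toy_unlock();
	return interrupts_serviced > before;
}

/* New pattern: test the condition, drop the lock, then process. */
bool
release_then_process(void)
{
	int			before = interrupts_serviced;

	toy_lock();
	if (INTERRUPTS_PENDING_CONDITION())
	{
		toy_unlock();
		ProcessInterrupts();
		return interrupts_serviced > before;
	}
	toy_unlock();
	return false;
}
```

With InterruptPending set, check_while_locked() leaves the interrupt
unserviced, whereas release_then_process() services it because the
holdoff count is back to zero by the time ProcessInterrupts() runs.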

The btree code modified here was found to be an actual problem in
production: when a btree is corrupted in a certain way, vacuum can get
stuck in an infinite loop. I don't remember the exact details, but I
think we saw vacuum running for a couple of weeks, and we had to restart
the server in order to terminate it (since it wouldn't respond to
signals).
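
As a rough illustration of that failure mode (the real corruption is
not something I can reconstruct, so the cycle in the sibling links
below is purely an assumption), here is a toy loop that only terminates
because it tests the pending flag on each iteration. right_link,
walk_right() and the cancel_after parameter are hypothetical names;
cancel_after just simulates a cancel signal arriving after that many
steps.

```c
#include <assert.h>
#include <stdbool.h>

#define NPAGES 3

/* Hypothetical toy pages: right_link[i] is the right sibling of page i. */
static const int right_link[NPAGES] = {1, 2, 0};	/* assumed cycle */

static volatile bool InterruptPending = false;

/*
 * Walk right from "start" until "target" is found.  Returns the number
 * of steps taken, or -1 if the walk was cancelled.
 */
int
walk_right(int start, int target, int cancel_after)
{
	int			page = start;
	int			steps = 0;

	while (page != target)
	{
		assert(page >= 0 && page < NPAGES);

		/* Simulate a cancel request arriving mid-loop. */
		if (steps == cancel_after)
			InterruptPending = true;

		if (InterruptPending)
		{
			/*
			 * The real code would release its buffer lock here and then
			 * call ProcessInterrupts(); the toy version just bails out.
			 */
			InterruptPending = false;
			return -1;
		}

		page = right_link[page];
		steps++;
	}
	return steps;
}
```

Looking for a page that is in the chain ends normally; looking for one
that is not in the chain would spin forever on the cycle if the loop
did not test the flag itself.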
--
Álvaro Herrera Valdivia, Chile
"I am amazed at [the pgsql-sql] mailing list for the wonderful support, and
lack of hesitasion in answering a lost soul's question, I just wished the rest
of the mailing list could be like this." (Fotis)
(http://archives.postgresql.org/pgsql-sql/2006-06/msg00265.php)
From 5a008141f135bef5ba933b1e3b65c457f58ad85a Mon Sep 17 00:00:00 2001
From: Alvaro Herrera <[email protected]>
Date: Thu, 13 May 2021 11:41:19 -0400
Subject: [PATCH] Split CHECK_FOR_INTERRUPTS
This allows the condition to be checked even while interrupts are held
off, so that we can leave that state (e.g. by releasing some lock we
know we're holding) in order to process them.
---
src/backend/access/nbtree/nbtpage.c | 7 +++++++
src/include/miscadmin.h | 20 ++++++++------------
2 files changed, 15 insertions(+), 12 deletions(-)
diff --git a/src/backend/access/nbtree/nbtpage.c b/src/backend/access/nbtree/nbtpage.c
index ebec8fa5b8..00de713035 100644
--- a/src/backend/access/nbtree/nbtpage.c
+++ b/src/backend/access/nbtree/nbtpage.c
@@ -2397,6 +2397,13 @@ _bt_unlink_halfdead_page(Relation rel, Buffer leafbuf, BlockNumber scanblkno,
{
bool leftsibvalid = true;
+ if (INTERRUPTS_PENDING_CONDITION())
+ {
+ _bt_relbuf(rel, leafbuf);
+ ProcessInterrupts();
+ return false; /* should not occur */
+ }
+
/*
* Before we follow the link from the page that was the left
* sibling mere moments ago, validate its right link. This
diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h
index 95202d37af..c5c441c2e9 100644
--- a/src/include/miscadmin.h
+++ b/src/include/miscadmin.h
@@ -98,23 +98,19 @@ extern PGDLLIMPORT volatile uint32 CritSectionCount;
extern void ProcessInterrupts(void);
#ifndef WIN32
+#define INTERRUPTS_PENDING_CONDITION() \
+ (unlikely(InterruptPending))
+#else
+#define INTERRUPTS_PENDING_CONDITION() \
+ (unlikely(UNBLOCKED_SIGNAL_QUEUE()) ? pgwin32_dispatch_queued_signals() : 0, \
+ unlikely(InterruptPending))
+#endif
#define CHECK_FOR_INTERRUPTS() \
do { \
- if (unlikely(InterruptPending)) \
+ if (INTERRUPTS_PENDING_CONDITION()) \
ProcessInterrupts(); \
} while(0)
-#else /* WIN32 */
-
-#define CHECK_FOR_INTERRUPTS() \
-do { \
- if (unlikely(UNBLOCKED_SIGNAL_QUEUE())) \
- pgwin32_dispatch_queued_signals(); \
- if (unlikely(InterruptPending)) \
- ProcessInterrupts(); \
-} while(0)
-#endif /* WIN32 */
-
#define HOLD_INTERRUPTS() (InterruptHoldoffCount++)
--
2.20.1