On Tue, Apr 4, 2023 at 1:25 AM Tom Lane <t...@sss.pgh.pa.us> wrote:
> Sorry for not looking at this sooner.  I am okay with the regex
> changes proposed in v5-0001 through 0003, but I think you need to
> take another mopup pass there.  Some specific complaints:
> * header comment for pg_regprefix has been falsified (s/malloc/palloc/)

Thanks.  Fixed.

> * in spell.c, regex_affix_deletion_callback could be got rid of

Done in a separate patch.  I wondered if regex_t should be included
directly as a member of that union inside AFFIX, but decided it should
keep using a pointer (just without the extra wrapper struct).  A
direct member would make the AFFIX slightly larger, and it would
require us to assume that regex_t is movable which it probably
actually is in practice I guess but that isn't written down anywhere
and it seemed strange to rely on it.

> * check other callers of pg_regerror for now-useless CHECK_FOR_INTERRUPTS

I found three of these to remove (jsonpath_gram.y, varlena.c, test_regex.c).

> In general there's a lot of comments referring to regexes being malloc'd.

There is also some remaining direct use of malloc() in
regc_pg_locale.c because "we mustn't lose control on out-of-memory".
At that time (2012) there was no MCXT_NO_OOM (2015), so we could
presumably bring that cache into an observable MemoryContext now too.
I haven't written a patch for that, though, because it's not in the
way of my recovery conflict mission.

> I'm disinclined to change the ones inside the engine, because as far as
> it knows it is still using malloc, but maybe we should work harder on
> our own comments.  In particular, it'd likely be useful to have something
> somewhere pointing out that pg_regfree is only needed when you can't
> get rid of the regex by context cleanup.  Maybe write a short section
> about memory management in backend/regex/README?

I'll try to write something for the README tomorrow.  Here's a new
version of the code changes.

> I've not really looked at 0004.

I'm hoping to get just the regex changes in ASAP, and then take a
little bit longer on the recovery conflict patch itself (v6-0005) on
the basis that it's bugfix work and not subject to the feature freeze.
From a21a43bf5b1ba073abb3238968b9f8d13b1b318a Mon Sep 17 00:00:00 2001
From: Thomas Munro <thomas.mu...@gmail.com>
Date: Wed, 4 Jan 2023 14:15:40 +1300
Subject: [PATCH v6 1/5] Use MemoryContext API for regex memory management.

Previously, regex_t objects' memory was managed with malloc() and free()
directly.  Switch to palloc()-based memory management instead.
Advantages:

 * memory used by cached regexes is now visible with MemoryContext
   observability tools

 * cleanup can be done automatically in certain failure modes
   (something that later commits will take advantage of)

 * cleanup can be done in bulk

On the downside, there may be more fragmentation (wasted memory) due to
per-regex MemoryContext objects.  This is a problem shared with other
cached objects in PostgreSQL and can probably be improved with later
tuning.

Thanks to Noah Misch for suggesting this general approach, which
unblocks later work on interrupts.

Reviewed-by: Tom Lane <t...@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CA%2BhUKGK3PGKwcKqzoosamn36YW-fsuTdOPPF1i_rtEO%3DnEYKSg%40mail.gmail.com
---
 src/backend/regex/regprefix.c  |  2 +-
 src/backend/utils/adt/regexp.c | 57 ++++++++++++++++++++++++----------
 src/include/regex/regcustom.h  |  6 ++--
 3 files changed, 45 insertions(+), 20 deletions(-)

diff --git a/src/backend/regex/regprefix.c b/src/backend/regex/regprefix.c
index 221f02da63..c09b2a9778 100644
--- a/src/backend/regex/regprefix.c
+++ b/src/backend/regex/regprefix.c
@@ -32,7 +32,7 @@ static int	findprefix(struct cnfa *cnfa, struct colormap *cm,
  *	REG_EXACT: all strings satisfying the regex must match the same string
  *	or a REG_XXX error code
  *
- * In the non-failure cases, *string is set to a malloc'd string containing
+ * In the non-failure cases, *string is set to a palloc'd string containing
  * the common prefix or exact value, of length *slength (measured in chrs
  * not bytes!).
  *
diff --git a/src/backend/utils/adt/regexp.c b/src/backend/utils/adt/regexp.c
index 810dcb85b6..81400ba150 100644
--- a/src/backend/utils/adt/regexp.c
+++ b/src/backend/utils/adt/regexp.c
@@ -96,9 +96,13 @@ typedef struct regexp_matches_ctx
 #define MAX_CACHED_RES	32
 #endif
 
+/* A parent memory context for regular expressions. */
+static MemoryContext RegexpCacheMemoryContext;
+
 /* this structure describes one cached regular expression */
 typedef struct cached_re_str
 {
+	MemoryContext cre_context;	/* memory context for this regexp */
 	char	   *cre_pat;		/* original RE (not null terminated!) */
 	int			cre_pat_len;	/* length of original RE, in bytes */
 	int			cre_flags;		/* compile flags: extended,icase etc */
@@ -145,6 +149,7 @@ RE_compile_and_cache(text *text_re, int cflags, Oid collation)
 	int			regcomp_result;
 	cached_re_str re_temp;
 	char		errMsg[100];
+	MemoryContext oldcontext;
 
 	/*
 	 * Look for a match among previously compiled REs.  Since the data
@@ -172,6 +177,13 @@ RE_compile_and_cache(text *text_re, int cflags, Oid collation)
 		}
 	}
 
+	/* Set up the cache memory on first go through. */
+	if (unlikely(RegexpCacheMemoryContext == NULL))
+		RegexpCacheMemoryContext =
+			AllocSetContextCreate(TopMemoryContext,
+								  "RegexpCacheMemoryContext",
+								  ALLOCSET_SMALL_SIZES);
+
 	/*
 	 * Couldn't find it, so try to compile the new RE.  To avoid leaking
 	 * resources on failure, we build into the re_temp local.
@@ -183,6 +195,18 @@ RE_compile_and_cache(text *text_re, int cflags, Oid collation)
 									   pattern,
 									   text_re_len);
 
+	/*
+	 * Make a memory context for this compiled regexp.  This is initially a
+	 * child of the current memory context, so it will be cleaned up
+	 * automatically if compilation is interrupted and throws an ERROR.
+	 * We'll re-parent it under the longer lived cache context if we make it
+	 * to the bottom of this function.
+	 */
+	re_temp.cre_context = AllocSetContextCreate(CurrentMemoryContext,
+												"RegexpMemoryContext",
+												ALLOCSET_SMALL_SIZES);
+	oldcontext = MemoryContextSwitchTo(re_temp.cre_context);
+
 	regcomp_result = pg_regcomp(&re_temp.cre_re,
 								pattern,
 								pattern_len,
@@ -209,21 +233,17 @@ RE_compile_and_cache(text *text_re, int cflags, Oid collation)
 				 errmsg("invalid regular expression: %s", errMsg)));
 	}
 
+	/* Copy the pattern into the per-regexp memory context. */
+	re_temp.cre_pat = palloc(text_re_len + 1);
+	memcpy(re_temp.cre_pat, text_re_val, text_re_len);
+
 	/*
-	 * We use malloc/free for the cre_pat field because the storage has to
-	 * persist across transactions, and because we want to get control back on
-	 * out-of-memory.  The Max() is because some malloc implementations return
-	 * NULL for malloc(0).
+	 * NUL-terminate it only for the benefit of the identifier used for the
+	 * memory context, visible in the pg_backend_memory_contexts view.
 	 */
-	re_temp.cre_pat = malloc(Max(text_re_len, 1));
-	if (re_temp.cre_pat == NULL)
-	{
-		pg_regfree(&re_temp.cre_re);
-		ereport(ERROR,
-				(errcode(ERRCODE_OUT_OF_MEMORY),
-				 errmsg("out of memory")));
-	}
-	memcpy(re_temp.cre_pat, text_re_val, text_re_len);
+	re_temp.cre_pat[text_re_len] = 0;
+	MemoryContextSetIdentifier(re_temp.cre_context, re_temp.cre_pat);
+
 	re_temp.cre_pat_len = text_re_len;
 	re_temp.cre_flags = cflags;
 	re_temp.cre_collation = collation;
@@ -236,16 +256,21 @@ RE_compile_and_cache(text *text_re, int cflags, Oid collation)
 	{
 		--num_res;
 		Assert(num_res < MAX_CACHED_RES);
-		pg_regfree(&re_array[num_res].cre_re);
-		free(re_array[num_res].cre_pat);
+		/* Delete the memory context holding the regexp and pattern. */
+		MemoryContextDelete(re_array[num_res].cre_context);
 	}
 
+	/* Re-parent the memory context to our long-lived cache context. */
+	MemoryContextSetParent(re_temp.cre_context, RegexpCacheMemoryContext);
+
 	if (num_res > 0)
 		memmove(&re_array[1], &re_array[0], num_res * sizeof(cached_re_str));
 
 	re_array[0] = re_temp;
 	num_res++;
 
+	MemoryContextSwitchTo(oldcontext);
+
 	return &re_array[0].cre_re;
 }
 
@@ -1990,7 +2015,7 @@ regexp_fixed_prefix(text *text_re, bool case_insensitive, Oid collation,
 	slen = pg_wchar2mb_with_len(str, result, slen);
 	Assert(slen < maxlen);
 
-	free(str);
+	pfree(str);
 
 	return result;
 }
diff --git a/src/include/regex/regcustom.h b/src/include/regex/regcustom.h
index fc158e1bb7..8f4025128e 100644
--- a/src/include/regex/regcustom.h
+++ b/src/include/regex/regcustom.h
@@ -49,9 +49,9 @@
 
 /* overrides for regguts.h definitions, if any */
 #define FUNCPTR(name, args) (*name) args
-#define MALLOC(n)		malloc(n)
-#define FREE(p)			free(VS(p))
-#define REALLOC(p,n)	realloc(VS(p),n)
+#define MALLOC(n)		palloc_extended((n), MCXT_ALLOC_NO_OOM)
+#define FREE(p)			pfree(VS(p))
+#define REALLOC(p,n)	repalloc_extended(VS(p),(n), MCXT_ALLOC_NO_OOM)
 #define assert(x)		Assert(x)
 
 /* internal character type and related */
-- 
2.39.2

From 6dde972500a352a382072c22f2750a351b6bed6b Mon Sep 17 00:00:00 2001
From: Thomas Munro <thomas.mu...@gmail.com>
Date: Tue, 4 Apr 2023 11:20:15 +1200
Subject: [PATCH v6 2/5] Update tsearch regex memory management.

Now that our regex engine uses palloc(), it's not necessary to set up a
special memory context callback to free compiled regexes.  The regex has
no resources other than the memory that is already going to be freed in
bulk.

Reviewed-by: Tom Lane <t...@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CA%2BhUKGK3PGKwcKqzoosamn36YW-fsuTdOPPF1i_rtEO%3DnEYKSg%40mail.gmail.com
---
 src/backend/tsearch/spell.c       | 34 +++++++------------------------
 src/include/tsearch/dicts/spell.h | 18 ++++++----------
 2 files changed, 13 insertions(+), 39 deletions(-)

diff --git a/src/backend/tsearch/spell.c b/src/backend/tsearch/spell.c
index 8d48cad251..83a1836b44 100644
--- a/src/backend/tsearch/spell.c
+++ b/src/backend/tsearch/spell.c
@@ -655,17 +655,6 @@ FindWord(IspellDict *Conf, const char *word, const char *affixflag, int flag)
 	return 0;
 }
 
-/*
- * Context reset/delete callback for a regular expression used in an affix
- */
-static void
-regex_affix_deletion_callback(void *arg)
-{
-	aff_regex_struct *pregex = (aff_regex_struct *) arg;
-
-	pg_regfree(&(pregex->regex));
-}
-
 /*
  * Adds a new affix rule to the Affix field.
  *
@@ -728,7 +717,6 @@ NIAddAffix(IspellDict *Conf, const char *flag, char flagflags, const char *mask,
 		int			err;
 		pg_wchar   *wmask;
 		char	   *tmask;
-		aff_regex_struct *pregex;
 
 		Affix->issimple = 0;
 		Affix->isregis = 0;
@@ -743,31 +731,23 @@ NIAddAffix(IspellDict *Conf, const char *flag, char flagflags, const char *mask,
 		wmasklen = pg_mb2wchar_with_len(tmask, wmask, masklen);
 
 		/*
-		 * The regex engine stores its stuff using malloc not palloc, so we
-		 * must arrange to explicitly clean up the regex when the dictionary's
-		 * context is cleared.  That means the regex_t has to stay in a fixed
-		 * location within the context; we can't keep it directly in the AFFIX
-		 * struct, since we may sort and resize the array of AFFIXes.
+		 * The regex and all internal state created by pg_regcomp are allocated
+		 * in the dictionary's memory context, and will be freed automatically
+		 * when it is destroyed.
 		 */
-		Affix->reg.pregex = pregex = palloc(sizeof(aff_regex_struct));
-
-		err = pg_regcomp(&(pregex->regex), wmask, wmasklen,
+		Affix->reg.pregex = palloc(sizeof(regex_t));
+		err = pg_regcomp(Affix->reg.pregex, wmask, wmasklen,
 						 REG_ADVANCED | REG_NOSUB,
 						 DEFAULT_COLLATION_OID);
 		if (err)
 		{
 			char		errstr[100];
 
-			pg_regerror(err, &(pregex->regex), errstr, sizeof(errstr));
+			pg_regerror(err, Affix->reg.pregex, errstr, sizeof(errstr));
 			ereport(ERROR,
 					(errcode(ERRCODE_INVALID_REGULAR_EXPRESSION),
 					 errmsg("invalid regular expression: %s", errstr)));
 		}
-
-		pregex->mcallback.func = regex_affix_deletion_callback;
-		pregex->mcallback.arg = (void *) pregex;
-		MemoryContextRegisterResetCallback(CurrentMemoryContext,
-										   &pregex->mcallback);
 	}
 
 	Affix->flagflags = flagflags;
@@ -2161,7 +2141,7 @@ CheckAffix(const char *word, size_t len, AFFIX *Affix, int flagflags, char *neww
 		data = (pg_wchar *) palloc((newword_len + 1) * sizeof(pg_wchar));
 		data_len = pg_mb2wchar_with_len(newword, data, newword_len);
 
-		if (pg_regexec(&(Affix->reg.pregex->regex), data, data_len,
+		if (pg_regexec(Affix->reg.pregex, data, data_len,
 					   0, NULL, 0, NULL, 0) == REG_OKAY)
 		{
 			pfree(data);
diff --git a/src/include/tsearch/dicts/spell.h b/src/include/tsearch/dicts/spell.h
index 5c30af6ac6..82203edb31 100644
--- a/src/include/tsearch/dicts/spell.h
+++ b/src/include/tsearch/dicts/spell.h
@@ -81,17 +81,6 @@ typedef struct spell_struct
 
 #define SPELLHDRSZ	(offsetof(SPELL, word))
 
-/*
- * If an affix uses a regex, we have to store that separately in a struct
- * that won't move around when arrays of affixes are enlarged or sorted.
- * This is so that it can be found to be cleaned up at context destruction.
- */
-typedef struct aff_regex_struct
-{
-	regex_t		regex;
-	MemoryContextCallback mcallback;
-} aff_regex_struct;
-
 /*
  * Represents an entry in an affix list.
  */
@@ -108,7 +97,12 @@ typedef struct aff_struct
 	char	   *repl;
 	union
 	{
-		aff_regex_struct *pregex;
+		/*
+		 * Arrays of AFFIX are moved and sorted.  We'll use a pointer to
+		 * regex_t to keep this struct small, and avoid assuming that regex_t
+		 * is movable.
+		 */
+		regex_t	   *pregex;
 		Regis		regis;
 	}			reg;
 } AFFIX;
-- 
2.39.2

From e5cf8d820668c617993a667c847d56093bfc9686 Mon Sep 17 00:00:00 2001
From: Thomas Munro <thomas.mu...@gmail.com>
Date: Sat, 14 Jan 2023 13:39:14 +1300
Subject: [PATCH v6 3/5] Update contrib/trgm_regexp's memory management.

While no code change was necessary for this code to keep working, we
don't need to use PG_TRY()/PG_FINALLY() with explicit clean-up while
working with regexes anymore.

Reviewed-by: Tom Lane <t...@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CA%2BhUKGK3PGKwcKqzoosamn36YW-fsuTdOPPF1i_rtEO%3DnEYKSg%40mail.gmail.com
---
 contrib/pg_trgm/trgm_regexp.c | 17 ++---------------
 1 file changed, 2 insertions(+), 15 deletions(-)

diff --git a/contrib/pg_trgm/trgm_regexp.c b/contrib/pg_trgm/trgm_regexp.c
index 06cd3db67b..1d36946067 100644
--- a/contrib/pg_trgm/trgm_regexp.c
+++ b/contrib/pg_trgm/trgm_regexp.c
@@ -549,22 +549,9 @@ createTrgmNFA(text *text_re, Oid collation,
 			   REG_ADVANCED | REG_NOSUB, collation);
 #endif
 
-	/*
-	 * Since the regexp library allocates its internal data structures with
-	 * malloc, we need to use a PG_TRY block to ensure that pg_regfree() gets
-	 * done even if there's an error.
-	 */
-	PG_TRY();
-	{
-		trg = createTrgmNFAInternal(&regex, graph, rcontext);
-	}
-	PG_FINALLY();
-	{
-		pg_regfree(&regex);
-	}
-	PG_END_TRY();
+	trg = createTrgmNFAInternal(&regex, graph, rcontext);
 
-	/* Clean up all the cruft we created */
+	/* Clean up all the cruft we created (including regex) */
 	MemoryContextSwitchTo(oldcontext);
 	MemoryContextDelete(tmpcontext);
 
-- 
2.39.2

From 0d94e7ee4ada907337bb592bb1d8bd391538e3ea Mon Sep 17 00:00:00 2001
From: Thomas Munro <thomas.mu...@gmail.com>
Date: Fri, 13 Jan 2023 22:27:58 +1300
Subject: [PATCH v6 4/5] Redesign interrupt/cancel API for regex engine.

Previously, PostgreSQL's copy of the regex engine had a way to return a
special error code REG_CANCEL if it detected that the next call to
CHECK_FOR_INTERRUPTS() would certainly throw via ereport().

A later bugfix commit will move logic out of signal handlers, so that it
won't run until the next CHECK_FOR_INTERRUPTS(), which makes the above
design impossible unless we split CHECK_FOR_INTERRUPTS() into two
phases, one to run logic and another to ereport().  We may develop such
a system in the future, but for the regex code it is no longer
necessary.

An earlier commit moved regex memory management over to our
MemoryContext system.  Given that the purpose of the two-phase interrupt
checking was to free memory before throwing, something we don't need to
worry about anymore, it seems simpler to inject CHECK_FOR_INTERRUPTS()
directly into cancelation points, and just let it throw.

Since other projects using this code might want to stay in sync with us,
do this with a new macro INTERRUPT(), customizable in regcustom.h and
defaulting to nothing.

Reviewed-by: Tom Lane <t...@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CA%2BhUKGK3PGKwcKqzoosamn36YW-fsuTdOPPF1i_rtEO%3DnEYKSg%40mail.gmail.com
---
 src/backend/regex/regc_locale.c          |  6 +--
 src/backend/regex/regc_nfa.c             | 48 ++++--------------------
 src/backend/regex/regcomp.c              | 18 ---------
 src/backend/regex/rege_dfa.c             |  6 +--
 src/backend/regex/regexec.c              |  3 +-
 src/backend/utils/adt/jsonpath_gram.y    |  2 -
 src/backend/utils/adt/regexp.c           | 11 ------
 src/backend/utils/adt/varlena.c          |  1 -
 src/include/regex/regcustom.h            |  3 +-
 src/include/regex/regerrs.h              |  4 --
 src/include/regex/regex.h                |  1 -
 src/include/regex/regguts.h              |  9 +++--
 src/test/modules/test_regex/test_regex.c | 10 -----
 13 files changed, 18 insertions(+), 104 deletions(-)

diff --git a/src/backend/regex/regc_locale.c b/src/backend/regex/regc_locale.c
index b5f3a73b1b..77d1ce2816 100644
--- a/src/backend/regex/regc_locale.c
+++ b/src/backend/regex/regc_locale.c
@@ -475,11 +475,7 @@ range(struct vars *v,			/* context */
 			}
 			addchr(cv, cc);
 		}
-		if (CANCEL_REQUESTED(v->re))
-		{
-			ERR(REG_CANCEL);
-			return NULL;
-		}
+		INTERRUPT(v->re);
 	}
 
 	return cv;
diff --git a/src/backend/regex/regc_nfa.c b/src/backend/regex/regc_nfa.c
index 60fb0bec5d..f1819a24f6 100644
--- a/src/backend/regex/regc_nfa.c
+++ b/src/backend/regex/regc_nfa.c
@@ -143,11 +143,7 @@ newstate(struct nfa *nfa)
 	 * compilation, since no code path will go very long without making a new
 	 * state or arc.
 	 */
-	if (CANCEL_REQUESTED(nfa->v->re))
-	{
-		NERR(REG_CANCEL);
-		return NULL;
-	}
+	INTERRUPT(nfa->v->re);
 
 	/* first, recycle anything that's on the freelist */
 	if (nfa->freestates != NULL)
@@ -297,11 +293,7 @@ newarc(struct nfa *nfa,
 	 * compilation, since no code path will go very long without making a new
 	 * state or arc.
 	 */
-	if (CANCEL_REQUESTED(nfa->v->re))
-	{
-		NERR(REG_CANCEL);
-		return;
-	}
+	INTERRUPT(nfa->v->re);
 
 	/* check for duplicate arc, using whichever chain is shorter */
 	if (from->nouts <= to->nins)
@@ -825,11 +817,7 @@ moveins(struct nfa *nfa,
 		 * Because we bypass newarc() in this code path, we'd better include a
 		 * cancel check.
 		 */
-		if (CANCEL_REQUESTED(nfa->v->re))
-		{
-			NERR(REG_CANCEL);
-			return;
-		}
+		INTERRUPT(nfa->v->re);
 
 		sortins(nfa, oldState);
 		sortins(nfa, newState);
@@ -929,11 +917,7 @@ copyins(struct nfa *nfa,
 		 * Because we bypass newarc() in this code path, we'd better include a
 		 * cancel check.
 		 */
-		if (CANCEL_REQUESTED(nfa->v->re))
-		{
-			NERR(REG_CANCEL);
-			return;
-		}
+		INTERRUPT(nfa->v->re);
 
 		sortins(nfa, oldState);
 		sortins(nfa, newState);
@@ -1000,11 +984,7 @@ mergeins(struct nfa *nfa,
 	 * Because we bypass newarc() in this code path, we'd better include a
 	 * cancel check.
 	 */
-	if (CANCEL_REQUESTED(nfa->v->re))
-	{
-		NERR(REG_CANCEL);
-		return;
-	}
+	INTERRUPT(nfa->v->re);
 
 	/* Sort existing inarcs as well as proposed new ones */
 	sortins(nfa, s);
@@ -1125,11 +1105,7 @@ moveouts(struct nfa *nfa,
 		 * Because we bypass newarc() in this code path, we'd better include a
 		 * cancel check.
 		 */
-		if (CANCEL_REQUESTED(nfa->v->re))
-		{
-			NERR(REG_CANCEL);
-			return;
-		}
+		INTERRUPT(nfa->v->re);
 
 		sortouts(nfa, oldState);
 		sortouts(nfa, newState);
@@ -1226,11 +1202,7 @@ copyouts(struct nfa *nfa,
 		 * Because we bypass newarc() in this code path, we'd better include a
 		 * cancel check.
 		 */
-		if (CANCEL_REQUESTED(nfa->v->re))
-		{
-			NERR(REG_CANCEL);
-			return;
-		}
+		INTERRUPT(nfa->v->re);
 
 		sortouts(nfa, oldState);
 		sortouts(nfa, newState);
@@ -3282,11 +3254,7 @@ checkmatchall_recurse(struct nfa *nfa, struct state *s, bool **haspaths)
 		return false;
 
 	/* In case the search takes a long time, check for cancel */
-	if (CANCEL_REQUESTED(nfa->v->re))
-	{
-		NERR(REG_CANCEL);
-		return false;
-	}
+	INTERRUPT(nfa->v->re);
 
 	/* Create a haspath array for this state */
 	haspath = (bool *) MALLOC((DUPINF + 2) * sizeof(bool));
diff --git a/src/backend/regex/regcomp.c b/src/backend/regex/regcomp.c
index bb8c240598..8a6cfb2973 100644
--- a/src/backend/regex/regcomp.c
+++ b/src/backend/regex/regcomp.c
@@ -86,7 +86,6 @@ static int	newlacon(struct vars *v, struct state *begin, struct state *end,
 					 int latype);
 static void freelacons(struct subre *subs, int n);
 static void rfree(regex_t *re);
-static int	rcancelrequested(void);
 static int	rstacktoodeep(void);
 
 #ifdef REG_DEBUG
@@ -356,7 +355,6 @@ struct vars
 /* static function list */
 static const struct fns functions = {
 	rfree,						/* regfree insides */
-	rcancelrequested,			/* check for cancel request */
 	rstacktoodeep				/* check for stack getting dangerously deep */
 };
 
@@ -2468,22 +2466,6 @@ rfree(regex_t *re)
 	}
 }
 
-/*
- * rcancelrequested - check for external request to cancel regex operation
- *
- * Return nonzero to fail the operation with error code REG_CANCEL,
- * zero to keep going
- *
- * The current implementation is Postgres-specific.  If we ever get around
- * to splitting the regex code out as a standalone library, there will need
- * to be some API to let applications define a callback function for this.
- */
-static int
-rcancelrequested(void)
-{
-	return InterruptPending && (QueryCancelPending || ProcDiePending);
-}
-
 /*
  * rstacktoodeep - check for stack getting dangerously deep
  *
diff --git a/src/backend/regex/rege_dfa.c b/src/backend/regex/rege_dfa.c
index ba1289c64a..1f8f2ab144 100644
--- a/src/backend/regex/rege_dfa.c
+++ b/src/backend/regex/rege_dfa.c
@@ -805,11 +805,7 @@ miss(struct vars *v,
 	 * Checking for operation cancel in the inner text search loop seems
 	 * unduly expensive.  As a compromise, check during cache misses.
 	 */
-	if (CANCEL_REQUESTED(v->re))
-	{
-		ERR(REG_CANCEL);
-		return NULL;
-	}
+	INTERRUPT(v->re);
 
 	/*
 	 * What set of states would we end up in after consuming the co character?
diff --git a/src/backend/regex/regexec.c b/src/backend/regex/regexec.c
index 3d9ff2e607..2a1d5bebda 100644
--- a/src/backend/regex/regexec.c
+++ b/src/backend/regex/regexec.c
@@ -764,8 +764,7 @@ cdissect(struct vars *v,
 	MDEBUG(("%d: cdissect %c %ld-%ld\n", t->id, t->op, LOFF(begin), LOFF(end)));
 
 	/* handy place to check for operation cancel */
-	if (CANCEL_REQUESTED(v->re))
-		return REG_CANCEL;
+	INTERRUPT(v->re);
 	/* ... and stack overrun */
 	if (STACK_TOO_DEEP(v->re))
 		return REG_ETOOBIG;
diff --git a/src/backend/utils/adt/jsonpath_gram.y b/src/backend/utils/adt/jsonpath_gram.y
index d34ad6b80d..adc259d5bf 100644
--- a/src/backend/utils/adt/jsonpath_gram.y
+++ b/src/backend/utils/adt/jsonpath_gram.y
@@ -553,8 +553,6 @@ makeItemLikeRegex(JsonPathParseItem *expr, JsonPathString *pattern,
 		{
 			char        errMsg[100];
 
-			/* See regexp.c for explanation */
-			CHECK_FOR_INTERRUPTS();
 			pg_regerror(re_result, &re_tmp, errMsg, sizeof(errMsg));
 			ereturn(escontext, false,
 					(errcode(ERRCODE_INVALID_REGULAR_EXPRESSION),
diff --git a/src/backend/utils/adt/regexp.c b/src/backend/utils/adt/regexp.c
index 81400ba150..5da3964746 100644
--- a/src/backend/utils/adt/regexp.c
+++ b/src/backend/utils/adt/regexp.c
@@ -218,15 +218,6 @@ RE_compile_and_cache(text *text_re, int cflags, Oid collation)
 	if (regcomp_result != REG_OKAY)
 	{
 		/* re didn't compile (no need for pg_regfree, if so) */
-
-		/*
-		 * Here and in other places in this file, do CHECK_FOR_INTERRUPTS
-		 * before reporting a regex error.  This is so that if the regex
-		 * library aborts and returns REG_CANCEL, we don't print an error
-		 * message that implies the regex was invalid.
-		 */
-		CHECK_FOR_INTERRUPTS();
-
 		pg_regerror(regcomp_result, &re_temp.cre_re, errMsg, sizeof(errMsg));
 		ereport(ERROR,
 				(errcode(ERRCODE_INVALID_REGULAR_EXPRESSION),
@@ -308,7 +299,6 @@ RE_wchar_execute(regex_t *re, pg_wchar *data, int data_len,
 	if (regexec_result != REG_OKAY && regexec_result != REG_NOMATCH)
 	{
 		/* re failed??? */
-		CHECK_FOR_INTERRUPTS();
 		pg_regerror(regexec_result, re, errMsg, sizeof(errMsg));
 		ereport(ERROR,
 				(errcode(ERRCODE_INVALID_REGULAR_EXPRESSION),
@@ -2001,7 +1991,6 @@ regexp_fixed_prefix(text *text_re, bool case_insensitive, Oid collation,
 
 		default:
 			/* re failed??? */
-			CHECK_FOR_INTERRUPTS();
 			pg_regerror(re_result, re, errMsg, sizeof(errMsg));
 			ereport(ERROR,
 					(errcode(ERRCODE_INVALID_REGULAR_EXPRESSION),
diff --git a/src/backend/utils/adt/varlena.c b/src/backend/utils/adt/varlena.c
index f9a607adaf..b571876468 100644
--- a/src/backend/utils/adt/varlena.c
+++ b/src/backend/utils/adt/varlena.c
@@ -4265,7 +4265,6 @@ replace_text_regexp(text *src_text, text *pattern_text,
 		{
 			char		errMsg[100];
 
-			CHECK_FOR_INTERRUPTS();
 			pg_regerror(regexec_result, re, errMsg, sizeof(errMsg));
 			ereport(ERROR,
 					(errcode(ERRCODE_INVALID_REGULAR_EXPRESSION),
diff --git a/src/include/regex/regcustom.h b/src/include/regex/regcustom.h
index 8f4025128e..bedee1e9ca 100644
--- a/src/include/regex/regcustom.h
+++ b/src/include/regex/regcustom.h
@@ -44,7 +44,7 @@
 
 #include "mb/pg_wchar.h"
 
-#include "miscadmin.h"			/* needed by rcancelrequested/rstacktoodeep */
+#include "miscadmin.h"			/* needed by stacktoodeep */
 
 
 /* overrides for regguts.h definitions, if any */
@@ -52,6 +52,7 @@
 #define MALLOC(n)		palloc_extended((n), MCXT_ALLOC_NO_OOM)
 #define FREE(p)			pfree(VS(p))
 #define REALLOC(p,n)	repalloc_extended(VS(p),(n), MCXT_ALLOC_NO_OOM)
+#define INTERRUPT(re)	CHECK_FOR_INTERRUPTS()
 #define assert(x)		Assert(x)
 
 /* internal character type and related */
diff --git a/src/include/regex/regerrs.h b/src/include/regex/regerrs.h
index 41e25f7ff0..2c8873eb81 100644
--- a/src/include/regex/regerrs.h
+++ b/src/include/regex/regerrs.h
@@ -81,7 +81,3 @@
 {
 	REG_ECOLORS, "REG_ECOLORS", "too many colors"
 },
-
-{
-	REG_CANCEL, "REG_CANCEL", "operation cancelled"
-},
diff --git a/src/include/regex/regex.h b/src/include/regex/regex.h
index 1297abec62..d08113724f 100644
--- a/src/include/regex/regex.h
+++ b/src/include/regex/regex.h
@@ -156,7 +156,6 @@ typedef struct
 #define REG_BADOPT	18			/* invalid embedded option */
 #define REG_ETOOBIG 19			/* regular expression is too complex */
 #define REG_ECOLORS 20			/* too many colors */
-#define REG_CANCEL	21			/* operation cancelled */
 /* two specials for debugging and testing */
 #define REG_ATOI	101			/* convert error-code name to number */
 #define REG_ITOA	102			/* convert error-code number to name */
diff --git a/src/include/regex/regguts.h b/src/include/regex/regguts.h
index 91a52840c4..3ca3647e11 100644
--- a/src/include/regex/regguts.h
+++ b/src/include/regex/regguts.h
@@ -77,6 +77,11 @@
 #define FREE(p)		free(VS(p))
 #endif
 
+/* interruption */
+#ifndef INTERRUPT
+#define INTERRUPT(re)
+#endif
+
 /* want size of a char in bits, and max value in bounded quantifiers */
 #ifndef _POSIX2_RE_DUP_MAX
 #define _POSIX2_RE_DUP_MAX	255 /* normally from <limits.h> */
@@ -510,13 +515,9 @@ struct subre
 struct fns
 {
 	void		FUNCPTR(free, (regex_t *));
-	int			FUNCPTR(cancel_requested, (void));
 	int			FUNCPTR(stack_too_deep, (void));
 };
 
-#define CANCEL_REQUESTED(re)  \
-	((*((struct fns *) (re)->re_fns)->cancel_requested) ())
-
 #define STACK_TOO_DEEP(re)	\
 	((*((struct fns *) (re)->re_fns)->stack_too_deep) ())
 
diff --git a/src/test/modules/test_regex/test_regex.c b/src/test/modules/test_regex/test_regex.c
index 1d4f79c9d3..d1dd48a993 100644
--- a/src/test/modules/test_regex/test_regex.c
+++ b/src/test/modules/test_regex/test_regex.c
@@ -185,15 +185,6 @@ test_re_compile(text *text_re, int cflags, Oid collation,
 	if (regcomp_result != REG_OKAY)
 	{
 		/* re didn't compile (no need for pg_regfree, if so) */
-
-		/*
-		 * Here and in other places in this file, do CHECK_FOR_INTERRUPTS
-		 * before reporting a regex error.  This is so that if the regex
-		 * library aborts and returns REG_CANCEL, we don't print an error
-		 * message that implies the regex was invalid.
-		 */
-		CHECK_FOR_INTERRUPTS();
-
 		pg_regerror(regcomp_result, result_re, errMsg, sizeof(errMsg));
 		ereport(ERROR,
 				(errcode(ERRCODE_INVALID_REGULAR_EXPRESSION),
@@ -239,7 +230,6 @@ test_re_execute(regex_t *re, pg_wchar *data, int data_len,
 	if (regexec_result != REG_OKAY && regexec_result != REG_NOMATCH)
 	{
 		/* re failed??? */
-		CHECK_FOR_INTERRUPTS();
 		pg_regerror(regexec_result, re, errMsg, sizeof(errMsg));
 		ereport(ERROR,
 				(errcode(ERRCODE_INVALID_REGULAR_EXPRESSION),
-- 
2.39.2

From c69e46f609aab169cad3483962188f6f05196602 Mon Sep 17 00:00:00 2001
From: Thomas Munro <thomas.mu...@gmail.com>
Date: Tue, 10 May 2022 16:00:23 +1200
Subject: [PATCH v6 5/5] Fix recovery conflict SIGUSR1 handling.

We shouldn't be doing real work in a signal handler, to avoid reaching
code that is not safe in that context.  The previous coding also
confused the 'reason' shown in error messages by clobbering global
variables.  Move all recovery conflict checking logic into the next CFI,
and have the signal handler just set flags and the latch, following the
standard pattern.  Since there are several different reasons, use a
separate flag for each.

With this refactoring, the recovery conflict system no longer
piggy-backs on top of the regular query cancelation mechanisms, but
instead ereports directly if it decides that is necessary.  It still
needs to respect QueryCancelHoldoffCount, because otherwise the FEBE
protocol might be corrupted (see commit 2b3a8b20c2d).

For now we have agreed not to back-patch this change due to its
complexity and the regex changes that it depends on.

Reviewed-by: Andres Freund <and...@anarazel.de>
Reviewed-by: Michael Paquier <mich...@paquier.xyz>
Reviewed-by: Robert Haas <robertmh...@gmail.com>
Discussion: https://postgr.es/m/CA%2BhUKGK3PGKwcKqzoosamn36YW-fsuTdOPPF1i_rtEO%3DnEYKSg%40mail.gmail.com
---
 src/backend/storage/buffer/bufmgr.c  |   4 +-
 src/backend/storage/ipc/procsignal.c |  12 +-
 src/backend/tcop/postgres.c          | 312 ++++++++++++++-------------
 src/include/storage/procsignal.h     |   4 +-
 src/include/tcop/tcopprot.h          |   3 +-
 5 files changed, 172 insertions(+), 163 deletions(-)

diff --git a/src/backend/storage/buffer/bufmgr.c b/src/backend/storage/buffer/bufmgr.c
index 908a8934bd..b902bbc2c6 100644
--- a/src/backend/storage/buffer/bufmgr.c
+++ b/src/backend/storage/buffer/bufmgr.c
@@ -4919,8 +4919,8 @@ LockBufferForCleanup(Buffer buffer)
 }
 
 /*
- * Check called from RecoveryConflictInterrupt handler when Startup
- * process requests cancellation of all pin holders that are blocking it.
+ * Check called from ProcessRecoveryConflictInterrupts() when Startup process
+ * requests cancellation of all pin holders that are blocking it.
  */
 bool
 HoldingBufferPinThatDelaysRecovery(void)
diff --git a/src/backend/storage/ipc/procsignal.c b/src/backend/storage/ipc/procsignal.c
index 395b2cf690..e444296f2c 100644
--- a/src/backend/storage/ipc/procsignal.c
+++ b/src/backend/storage/ipc/procsignal.c
@@ -662,22 +662,22 @@ procsignal_sigusr1_handler(SIGNAL_ARGS)
 		HandleParallelApplyMessageInterrupt();
 
 	if (CheckProcSignal(PROCSIG_RECOVERY_CONFLICT_DATABASE))
-		RecoveryConflictInterrupt(PROCSIG_RECOVERY_CONFLICT_DATABASE);
+		HandleRecoveryConflictInterrupt(PROCSIG_RECOVERY_CONFLICT_DATABASE);
 
 	if (CheckProcSignal(PROCSIG_RECOVERY_CONFLICT_TABLESPACE))
-		RecoveryConflictInterrupt(PROCSIG_RECOVERY_CONFLICT_TABLESPACE);
+		HandleRecoveryConflictInterrupt(PROCSIG_RECOVERY_CONFLICT_TABLESPACE);
 
 	if (CheckProcSignal(PROCSIG_RECOVERY_CONFLICT_LOCK))
-		RecoveryConflictInterrupt(PROCSIG_RECOVERY_CONFLICT_LOCK);
+		HandleRecoveryConflictInterrupt(PROCSIG_RECOVERY_CONFLICT_LOCK);
 
 	if (CheckProcSignal(PROCSIG_RECOVERY_CONFLICT_SNAPSHOT))
-		RecoveryConflictInterrupt(PROCSIG_RECOVERY_CONFLICT_SNAPSHOT);
+		HandleRecoveryConflictInterrupt(PROCSIG_RECOVERY_CONFLICT_SNAPSHOT);
 
 	if (CheckProcSignal(PROCSIG_RECOVERY_CONFLICT_STARTUP_DEADLOCK))
-		RecoveryConflictInterrupt(PROCSIG_RECOVERY_CONFLICT_STARTUP_DEADLOCK);
+		HandleRecoveryConflictInterrupt(PROCSIG_RECOVERY_CONFLICT_STARTUP_DEADLOCK);
 
 	if (CheckProcSignal(PROCSIG_RECOVERY_CONFLICT_BUFFERPIN))
-		RecoveryConflictInterrupt(PROCSIG_RECOVERY_CONFLICT_BUFFERPIN);
+		HandleRecoveryConflictInterrupt(PROCSIG_RECOVERY_CONFLICT_BUFFERPIN);
 
 	SetLatch(MyLatch);
 
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index a10ecbaf50..6ec5a01bf4 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -161,9 +161,8 @@ static bool EchoQuery = false;	/* -E switch */
 static bool UseSemiNewlineNewline = false;	/* -j switch */
 
 /* whether or not, and why, we were canceled by conflict with recovery */
-static bool RecoveryConflictPending = false;
-static bool RecoveryConflictRetryable = true;
-static ProcSignalReason RecoveryConflictReason;
+static volatile sig_atomic_t RecoveryConflictPending = false;
+static volatile sig_atomic_t RecoveryConflictPendingReasons[NUM_PROCSIGNALS];
 
 /* reused buffer to pass to SendRowDescriptionMessage() */
 static MemoryContext row_description_context = NULL;
@@ -182,7 +181,6 @@ static bool check_log_statement(List *stmt_list);
 static int	errdetail_execute(List *raw_parsetree_list);
 static int	errdetail_params(ParamListInfo params);
 static int	errdetail_abort(void);
-static int	errdetail_recovery_conflict(void);
 static void bind_param_error_callback(void *arg);
 static void start_xact_command(void);
 static void finish_xact_command(void);
@@ -2510,9 +2508,9 @@ errdetail_abort(void)
  * Add an errdetail() line showing conflict source.
  */
 static int
-errdetail_recovery_conflict(void)
+errdetail_recovery_conflict(ProcSignalReason reason)
 {
-	switch (RecoveryConflictReason)
+	switch (reason)
 	{
 		case PROCSIG_RECOVERY_CONFLICT_BUFFERPIN:
 			errdetail("User was holding shared buffer pin for too long.");
@@ -3037,137 +3035,190 @@ FloatExceptionHandler(SIGNAL_ARGS)
 }
 
 /*
- * RecoveryConflictInterrupt: out-of-line portion of recovery conflict
- * handling following receipt of SIGUSR1. Designed to be similar to die()
- * and StatementCancelHandler(). Called only by a normal user backend
- * that begins a transaction during recovery.
+ * Tell the next CHECK_FOR_INTERRUPTS() to check for a particular type of
+ * recovery conflict.  Runs in a SIGUSR1 handler.
  */
 void
-RecoveryConflictInterrupt(ProcSignalReason reason)
+HandleRecoveryConflictInterrupt(ProcSignalReason reason)
 {
-	int			save_errno = errno;
+	RecoveryConflictPendingReasons[reason] = true;
+	RecoveryConflictPending = true;
+	InterruptPending = true;
+	/* latch will be set by procsignal_sigusr1_handler */
+}
 
-	/*
-	 * Don't joggle the elbow of proc_exit
-	 */
-	if (!proc_exit_inprogress)
+/*
+ * Check one individual conflict reason.
+ */
+static void
+ProcessRecoveryConflictInterrupt(ProcSignalReason reason)
+{
+	switch (reason)
 	{
-		RecoveryConflictReason = reason;
-		switch (reason)
-		{
-			case PROCSIG_RECOVERY_CONFLICT_STARTUP_DEADLOCK:
+		case PROCSIG_RECOVERY_CONFLICT_STARTUP_DEADLOCK:
 
-				/*
-				 * If we aren't waiting for a lock we can never deadlock.
-				 */
-				if (!IsWaitingForLock())
-					return;
+			/*
+			 * If we aren't waiting for a lock we can never deadlock.
+			 */
+			if (!IsWaitingForLock())
+				return;
 
-				/* Intentional fall through to check wait for pin */
-				/* FALLTHROUGH */
+			/* Intentional fall through to check wait for pin */
+			/* FALLTHROUGH */
 
-			case PROCSIG_RECOVERY_CONFLICT_BUFFERPIN:
+		case PROCSIG_RECOVERY_CONFLICT_BUFFERPIN:
 
-				/*
-				 * If PROCSIG_RECOVERY_CONFLICT_BUFFERPIN is requested but we
-				 * aren't blocking the Startup process there is nothing more
-				 * to do.
-				 *
-				 * When PROCSIG_RECOVERY_CONFLICT_STARTUP_DEADLOCK is
-				 * requested, if we're waiting for locks and the startup
-				 * process is not waiting for buffer pin (i.e., also waiting
-				 * for locks), we set the flag so that ProcSleep() will check
-				 * for deadlocks.
-				 */
-				if (!HoldingBufferPinThatDelaysRecovery())
-				{
-					if (reason == PROCSIG_RECOVERY_CONFLICT_STARTUP_DEADLOCK &&
-						GetStartupBufferPinWaitBufId() < 0)
-						CheckDeadLockAlert();
-					return;
-				}
+			/*
+			 * If PROCSIG_RECOVERY_CONFLICT_BUFFERPIN is requested but we
+			 * aren't blocking the Startup process there is nothing more to
+			 * do.
+			 *
+			 * When PROCSIG_RECOVERY_CONFLICT_STARTUP_DEADLOCK is requested,
+			 * if we're waiting for locks and the startup process is not
+			 * waiting for buffer pin (i.e., also waiting for locks), we set
+			 * the flag so that ProcSleep() will check for deadlocks.
+			 */
+			if (!HoldingBufferPinThatDelaysRecovery())
+			{
+				if (reason == PROCSIG_RECOVERY_CONFLICT_STARTUP_DEADLOCK &&
+					GetStartupBufferPinWaitBufId() < 0)
+					CheckDeadLockAlert();
+				return;
+			}
 
-				MyProc->recoveryConflictPending = true;
+			MyProc->recoveryConflictPending = true;
 
-				/* Intentional fall through to error handling */
-				/* FALLTHROUGH */
+			/* Intentional fall through to error handling */
+			/* FALLTHROUGH */
+
+		case PROCSIG_RECOVERY_CONFLICT_LOCK:
+		case PROCSIG_RECOVERY_CONFLICT_TABLESPACE:
+		case PROCSIG_RECOVERY_CONFLICT_SNAPSHOT:
 
-			case PROCSIG_RECOVERY_CONFLICT_LOCK:
-			case PROCSIG_RECOVERY_CONFLICT_TABLESPACE:
-			case PROCSIG_RECOVERY_CONFLICT_SNAPSHOT:
+			/*
+			 * If we aren't in a transaction any longer then ignore.
+			 */
+			if (!IsTransactionOrTransactionBlock())
+				return;
 
+			/*
+			 * If we're not in a subtransaction then we are OK to throw an
+			 * ERROR to resolve the conflict.  Otherwise drop through to the
+			 * FATAL case.
+			 *
+			 * XXX other times that we can throw just an ERROR *may* be
+			 * PROCSIG_RECOVERY_CONFLICT_LOCK if no locks are held in parent
+			 * transactions
+			 *
+			 * PROCSIG_RECOVERY_CONFLICT_SNAPSHOT if no snapshots are held by
+			 * parent transactions and the transaction is not
+			 * transaction-snapshot mode
+			 *
+			 * PROCSIG_RECOVERY_CONFLICT_TABLESPACE if no temp files or
+			 * cursors open in parent transactions
+			 */
+			if (!IsSubTransaction())
+			{
 				/*
-				 * If we aren't in a transaction any longer then ignore.
+				 * If we already aborted then we no longer need to cancel.  We
+				 * do this here since we do not wish to ignore aborted
+				 * subtransactions, which must cause FATAL, currently.
 				 */
-				if (!IsTransactionOrTransactionBlock())
+				if (IsAbortedTransactionBlockState())
 					return;
 
 				/*
-				 * If we can abort just the current subtransaction then we are
-				 * OK to throw an ERROR to resolve the conflict. Otherwise
-				 * drop through to the FATAL case.
-				 *
-				 * XXX other times that we can throw just an ERROR *may* be
-				 * PROCSIG_RECOVERY_CONFLICT_LOCK if no locks are held in
-				 * parent transactions
-				 *
-				 * PROCSIG_RECOVERY_CONFLICT_SNAPSHOT if no snapshots are held
-				 * by parent transactions and the transaction is not
-				 * transaction-snapshot mode
-				 *
-				 * PROCSIG_RECOVERY_CONFLICT_TABLESPACE if no temp files or
-				 * cursors open in parent transactions
+				 * If a recovery conflict happens while we are waiting for
+				 * input from the client, the client is presumably just
+				 * sitting idle in a transaction, preventing recovery from
+				 * making progress.  We'll drop through to the FATAL case
+				 * below to dislodge it, in that case.
 				 */
-				if (!IsSubTransaction())
+				if (!DoingCommandRead)
 				{
-					/*
-					 * If we already aborted then we no longer need to cancel.
-					 * We do this here since we do not wish to ignore aborted
-					 * subtransactions, which must cause FATAL, currently.
-					 */
-					if (IsAbortedTransactionBlockState())
+					/* Avoid losing sync in the FE/BE protocol. */
+					if (QueryCancelHoldoffCount != 0)
+					{
+						/*
+						 * Re-arm and defer this interrupt until later.  See
+						 * similar code in ProcessInterrupts().
+						 */
+						RecoveryConflictPendingReasons[reason] = true;
+						RecoveryConflictPending = true;
+						InterruptPending = true;
 						return;
+					}
 
-					RecoveryConflictPending = true;
-					QueryCancelPending = true;
-					InterruptPending = true;
+					/*
+					 * We are cleared to throw an ERROR.  We have a top-level
+					 * transaction that we can abort and a conflict that isn't
+					 * inherently non-retryable.
+					 */
+					LockErrorCleanup();
+					pgstat_report_recovery_conflict(reason);
+					ereport(ERROR,
+							(errcode(ERRCODE_T_R_SERIALIZATION_FAILURE),
+							 errmsg("canceling statement due to conflict with recovery"),
+							 errdetail_recovery_conflict(reason)));
 					break;
 				}
+			}
 
-				/* Intentional fall through to session cancel */
-				/* FALLTHROUGH */
-
-			case PROCSIG_RECOVERY_CONFLICT_DATABASE:
-				RecoveryConflictPending = true;
-				ProcDiePending = true;
-				InterruptPending = true;
-				break;
+			/* Intentional fall through to session cancel */
+			/* FALLTHROUGH */
 
-			default:
-				elog(FATAL, "unrecognized conflict mode: %d",
-					 (int) reason);
-		}
+		case PROCSIG_RECOVERY_CONFLICT_DATABASE:
 
-		Assert(RecoveryConflictPending && (QueryCancelPending || ProcDiePending));
+			/*
+			 * Retrying is not possible because the database is dropped, or we
+			 * decided above that we couldn't resolve the conflict with an
+			 * ERROR and fell through.  Terminate the session.
+			 */
+			pgstat_report_recovery_conflict(reason);
+			ereport(FATAL,
+					(errcode(reason == PROCSIG_RECOVERY_CONFLICT_DATABASE ?
+							 ERRCODE_DATABASE_DROPPED :
+							 ERRCODE_T_R_SERIALIZATION_FAILURE),
+					 errmsg("terminating connection due to conflict with recovery"),
+					 errdetail_recovery_conflict(reason),
+					 errhint("In a moment you should be able to reconnect to the"
+							 " database and repeat your command.")));
+			break;
 
-		/*
-		 * All conflicts apart from database cause dynamic errors where the
-		 * command or transaction can be retried at a later point with some
-		 * potential for success. No need to reset this, since non-retryable
-		 * conflict errors are currently FATAL.
-		 */
-		if (reason == PROCSIG_RECOVERY_CONFLICT_DATABASE)
-			RecoveryConflictRetryable = false;
+		default:
+			elog(FATAL, "unrecognized conflict mode: %d", (int) reason);
 	}
+}
+
+/*
+ * Check each possible recovery conflict reason.
+ */
+static void
+ProcessRecoveryConflictInterrupts(void)
+{
+	ProcSignalReason reason;
 
 	/*
-	 * Set the process latch. This function essentially emulates signal
-	 * handlers like die() and StatementCancelHandler() and it seems prudent
-	 * to behave similarly as they do.
+	 * We don't need to worry about joggling the elbow of proc_exit, because
+	 * proc_exit_prepare() holds interrupts, so ProcessInterrupts() won't call
+	 * us.
 	 */
-	SetLatch(MyLatch);
+	Assert(!proc_exit_inprogress);
+	Assert(InterruptHoldoffCount == 0);
+	Assert(RecoveryConflictPending);
 
-	errno = save_errno;
+	RecoveryConflictPending = false;
+
+	for (reason = PROCSIG_RECOVERY_CONFLICT_FIRST;
+		 reason <= PROCSIG_RECOVERY_CONFLICT_LAST;
+		 reason++)
+	{
+		if (RecoveryConflictPendingReasons[reason])
+		{
+			RecoveryConflictPendingReasons[reason] = false;
+			ProcessRecoveryConflictInterrupt(reason);
+		}
+	}
 }
 
 /*
@@ -3222,24 +3273,6 @@ ProcessInterrupts(void)
 			 */
 			proc_exit(1);
 		}
-		else if (RecoveryConflictPending && RecoveryConflictRetryable)
-		{
-			pgstat_report_recovery_conflict(RecoveryConflictReason);
-			ereport(FATAL,
-					(errcode(ERRCODE_T_R_SERIALIZATION_FAILURE),
-					 errmsg("terminating connection due to conflict with recovery"),
-					 errdetail_recovery_conflict()));
-		}
-		else if (RecoveryConflictPending)
-		{
-			/* Currently there is only one non-retryable recovery conflict */
-			Assert(RecoveryConflictReason == PROCSIG_RECOVERY_CONFLICT_DATABASE);
-			pgstat_report_recovery_conflict(RecoveryConflictReason);
-			ereport(FATAL,
-					(errcode(ERRCODE_DATABASE_DROPPED),
-					 errmsg("terminating connection due to conflict with recovery"),
-					 errdetail_recovery_conflict()));
-		}
 		else if (IsBackgroundWorker)
 			ereport(FATAL,
 					(errcode(ERRCODE_ADMIN_SHUTDOWN),
@@ -3282,31 +3315,13 @@ ProcessInterrupts(void)
 				 errmsg("connection to client lost")));
 	}
 
-	/*
-	 * If a recovery conflict happens while we are waiting for input from the
-	 * client, the client is presumably just sitting idle in a transaction,
-	 * preventing recovery from making progress.  Terminate the connection to
-	 * dislodge it.
-	 */
-	if (RecoveryConflictPending && DoingCommandRead)
-	{
-		QueryCancelPending = false; /* this trumps QueryCancel */
-		RecoveryConflictPending = false;
-		LockErrorCleanup();
-		pgstat_report_recovery_conflict(RecoveryConflictReason);
-		ereport(FATAL,
-				(errcode(ERRCODE_T_R_SERIALIZATION_FAILURE),
-				 errmsg("terminating connection due to conflict with recovery"),
-				 errdetail_recovery_conflict(),
-				 errhint("In a moment you should be able to reconnect to the"
-						 " database and repeat your command.")));
-	}
-
 	/*
 	 * Don't allow query cancel interrupts while reading input from the
 	 * client, because we might lose sync in the FE/BE protocol.  (Die
 	 * interrupts are OK, because we won't read any further messages from the
 	 * client in that case.)
+	 *
+	 * See similar logic in ProcessRecoveryConflictInterrupts().
 	 */
 	if (QueryCancelPending && QueryCancelHoldoffCount != 0)
 	{
@@ -3365,16 +3380,6 @@ ProcessInterrupts(void)
 					(errcode(ERRCODE_QUERY_CANCELED),
 					 errmsg("canceling autovacuum task")));
 		}
-		if (RecoveryConflictPending)
-		{
-			RecoveryConflictPending = false;
-			LockErrorCleanup();
-			pgstat_report_recovery_conflict(RecoveryConflictReason);
-			ereport(ERROR,
-					(errcode(ERRCODE_T_R_SERIALIZATION_FAILURE),
-					 errmsg("canceling statement due to conflict with recovery"),
-					 errdetail_recovery_conflict()));
-		}
 
 		/*
 		 * If we are reading a command from the client, just ignore the cancel
@@ -3390,6 +3395,9 @@ ProcessInterrupts(void)
 		}
 	}
 
+	if (RecoveryConflictPending)
+		ProcessRecoveryConflictInterrupts();
+
 	if (IdleInTransactionSessionTimeoutPending)
 	{
 		/*
diff --git a/src/include/storage/procsignal.h b/src/include/storage/procsignal.h
index 905af2231b..6ef7298294 100644
--- a/src/include/storage/procsignal.h
+++ b/src/include/storage/procsignal.h
@@ -38,12 +38,14 @@ typedef enum
 	PROCSIG_PARALLEL_APPLY_MESSAGE, /* Message from parallel apply workers */
 
 	/* Recovery conflict reasons */
-	PROCSIG_RECOVERY_CONFLICT_DATABASE,
+	PROCSIG_RECOVERY_CONFLICT_FIRST,
+	PROCSIG_RECOVERY_CONFLICT_DATABASE = PROCSIG_RECOVERY_CONFLICT_FIRST,
 	PROCSIG_RECOVERY_CONFLICT_TABLESPACE,
 	PROCSIG_RECOVERY_CONFLICT_LOCK,
 	PROCSIG_RECOVERY_CONFLICT_SNAPSHOT,
 	PROCSIG_RECOVERY_CONFLICT_BUFFERPIN,
 	PROCSIG_RECOVERY_CONFLICT_STARTUP_DEADLOCK,
+	PROCSIG_RECOVERY_CONFLICT_LAST = PROCSIG_RECOVERY_CONFLICT_STARTUP_DEADLOCK,
 
 	NUM_PROCSIGNALS				/* Must be last! */
 } ProcSignalReason;
diff --git a/src/include/tcop/tcopprot.h b/src/include/tcop/tcopprot.h
index abd7b4fff3..ab43b638ee 100644
--- a/src/include/tcop/tcopprot.h
+++ b/src/include/tcop/tcopprot.h
@@ -70,8 +70,7 @@ extern void die(SIGNAL_ARGS);
 extern void quickdie(SIGNAL_ARGS) pg_attribute_noreturn();
 extern void StatementCancelHandler(SIGNAL_ARGS);
 extern void FloatExceptionHandler(SIGNAL_ARGS) pg_attribute_noreturn();
-extern void RecoveryConflictInterrupt(ProcSignalReason reason); /* called from SIGUSR1
-																 * handler */
+extern void HandleRecoveryConflictInterrupt(ProcSignalReason reason);
 extern void ProcessClientReadInterrupt(bool blocked);
 extern void ProcessClientWriteInterrupt(bool blocked);
 
-- 
2.39.2

Reply via email to