On 2020/09/08 13:23, Ian Barwick wrote:
On 2020/09/08 13:11, Andres Freund wrote:
Hi,
On 2020-09-08 13:03:01 +0900, Ian Barwick wrote:
(...)
I wonder if it's possible to increment "xactCompletionCount"
during replay along these lines:
*** a/src/backend/access/transam/xact.c
--- b/src/backend/access/transam/xact.c
*************** xact_redo_commit(xl_xact_parsed_commit *
*** 5915,5920 ****
--- 5915,5924 ----
*/
if (XactCompletionApplyFeedback(parsed->xinfo))
XLogRequestWalReceiverReply();
+
+ LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE);
+ ShmemVariableCache->xactCompletionCount++;
+ LWLockRelease(ProcArrayLock);
}
which seems to work (though quite possibly I've overlooked something I don't
know that I don't know about and it will all break horribly somewhere,
etc. etc.).
We'd also need the same in a few more places. Probably worth looking at
the list where we increment it on the primary (particularly we need to
also increment it for aborts, and 2pc commit/aborts).
Yup.
At first I was very confused as to why none of the existing tests have
found this significant issue. But after thinking about it for a minute
that's because they all use psql, and largely separate psql invocations
for each query :(. Which means that there's no cached snapshot around...
Do you want to try to write a patch?
Sure, I'll give it a go as I have some time right now.
Attached, though bear in mind I'm not very familiar with parts of this,
particularly 2PC stuff, so consider it educated guesswork.
Regards
Ian Barwick
--
Ian Barwick https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
commit 544e2b1661413fe08e3083f03063c12c0d7cf3aa
Author: Ian Barwick <i...@2ndquadrant.com>
Date: Tue Sep 8 12:24:14 2020 +0900
Fix snapshot caching on standbys
Addresses issue introduced in 623a9ba.
diff --git a/src/backend/access/transam/multixact.c b/src/backend/access/transam/multixact.c
index b8bedca04a..227d03bbce 100644
--- a/src/backend/access/transam/multixact.c
+++ b/src/backend/access/transam/multixact.c
@@ -3320,6 +3320,14 @@ multixact_redo(XLogReaderState *record)
}
else
elog(PANIC, "multixact_redo: unknown op code %u", info);
+
+ /*
+ * Advance xactCompletionCount so rebuilds of snapshot contents
+ * can be triggered.
+ */
+ LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE);
+ ShmemVariableCache->xactCompletionCount++;
+ LWLockRelease(ProcArrayLock);
}
Datum
diff --git a/src/backend/access/transam/xact.c b/src/backend/access/transam/xact.c
index af6afcebb1..04ca858918 100644
--- a/src/backend/access/transam/xact.c
+++ b/src/backend/access/transam/xact.c
@@ -5915,6 +5915,14 @@ xact_redo_commit(xl_xact_parsed_commit *parsed,
*/
if (XactCompletionApplyFeedback(parsed->xinfo))
XLogRequestWalReceiverReply();
+
+ /*
+ * Advance xactCompletionCount so rebuilds of snapshot contents
+ * can be triggered.
+ */
+ LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE);
+ ShmemVariableCache->xactCompletionCount++;
+ LWLockRelease(ProcArrayLock);
}
/*
@@ -5978,6 +5986,14 @@ xact_redo_abort(xl_xact_parsed_abort *parsed, TransactionId xid)
/* Make sure files supposed to be dropped are dropped */
DropRelationFiles(parsed->xnodes, parsed->nrels, true);
+
+ /*
+ * Advance xactCompletionCount so rebuilds of snapshot contents
+ * can be triggered.
+ */
+ LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE);
+ ShmemVariableCache->xactCompletionCount++;
+ LWLockRelease(ProcArrayLock);
}
void