On Thu, Sep 1, 2022 at 3:08 PM Kyotaro Horiguchi
<[email protected]> wrote:
> At Thu, 1 Sep 2022 12:05:36 +1200, Thomas Munro <[email protected]>
> wrote in
> > On Thu, Sep 1, 2022 at 2:01 AM Justin Pryzby <[email protected]> wrote:
> > > < 2022-08-31 08:44:10.495 CDT >LOG: checkpoint starting:
> > > end-of-recovery immediate wait
> > > < 2022-08-31 08:44:10.609 CDT >LOG: request to flush past end of
> > > generated WAL; request 1201/1CAF84F0, current position 1201/1CADB730
> > > < 2022-08-31 08:44:10.609 CDT >CONTEXT: writing block 0 of relation
> > > base/16881/2840_vm
> > > < 2022-08-31 08:44:10.609 CDT >ERROR: xlog flush request 1201/1CAF84F0
> > > is not satisfied --- flushed only to 1201/1CADB730
> > > < 2022-08-31 08:44:10.609 CDT >CONTEXT: writing block 0 of relation
> > > base/16881/2840_vm
> > > < 2022-08-31 08:44:10.609 CDT >FATAL: checkpoint request failed
> > >
> > > I was able to start it with -c recovery_prefetch=no, so it seems like
> > > prefetch tried to do too much. The VM runs centos7 under qemu.
> > > I'm making a copy of the data dir in cases it's needed.
>
> Just for information, there was a fixed bug about
> overwrite-aborted-contrecord feature, which causes this kind of
> failure (xlog flush request exceeds insertion bleeding edge). If it is
> that, it has been fixed by 6672d79139 two-days ago.
Hmm. Justin, when you built from source, which commit were you at?
If it's REL_15_BETA3, any chance you could cherry pick that change and
check what happens? And without that, could you show what this logs
for good and bad recovery settings?
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 248a40e8fa..1e435e81d5 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -5308,6 +5308,7 @@ StartupXLOG(void)
*/
endOfRecoveryInfo = FinishWalRecovery();
EndOfLog = endOfRecoveryInfo->endOfLog;
+elog(LOG, "XXX point 1: EndOfLog = %lx", EndOfLog);
EndOfLogTLI = endOfRecoveryInfo->endOfLogTLI;
abortedRecPtr = endOfRecoveryInfo->abortedRecPtr;
missingContrecPtr = endOfRecoveryInfo->missingContrecPtr;
@@ -5446,7 +5447,9 @@ StartupXLOG(void)
{
Assert(!XLogRecPtrIsInvalid(abortedRecPtr));
EndOfLog = missingContrecPtr;
+elog(LOG, "XXX clobbering EndOfLog");
}
+elog(LOG, "XXX point 2: EndOfLog = %lx", EndOfLog);
/*
* Prepare to write WAL starting at EndOfLog location, and init xlog
@@ -5456,6 +5459,7 @@ StartupXLOG(void)
Insert = &XLogCtl->Insert;
Insert->PrevBytePos = XLogRecPtrToBytePos(endOfRecoveryInfo->lastRec);
Insert->CurrBytePos = XLogRecPtrToBytePos(EndOfLog);
+elog(LOG, "XXX point 3: Insert->CurrBytePos = %lx", Insert->CurrBytePos);
/*
* Tricky point here: lastPage contains the *last* block that the LastRec
@@ -5554,6 +5558,7 @@ StartupXLOG(void)
Insert->fullPageWrites = lastFullPageWrites;
UpdateFullPageWrites();
+elog(LOG, "XXX point 4: Insert->CurrBytePos = %lx", Insert->CurrBytePos);
/*
* Emit checkpoint or end-of-recovery record in XLOG, if required.
*/
diff --git a/src/backend/access/transam/xlogrecovery.c b/src/backend/access/transam/xlogrecovery.c
index 141af04388..8c6753a8a8 100644
--- a/src/backend/access/transam/xlogrecovery.c
+++ b/src/backend/access/transam/xlogrecovery.c
@@ -1434,6 +1434,8 @@ FinishWalRecovery(void)
XLogPrefetcherBeginRead(xlogprefetcher, lastRec);
(void) ReadRecord(xlogprefetcher, PANIC, false, lastRecTLI);
endOfLog = xlogreader->EndRecPtr;
+elog(LOG, "point 0: lastRec = %lx", lastRec);
+elog(LOG, "point 0: endOfLog = %lx", endOfLog);
/*
* Remember the TLI in the filename of the XLOG segment containing the