Hello

> This is safe because replay is frozen at this
> point: the only ways out of the pause are promotion and shutdown, so no
> transaction's commit status can change afterwards, and any transaction a
> query finds committed in CLOG necessarily committed before that query's
> snapshot.
But if I look at the documentation, after shutdown it allows a restart
with a later recovery target:
> The intended use of the pause setting is to allow queries to be executed
> against the database to check if this recovery target is the most desirable
> point for recovery. The paused state can be resumed by using
> pg_wal_replay_resume() (see Table 9.81), which then causes recovery to end.
> If this recovery target is not the desired stopping point, then shut down
> the server, change the recovery target settings to a later target and
> restart to continue recovery.
"so no transaction's commit status can change after this point" is
true within the lifetime of the paused instance, but what happens if I
shut down and restart the server with a later recovery target?
Even a read-only query can mark a tuple with HEAP_XMIN_INVALID if
HeapTupleSatisfiesMVCC decides that a transaction aborted or crashed.
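To make the concern concrete, here is a toy model of that hint-bit decision. It is only a sketch, not the real HeapTupleSatisfiesMVCC: ToyTuple, ToyXactStatus and toy_xmin_visible are hypothetical names, and the real function handles many more cases. The two flag values are the ones from htup_details.h.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Flag values as defined in PostgreSQL's htup_details.h. */
#define HEAP_XMIN_COMMITTED 0x0100
#define HEAP_XMIN_INVALID   0x0200

/* Toy commit-log states (hypothetical; the real CLOG has more). */
typedef enum
{
    XACT_IN_PROGRESS,
    XACT_COMMITTED
} ToyXactStatus;

/* Hypothetical stand-in for a heap tuple header. */
typedef struct ToyTuple
{
    uint16_t infomask;
} ToyTuple;

/*
 * Toy visibility check for the inserting transaction.  Hint bits are
 * trusted first; otherwise the toy CLOG is consulted and the answer is
 * cached in the tuple as a hint bit.
 */
bool
toy_xmin_visible(ToyTuple *tup, ToyXactStatus clog, bool running_in_snapshot)
{
    if (tup->infomask & HEAP_XMIN_COMMITTED)
        return true;
    if (tup->infomask & HEAP_XMIN_INVALID)
        return false;
    if (running_in_snapshot)
        return false;           /* still running: no hint bit is set */
    if (clog == XACT_COMMITTED)
    {
        tup->infomask |= HEAP_XMIN_COMMITTED;
        return true;
    }
    /* Not running and not committed: assumed aborted or crashed. */
    tup->infomask |= HEAP_XMIN_INVALID;
    return false;
}
```

In this model, a check made while the inserting transaction is neither committed in CLOG nor running in the snapshot sets HEAP_XMIN_INVALID. If replay later continues and the transaction commits, the tuple stays invisible because the stale hint bit is consulted before CLOG.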
And then in bufmgr.c:MarkSharedBufferDirtyHint, we can see the
following conditions, which lead to an early return that prevents
this change from being flushed:
    if (XLogHintBitIsNeeded() && (lockstate & BM_PERMANENT))
    {
        /*
         * If we must not write WAL, due to a relfilelocator-specific
         * condition or being in recovery, don't dirty the page. We can
         * set the hint, just not dirty the page as a result so the hint
         * is lost when we evict the page or shutdown.
         *
         * See src/backend/storage/page/README for longer discussion.
         */
        if (RecoveryInProgress() ||
            RelFileLocatorSkippingWAL(BufTagGetRelFileLocator(&bufHdr->tag)))
            return;
        ...
Where
#define XLogHintBitIsNeeded() (wal_log_hints || DataChecksumsNeedWrite())
So if we turn off both wal_log_hints and data checksums,
XLogHintBitIsNeeded() is false and that early return is never reached,
even in recovery. With the patch, a plain SELECT in the paused state can
then cause data corruption.
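The effect of those two conditions can be sketched as a small truth function. This is a simplified model of only the quoted fragment, not the actual bufmgr.c code: HintEnv, hint_bit_wal_needed and hint_would_dirty_page are hypothetical names, and I am assuming the function goes on to dirty the page whenever the early return is not taken.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for server state; not PostgreSQL's real globals. */
typedef struct HintEnv
{
    bool wal_log_hints;     /* wal_log_hints setting */
    bool data_checksums;    /* stands in for DataChecksumsNeedWrite() */
    bool in_recovery;       /* stands in for RecoveryInProgress() */
    bool permanent_rel;     /* stands in for the BM_PERMANENT flag */
    bool skipping_wal;      /* stands in for RelFileLocatorSkippingWAL() */
} HintEnv;

/* Models the XLogHintBitIsNeeded() macro quoted above. */
bool
hint_bit_wal_needed(const HintEnv *env)
{
    return env->wal_log_hints || env->data_checksums;
}

/*
 * Returns true if, in this model, a hint-bit update dirties the page,
 * i.e. the early return in the quoted fragment is not taken.
 */
bool
hint_would_dirty_page(const HintEnv *env)
{
    if (hint_bit_wal_needed(env) && env->permanent_rel)
    {
        if (env->in_recovery || env->skipping_wal)
            return false;   /* hint set in memory only, page not dirtied */
    }
    return true;            /* assumption: page is dirtied and may be flushed */
}
```

With in_recovery true, the function returns false as soon as either wal_log_hints or data_checksums is on, and true when both are off, which is the configuration the test uses.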
See the attached TAP test, which demonstrates the problem.
subxid_corruption.pl
