Hi,
I had another off-list discussion with Fujii-san, and according to the
following manual[1], it seems that a transaction with an overflowed
subtransaction is already considered inconsistent:
Reaching a consistent state can also be delayed in the presence of
both of these conditions:
- A write transaction has more than 64 subtransactions
- Very long-lived write transactions
IIUC, the manual suggests that both conditions must be met (recovery
reaching at least minRecoveryPoint and no overflowed subtransactions)
for the standby to be considered consistent.
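
To make that reading concrete, here is a minimal standalone sketch of the
two-part condition. It is only a model of my interpretation of the manual:
the type and function names (RecoveryModel, reached_min_recovery_point,
subxids_overflowed, is_consistent) are hypothetical and are not the actual
PostgreSQL code, and the constant simply mirrors the 64-subtransaction limit
quoted above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for illustration; not PostgreSQL's real structures. */
typedef uint64_t Lsn;

/* Mirrors the "more than 64 subtransactions" limit quoted from the manual. */
#define MAX_CACHED_SUBXIDS 64

typedef struct RecoveryModel
{
    Lsn replayed_upto;        /* how far WAL replay has progressed */
    Lsn min_recovery_point;   /* point recovery must reach at minimum */
    int max_subxids;          /* largest subxact count of any running write xact */
} RecoveryModel;

/* Condition 1: replay has progressed to at least the minimum recovery point. */
static bool
reached_min_recovery_point(const RecoveryModel *r)
{
    return r->replayed_upto >= r->min_recovery_point;
}

/* Condition 2 (negated): some write transaction exceeds the cached limit. */
static bool
subxids_overflowed(const RecoveryModel *r)
{
    assert(r->max_subxids >= 0);
    return r->max_subxids > MAX_CACHED_SUBXIDS;
}

/* "Consistent" in the manual's sense, as read above: both conditions hold. */
static bool
is_consistent(const RecoveryModel *r)
{
    return reached_min_recovery_point(r) && !subxids_overflowed(r);
}
```

Under this model, a standby that has replayed past minRecoveryPoint while a
write transaction holds 65 subtransactions satisfies
reached_min_recovery_point() but not is_consistent(), which is the case the
rest of this mail is about.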
OTOH, the following log message is emitted even when subtransactions
have overflowed, which appears to contradict the definition of
consistency mentioned above:
LOG: consistent recovery state reached
This log message is triggered when recovery progresses beyond
minRecoveryPoint (according to CheckRecoveryConsistency()).
However, since this state does not satisfy 'consistency' defined in the
manual, I think it would be more accurate to log that it has merely
reached the "minimum recovery point".
Furthermore, it may be better to emit the above log message only when
recovery has progressed beyond minRecoveryPoint and there are no
overflowed subtransactions.
The attached patch does this.
Additionally, renaming variables such as reachedConsistency in
CheckRecoveryConsistency might also be appropriate.
However, in the attached patch, I have left them unchanged for now.
On 2025-03-25 00:55, Fujii Masao wrote:
> - case CAC_NOTCONSISTENT:
> + case CAC_NOTCONSISTENT_OR_OVERFLOWED:
> This new name seems a bit too long. I'm OK to leave the name as it is.
> Or, something like CAC_NOTHOTSTANDBY seems simpler and better to me.
Beyond just the length issue, given the understanding outlined above, I
now think CAC_NOTCONSISTENT does not actually need to be changed.
> In high-availability.sgml, the "Administrator's Overview" section already
> describes the conditions for accepting hot standby connections.
> This section should also be updated accordingly.
Agreed.
I have updated this section to mention that the resolution is to close
the problematic transaction.
OTOH, the changes made in the v2 patch seem unnecessary, since the concept
of 'consistent' is already explained in the "Administrator's Overview."
> - errdetail("Consistent recovery state has not been yet reached.")));
> + errdetail("Consistent recovery state has not been yet reached, or snappshot is pending because subtransaction is overflowed."),
Given the above understanding, "or" is not appropriate in this context,
so I left this message unchanged.
Instead, I have added an errhint. The phrasing in the hint message
aligns with the manual, allowing users to search for this hint and find
the newly added resolution instructions.
What do you think?
[1] https://www.postgresql.org/docs/devel/hot-standby.html
--
Regards,
--
Atsushi Torikoshi
Seconded from NTT DATA GROUP CORPORATION to SRA OSS K.K.
From 2f552c683cfc3f4ba69a33f279ec80bca60e1c93 Mon Sep 17 00:00:00 2001
From: Atsushi Torikoshi <torikos...@oss.nttdata.com>
Date: Thu, 27 Mar 2025 10:51:37 +0900
Subject: [PATCH v3] Add hint message when hot standby is inaccessible
Currently, when hot standby is inaccessible due to an overflowed
subtransaction, it is difficult for users to determine the cause
since there is no user-level log message indicating that.
This patch adds a hint message to indicate the reasons.
Additionally, there is an inconsistency between the documentation and
the log messages regarding the definition of 'consistent' recovery.
The documentation states that a consistent state requires both that
recovery has gone beyond minRecoveryPoint and that there are no overflowed
subtransactions, whereas the source code considers only
minRecoveryPoint.
This patch updates the log message to align with the documentation.
---
doc/src/sgml/high-availability.sgml | 1 +
src/backend/access/transam/xlogrecovery.c | 5 ++++-
src/backend/tcop/backend_startup.c | 4 +++-
3 files changed, 8 insertions(+), 2 deletions(-)
diff --git a/doc/src/sgml/high-availability.sgml b/doc/src/sgml/high-availability.sgml
index acf3ac0601..6ceb57b0a0 100644
--- a/doc/src/sgml/high-availability.sgml
+++ b/doc/src/sgml/high-availability.sgml
@@ -1991,6 +1991,7 @@ LOG: database system is ready to accept read-only connections
</listitem>
</itemizedlist>
+ The former case can be resolved by closing the transaction.
If you are running file-based log shipping ("warm standby"), you might need
to wait until the next WAL file arrives, which could be as long as the
<varname>archive_timeout</varname> setting on the primary.
diff --git a/src/backend/access/transam/xlogrecovery.c b/src/backend/access/transam/xlogrecovery.c
index 2c19013c98..8152f90c99 100644
--- a/src/backend/access/transam/xlogrecovery.c
+++ b/src/backend/access/transam/xlogrecovery.c
@@ -2249,7 +2249,7 @@ CheckRecoveryConsistency(void)
reachedConsistency = true;
ereport(LOG,
- (errmsg("consistent recovery state reached at %X/%X",
+ (errmsg("minimum recovery point reached at %X/%X",
LSN_FORMAT_ARGS(lastReplayedEndRecPtr))));
}
@@ -2268,6 +2268,9 @@ CheckRecoveryConsistency(void)
SpinLockRelease(&XLogRecoveryCtl->info_lck);
LocalHotStandbyActive = true;
+ ereport(LOG,
+ (errmsg("consistent recovery state reached at %X/%X",
+ LSN_FORMAT_ARGS(lastReplayedEndRecPtr))));
SendPostmasterSignal(PMSIGNAL_BEGIN_HOT_STANDBY);
}
diff --git a/src/backend/tcop/backend_startup.c b/src/backend/tcop/backend_startup.c
index 27c0b3c2b0..252443f4ca 100644
--- a/src/backend/tcop/backend_startup.c
+++ b/src/backend/tcop/backend_startup.c
@@ -311,7 +311,9 @@ BackendInitialize(ClientSocket *client_sock, CAC_state cac)
ereport(FATAL,
(errcode(ERRCODE_CANNOT_CONNECT_NOW),
errmsg("the database system is not yet accepting connections"),
- errdetail("Consistent recovery state has not been yet reached.")));
+ errdetail("Consistent recovery state has not been yet reached."),
+ errhint("Minimum recovery point has not been yet reached or a write transaction may have more than %d subtransactions.",
+ PGPROC_MAX_CACHED_SUBXIDS)));
else
ereport(FATAL,
(errcode(ERRCODE_CANNOT_CONNECT_NOW),
base-commit: c325a7633fcb33dbd73f46ddbbe91e95ddf3b227
--
2.48.1