On Tue, Feb 13, 2024 at 9:38 AM Zhijie Hou (Fujitsu)
<[email protected]> wrote:
>
> Here is the V85_2 patch set that added the test and fixed one typo,
> there are no other code changes.
>
Few comments on the latest changes:
==============================
1.
+# Confirm that the invalidated slot has been dropped.
+$standby1->wait_for_log(qr/dropped replication slot "lsub1_slot" of dbid 5/,
+ $log_offset);
Is it okay to hardcode dbid 5? I am a bit worried that it can lead to
instability in the test.
2.
+check_primary_info(WalReceiverConn *wrconn, int elevel)
+{
..
+ bool primary_info_valid;
I don't think for 0001, we need an elevel as an argument, so let's
remove it. Additionally, can we change the variable name
primary_info_valid to primary_slot_valid? Also, can we change the
function name to validate_remote_info() as the remote can be both
primary or standby?
3.
+SyncReplicationSlots(WalReceiverConn *wrconn)
+{
+ PG_ENSURE_ERROR_CLEANUP(slotsync_failure_callback, PointerGetDatum(wrconn));
+ {
+ check_primary_info(wrconn, ERROR);
+
+ synchronize_slots(wrconn);
+ }
+ PG_END_ENSURE_ERROR_CLEANUP(slotsync_failure_callback,
PointerGetDatum(wrconn));
+
+ walrcv_disconnect(wrconn);
It is better to disconnect in the caller where we have made the connection.
--
With Regards,
Amit Kapila.
diff --git a/doc/src/sgml/logicaldecoding.sgml
b/doc/src/sgml/logicaldecoding.sgml
index bfa2a13377..a9ff6acd31 100644
--- a/doc/src/sgml/logicaldecoding.sgml
+++ b/doc/src/sgml/logicaldecoding.sgml
@@ -365,19 +365,18 @@ postgres=# select * from
pg_logical_slot_get_changes('regression_slot', NULL, NU
<title>Replication Slot Synchronization</title>
<para>
The logical replication slots on the primary can be synchronized to
- the hot standby by enabling <literal>failover</literal> during slot
- creation (e.g., using the <literal>failover</literal> parameter of
+ the hot standby by using the <literal>failover</literal> parameter of
<link linkend="pg-create-logical-replication-slot">
<function>pg_create_logical_replication_slot</function></link>, or
using the <link linkend="sql-createsubscription-params-with-failover">
<literal>failover</literal></link> option of
- <command>CREATE SUBSCRIPTION</command>), and then calling
+ <command>CREATE SUBSCRIPTION</command> during slot creation, and then
calling
<link linkend="pg-sync-replication-slots">
<function>pg_sync_replication_slots</function></link>
on the standby. For the synchronization to work, it is mandatory to
- have a physical replication slot between the primary and the standby
(e.g.,
+ have a physical replication slot between the primary and the standby aka
<link
linkend="guc-primary-slot-name"><varname>primary_slot_name</varname></link>
- should be configured on the standby), and
+ should be configured on the standby, and
<link
linkend="guc-hot-standby-feedback"><varname>hot_standby_feedback</varname></link>
must be enabled on the standby. It is also necessary to specify a valid
<literal>dbname</literal> in the
diff --git a/src/backend/replication/logical/slotsync.c
b/src/backend/replication/logical/slotsync.c
index feb04e1451..eb238cef92 100644
--- a/src/backend/replication/logical/slotsync.c
+++ b/src/backend/replication/logical/slotsync.c
@@ -304,6 +304,12 @@ reserve_wal_for_local_slot(XLogRecPtr restart_lsn)
* number. However, if no WAL segment files have been removed
by a
* checkpoint since startup, we need to search for the oldest
segment
* file from the current timeline existing in XLOGDIR.
+ *
+ * XXX: Currently, we are searching for the oldest segment in
the
+ * current timeline as there is less chance of the slot's
restart_lsn
+ * from being some prior timeline, and even if it happens, in
the worst
+ * case, we will wait to sync till the slot's restart_lsn moved
to the
+ * current timeline.
*/
oldest_segno = XLogGetLastRemovedSegno() + 1;
@@ -685,11 +691,10 @@ synchronize_slots(WalReceiverConn *wrconn)
}
/*
- * Checks the primary server info.
+ * Checks the remote server info.
*
- * Using the specified primary server connection, check whether we are a
- * cascading standby. It also validates primary_slot_name for non-cascading
- * standbys.
+ * We ensure that the 'primary_slot_name' exists on the remote server and
+ * the remote server is not a standby node.
*/
static void
check_primary_info(WalReceiverConn *wrconn, int elevel)