Ensure that the sync slots reach a consistent state after promotion without losing data.

We were directly copying the LSN locations while syncing the slots on the
standby. Now, it is possible that at some particular restart_lsn there are
some running xacts, which means that if we start reading the WAL from that
location after promotion, we won't reach a consistent snapshot state at
that point. However, on the primary, we would have already been in a
consistent snapshot state at that restart_lsn, so we would have just
serialized the existing snapshot.

To avoid this problem, we will use the advance_slot functionality unless the
snapshot already exists at the synced restart_lsn location. This will help
us to ensure that snapbuilder/slot statuses are updated properly without
generating any changes. Note that a synced slot will remain RS_TEMPORARY
until decoding from the corresponding restart_lsn can reach a consistent
snapshot state, after which it will be marked as RS_PERSISTENT.

Per buildfarm

Author: Hou Zhijie
Reviewed-by: Bertrand Drouvot, Shveta Malik, Bharath Rupireddy, Amit Kapila
Discussion: https://postgr.es/m/os0pr01mb5716b3942ae49f3f725aca9294...@os0pr01mb5716.jpnprd01.prod.outlook.com

Branch
------
master

Details
-------
https://git.postgresql.org/pg/commitdiff/2ec005b4e29740f0d36e6646d149af192328b2ff

Modified Files
--------------
src/backend/replication/logical/logical.c         | 147 ++++++++++++++++++++-
src/backend/replication/logical/slotsync.c        | 133 +++++++++++++------
src/backend/replication/logical/snapbuild.c       |  23 ++++
src/backend/replication/slotfuncs.c               | 118 +----------------
src/include/replication/logical.h                 |   2 +
src/include/replication/snapbuild.h               |   2 +
.../recovery/t/040_standby_failover_slots_sync.pl |  97 +++++++++++---
7 files changed, 351 insertions(+), 171 deletions(-)
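
The decision described in the commit message can be illustrated with a small toy model. This is only a sketch under stated assumptions, not the code from the commit: ToySlot, RemoteSlot, SyncPath and toy_sync_slot() are hypothetical names, the two boolean arguments stand in for "a serialized snapshot exists at the synced restart_lsn" and "decoding reached a consistent snapshot state", and the advance path is modeled only by its outcome rather than by reading any WAL.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t Lsn;               /* hypothetical stand-in for a WAL location */

typedef enum { SLOT_TEMPORARY, SLOT_PERSISTENT } SlotPersistency;
typedef enum { SYNC_COPIED_DIRECTLY, SYNC_ADVANCED_BY_DECODING } SyncPath;

typedef struct {
    Lsn restart_lsn;
    Lsn confirmed_flush;
    SlotPersistency persistency;    /* starts as SLOT_TEMPORARY on the standby */
} ToySlot;

typedef struct {
    Lsn restart_lsn;
    Lsn confirmed_flush;
} RemoteSlot;

/*
 * Toy version of one sync cycle for a single slot.
 *
 * snapshot_exists_at_restart_lsn: assumed input; true if a serialized
 *     snapshot is already available at the remote restart_lsn.
 * decoding_reached_consistency: assumed input; the result the advance
 *     path would report after decoding without emitting changes.
 */
SyncPath
toy_sync_slot(ToySlot *local, const RemoteSlot *remote,
              bool snapshot_exists_at_restart_lsn,
              bool decoding_reached_consistency)
{
    bool     consistent;
    SyncPath path;

    assert(local != NULL && remote != NULL);

    if (snapshot_exists_at_restart_lsn) {
        /* Safe to copy the positions: decoding can restore the snapshot. */
        local->restart_lsn = remote->restart_lsn;
        local->confirmed_flush = remote->confirmed_flush;
        consistent = true;
        path = SYNC_COPIED_DIRECTLY;
    } else {
        /*
         * Simplification: a real advance would derive restart_lsn by
         * decoding; here only the target position is recorded.
         */
        local->confirmed_flush = remote->confirmed_flush;
        consistent = decoding_reached_consistency;
        path = SYNC_ADVANCED_BY_DECODING;
    }

    /* The slot is persisted only once a consistent state was reached. */
    if (consistent)
        local->persistency = SLOT_PERSISTENT;

    return path;
}
```

In this model a slot that takes the advance path without reaching consistency stays temporary, and a later call (a later sync cycle) can still promote it to persistent.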