On Tue, Jan 9, 2024 at 5:44 PM Zhijie Hou (Fujitsu) <houzj.f...@fujitsu.com> wrote:
>
> V58-0002
>

+static bool
+synchronize_one_slot(WalReceiverConn *wrconn, RemoteSlot *remote_slot)
{
...
+    /* Slot ready for sync, so sync it. */
+    else
+    {
+        /*
+         * Sanity check: With hot_standby_feedback enabled and
+         * invalidations handled appropriately as above, this should never
+         * happen.
+         */
+        if (remote_slot->restart_lsn < slot->data.restart_lsn)
+            elog(ERROR,
+                 "cannot synchronize local slot \"%s\" LSN(%X/%X)"
+                 " to remote slot's LSN(%X/%X) as synchronization"
+                 " would move it backwards", remote_slot->name,
+                 LSN_FORMAT_ARGS(slot->data.restart_lsn),
+                 LSN_FORMAT_ARGS(remote_slot->restart_lsn));
...
}

I was thinking about the above code in the patch, and as far as I can
tell this can only occur if a slot with the same name is re-created with
an older restart_lsn after the existing slot is dropped. Normally, the
newly created slot (with the same name) will have a higher restart_lsn,
but one can mimic the problem by copying some older slot using
pg_copy_logical_replication_slot().

Contrary to what the comment suggests, I don't think this can happen
even if hot_standby_feedback is temporarily set to off. That can only
lead to invalidated slots on the standby.

To close the above race, I could think of the following ways:
1. Drop and re-create the slot.
2. Emit LOG/WARNING in this case, and once the remote_slot's LSN moves
ahead of the local_slot's LSN, we can update it; but as mentioned in your
previous comment, we need to update all other fields as well. If we
follow this approach, then we probably need a check for catalog_xmin as
well.

Related to this, the other case that needs some handling is when the
remote_slot's restart_lsn is greater than the local_slot's restart_lsn
but the remote slot is a re-created slot with the same name. In that
case, I think other properties like 'two_phase' and 'plugin' could be
different. So, is simply copying those sufficient, or do we need to do
something else as well?

--
With Regards,
Amit Kapila.
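
For illustration only, option 2 above could be sketched roughly as
follows. This is a standalone sketch, not code from the patch: SlotPos,
xid_precedes() and remote_slot_is_syncable() are hypothetical names, the
typedefs are stand-ins for the server's XLogRecPtr and TransactionId,
and fprintf() stands in for ereport(LOG, ...). It assumes that a remote
slot whose restart_lsn or catalog_xmin is behind the local copy is
simply skipped until it moves ahead again.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Stand-ins for the server's XLogRecPtr and TransactionId (assumption). */
typedef uint64_t XLogRecPtr;
typedef uint32_t TransactionId;

/* Hypothetical, trimmed-down slot position; not the patch's RemoteSlot. */
typedef struct SlotPos
{
    XLogRecPtr    restart_lsn;
    TransactionId catalog_xmin;
} SlotPos;

/* Wraparound-aware comparison for normal xids: true if a is older than b. */
static bool
xid_precedes(TransactionId a, TransactionId b)
{
    return (int32_t) (a - b) < 0;
}

/*
 * Return true if the local slot may be overwritten with the remote values.
 * If the remote slot is behind in either restart_lsn or catalog_xmin, log
 * a message and return false instead of raising an error, so the caller
 * can retry in a later sync cycle.
 */
static bool
remote_slot_is_syncable(const char *name,
                        const SlotPos *remote, const SlotPos *local)
{
    if (remote->restart_lsn < local->restart_lsn ||
        xid_precedes(remote->catalog_xmin, local->catalog_xmin))
    {
        fprintf(stderr,
                "LOG: skipping sync of slot \"%s\": remote LSN(%X/%X), "
                "catalog xmin %u is behind local LSN(%X/%X), catalog xmin %u\n",
                name,
                (unsigned int) (remote->restart_lsn >> 32),
                (unsigned int) remote->restart_lsn,
                (unsigned int) remote->catalog_xmin,
                (unsigned int) (local->restart_lsn >> 32),
                (unsigned int) local->restart_lsn,
                (unsigned int) local->catalog_xmin);
        return false;
    }

    return true;
}
```

Note that this sketch does not address the second case raised above: a
re-created slot whose restart_lsn is already ahead of the local one
would pass this check even though 'two_phase' or 'plugin' may differ.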