On Tue, Dec 16, 2025 at 9:54 AM Joao Foltran <[email protected]> wrote:
>
> Thank you for clarifying this behavior to me! I've tested it and it
> really doesn't hold back wals anymore once it has been invalidated due
> to the check inside ReplicationSlotsComputeRequiredLSN().
>
> You are correct that simply letting the slot be reacquired and
> continue working would be dangerous leading to possibly losing WALs.
> Can we then check if the standby was able to reconnect and start
> streaming successfully and then change the slots information for it to
> be considered inside ReplicationSlotsComputeRequiredLSN() again?
>
> Example:
>
> in XLogSendPhysical(), after we seen that the first record was sent:
>
> // In XLogSendPhysical() after XLogReadRecord() succeeds
> if (first_record_sent &&
>     MyReplicationSlot &&
>     SlotIsPhysical(MyReplicationSlot) &&
>     MyReplicationSlot->data.invalidated != RS_INVAL_NONE)
> {
>     // Clear invalidation - we successfully read WAL
> }
>
> This would clear the invalidation only after we know for sure that it
> can continue streaming wals without problem.
>

Slots can be invalidated for other reasons as well, such as
RS_INVAL_IDLE_TIMEOUT. It doesn't sound like a good idea to clear the
invalidation flag of the slot, because tomorrow we could decide to
invalidate slots for yet other reasons. I think it would be better to
do the required forensics on the invalid slot and then re-create it if
we want to retain the required WAL. Why don't you prefer to re-create
the slot once it has been invalidated?

--
With Regards,
Amit Kapila.
