Dear Alexander, Amit,

Alexander Korotkov wrote:
> Also, I've changed ReplicationSlotsComputeRequiredLSN() call to
> CheckPointReplicationSlots() to update required LSN after
> SaveSlotToPath() updated last_saved_restart_lsn.  This helps to pass
> checks in 001_stream_rep.pl without additional hacks.

Thank you for the improvement and for preparing the patch. I confirm that the
test now passes without additional hacks.

I still do not understand why this solution is favored. It is, in my opinion,
not a backward-compatible solution. In any case, I'm OK to go with this patch.
If needed, I can prepare a backward-compatible solution where the
last_saved_restart_lsn values would be stored in another place in shared memory
rather than in the ReplicationSlot struct.

I would still like to add my two cents to the discussion.

The purpose of the xmin value is to prevent tuples from being vacuumed away.
The slots' restart_lsn values are used to calculate the oldest LSN that keeps
WAL segments from being removed at checkpoint. These two processes are largely
independent.
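
To make that concrete, here is a simplified, self-contained sketch of how the
two horizons are aggregated over the slots. It is only a model of what
ReplicationSlotsComputeRequiredXmin() and ReplicationSlotsComputeRequiredLSN()
do; the toy_* names and the ToySlot struct are mine, and wraparound-aware
TransactionId comparison is deliberately left out:

#include <stdbool.h>
#include <stdint.h>

typedef uint64_t XLogRecPtr;
typedef uint32_t TransactionId;

#define InvalidXLogRecPtr    ((XLogRecPtr) 0)
#define InvalidTransactionId ((TransactionId) 0)

/* toy stand-in for ReplicationSlot: only the fields relevant here */
typedef struct ToySlot
{
    bool          in_use;
    TransactionId effective_xmin;
    TransactionId effective_catalog_xmin;
    XLogRecPtr    restart_lsn;
} ToySlot;

/*
 * Oldest xmin over all slots; this is what feeds the vacuum horizon.
 * (The real code keeps a separate data horizon and catalog horizon.)
 */
static TransactionId
toy_compute_required_xmin(const ToySlot *slots, int nslots)
{
    TransactionId agg = InvalidTransactionId;

    for (int i = 0; i < nslots; i++)
    {
        if (!slots[i].in_use)
            continue;
        if (slots[i].effective_xmin != InvalidTransactionId &&
            (agg == InvalidTransactionId || slots[i].effective_xmin < agg))
            agg = slots[i].effective_xmin;
        if (slots[i].effective_catalog_xmin != InvalidTransactionId &&
            (agg == InvalidTransactionId || slots[i].effective_catalog_xmin < agg))
            agg = slots[i].effective_catalog_xmin;
    }
    return agg;      /* vacuum must not remove tuples newer than this */
}

/* oldest restart_lsn over all slots; feeds WAL truncation at checkpoint */
static XLogRecPtr
toy_compute_required_lsn(const ToySlot *slots, int nslots)
{
    XLogRecPtr  min_lsn = InvalidXLogRecPtr;

    for (int i = 0; i < nslots; i++)
    {
        if (!slots[i].in_use || slots[i].restart_lsn == InvalidXLogRecPtr)
            continue;
        if (min_lsn == InvalidXLogRecPtr || slots[i].restart_lsn < min_lsn)
            min_lsn = slots[i].restart_lsn;
    }
    return min_lsn;  /* checkpoint must keep WAL segments at or after this */
}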

Logical slots are advanced in two steps. In the first step, the logical
decoding machinery periodically sets consistent candidate values for
catalog_xmin and restart_lsn. In the second step, when
LogicalConfirmReceivedLocation is called, the candidate values are applied to
catalog_xmin and restart_lsn based on the confirmed LSN, and the slot is saved
with these consistent values. It is important that the candidate values are
consistent; decoding guarantees that. In case of a crash, we must guarantee
that the catalog_xmin and restart_lsn values loaded from disk are consistent
and valid for logical slots. LogicalConfirmReceivedLocation keeps this
consistency by updating them from the consistent candidate values in a single
operation.
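
In toy C (reusing the types from the sketch above), the second step looks
roughly like this. It is only a paraphrase of LogicalConfirmReceivedLocation(),
not the real code: the ToyLogicalSlot struct and toy_* names are mine, and
locking and error handling are omitted:

/* toy logical slot: persistent part plus the in-memory candidates */
typedef struct ToyLogicalSlot
{
    /* persistent part, what gets written to disk */
    TransactionId catalog_xmin;
    XLogRecPtr    restart_lsn;
    XLogRecPtr    confirmed_flush;

    /* candidates produced by the decoding machinery (step one) */
    TransactionId candidate_catalog_xmin;
    XLogRecPtr    candidate_xmin_lsn;      /* confirm point for catalog_xmin */
    XLogRecPtr    candidate_restart_lsn;
    XLogRecPtr    candidate_restart_valid; /* confirm point for restart_lsn */

    /* in-memory only: the value the global horizons actually use */
    TransactionId effective_catalog_xmin;

    bool          dirty;
} ToyLogicalSlot;

/* step two: the subscriber confirmed it has flushed everything up to lsn */
static void
toy_confirm_received_location(ToyLogicalSlot *slot, XLogRecPtr lsn)
{
    /* promote only candidates whose confirm point has been passed */
    if (slot->candidate_xmin_lsn != InvalidXLogRecPtr &&
        slot->candidate_xmin_lsn <= lsn)
    {
        slot->catalog_xmin = slot->candidate_catalog_xmin;
        slot->candidate_xmin_lsn = InvalidXLogRecPtr;
        slot->dirty = true;
    }

    if (slot->candidate_restart_valid != InvalidXLogRecPtr &&
        slot->candidate_restart_valid <= lsn)
    {
        slot->restart_lsn = slot->candidate_restart_lsn;
        slot->candidate_restart_valid = InvalidXLogRecPtr;
        slot->dirty = true;
    }

    slot->confirmed_flush = lsn;
    /* saving the slot and advancing the effective values come next, below */
}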

We have to guarantee that we use the values saved to disk when calculating the
xmin horizon and the slots' oldest LSN. For this purpose, effective_catalog_xmin
is used: we update effective_catalog_xmin in LogicalConfirmReceivedLocation just
after saving the slot to disk. Another place where we update
effective_catalog_xmin is when the walsender receives a hot standby feedback
message.
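
The ordering here matters. A sketch of what I mean, continuing the toy code:
toy_save_slot_to_disk() is a stand-in for SaveSlotToPath(), and the two
recompute helpers stand for ReplicationSlotsComputeRequiredXmin() and
ReplicationSlotsComputeRequiredLSN():

static void toy_save_slot_to_disk(ToyLogicalSlot *slot);
static void toy_recompute_required_xmin(void);
static void toy_recompute_required_lsn(void);

/*
 * Persist first, only then let the new values move the global horizons.
 * If effective_catalog_xmin were advanced before the save and we crashed
 * in between, vacuum could already have removed catalog tuples that the
 * on-disk catalog_xmin still claims to protect.
 */
static void
toy_persist_and_advance(ToyLogicalSlot *slot)
{
    if (!slot->dirty)
        return;

    toy_save_slot_to_disk(slot);                       /* 1. durable state */
    slot->dirty = false;

    slot->effective_catalog_xmin = slot->catalog_xmin; /* 2. advance        */

    toy_recompute_required_xmin();                     /* 3. refresh globals */
    toy_recompute_required_lsn();
}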

Since we have two independent processes (vacuuming and checkpointing), we can
calculate the xmin horizon and the oldest WAL LSN independently (at different
times) from the values saved to disk. Note that these values are updated in a
non-atomic way.
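
To illustrate the non-atomic part: in the toy model the two global values are
recomputed by separate calls and read by their consumers at unrelated moments
(slot_xmin_horizon and slot_minimum_lsn are again my names, not the real
variables):

#define TOY_NSLOTS 8
static ToySlot toy_slots[TOY_NSLOTS]; /* the slot array from the first sketch */

static TransactionId slot_xmin_horizon; /* consumed by vacuum           */
static XLogRecPtr    slot_minimum_lsn;  /* consumed by the checkpointer */

static void
toy_recompute_required_xmin(void)
{
    slot_xmin_horizon = toy_compute_required_xmin(toy_slots, TOY_NSLOTS);
}

static void
toy_recompute_required_lsn(void)
{
    slot_minimum_lsn = toy_compute_required_lsn(toy_slots, TOY_NSLOTS);
}

/*
 * Vacuum may run between the two recompute calls and see a new xmin horizon
 * paired with a stale minimum LSN (or the other way around).  That is
 * harmless exactly because the consumers are independent: tuple removal
 * only looks at the xmin horizon, WAL removal only looks at the minimum LSN.
 */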

The xmin value is set when the node receives hot standby feedback, and it is
used to keep tuples from being vacuumed, just as catalog_xmin is for the
decoding machinery. I'm not sure xmin is applicable to logical replication.

The confirmed flush LSN is used as the start point when a peer node doesn't
provide a start LSN, and to check that the requested start LSN is not older
than the latest confirmed flush LSN. Saving the slot to disk on every call of
LogicalConfirmReceivedLocation doesn't avoid conflicts completely, but it does
decrease their probability. So I'm still not sure we need to save logical slots
on every advance to avoid conflicts, because it doesn't help in general; the
conflicts should be resolved by other means.
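
For reference, a rough paraphrase of how the start point is chosen when
decoding starts (the real logic lives in CreateDecodingContext(); the toy
function below is mine and skips all validation and logging):

/* pick the LSN decoding should start from for this slot */
static XLogRecPtr
toy_choose_start_lsn(const ToyLogicalSlot *slot, XLogRecPtr requested_lsn)
{
    /* peer did not provide a start LSN: continue from confirmed_flush */
    if (requested_lsn == InvalidXLogRecPtr)
        return slot->confirmed_flush;

    /* everything before confirmed_flush was already streamed; forward */
    if (requested_lsn < slot->confirmed_flush)
        return slot->confirmed_flush;

    return requested_lsn;
}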

Since we truncate old WAL segments only at checkpoint, I believe it is OK to
calculate the oldest LSN only at the beginning of the checkpoint, as it was
done in the alternative solution. I think we could even update the xmin horizon
only at checkpoint, but horizon advancement would be lazier in that case.
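
A sketch of the ordering I have in mind for the alternative approach (toy code;
in the real tree these steps are spread across CreateCheckPoint(),
CheckPointReplicationSlots() and RemoveOldXlogFiles(), and the three helpers
below are hypothetical stand-ins):

/* minimum over the restart_lsn values last written to disk */
static XLogRecPtr toy_last_saved_required_lsn(void);
/* flush every dirty slot, as CheckPointReplicationSlots() does */
static void toy_save_all_dirty_slots(void);
/* drop WAL segments that lie entirely before the cutoff */
static void toy_remove_old_wal_segments(XLogRecPtr cutoff);

static void
toy_checkpoint(void)
{
    /*
     * Take the cutoff once, at the start, from the values last written to
     * disk.  It can only be more conservative than whatever the slots
     * advance to while the checkpoint is running.
     */
    XLogRecPtr  keep_from = toy_last_saved_required_lsn();

    /* flush dirty slots as part of the checkpoint, as usual */
    toy_save_all_dirty_slots();

    /* remove only segments older than the cutoff taken above */
    toy_remove_old_wal_segments(keep_from);
}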

Taking these thoughts into account, I can't see any problems with the
alternative patch where the oldest WAL LSN is calculated only during
checkpoint.

With best regards,
Vitaly


