On Tue, Aug 12, 2025 at 10:24 PM px shi <spxlyy...@gmail.com> wrote:

> How often does your primary node crash, and then not recover due to WALs
>> corruption or WALs not existing?
>>
>> If it's _ever_ happened, you should _fix that_ instead of rolling your
>> own WAL archival process.
>>
>
>  I once encountered a case where the recovery process failed to restore to
> the latest LSN due to missing WAL files in the archive. The root cause was
> multiple failovers between primary and standby. During one of the
> switchovers, the primary crashed before completing the archiving of all WAL
> files. When the standby was promoted to primary, it began archiving WAL
> files for the new timeline, resulting in a gap between the WAL files of the
> two timelines. Moreover, no base backup was taken during this period.
>
>

I am not sure what the problem is here either, other than something being
seriously wrong with the PostgreSQL and pgBackRest configuration.

The replica should be receiving WAL from the primary via streaming
replication over a replication slot, meaning the primary keeps the WAL
until the replica has caught up.  If the replica falls so far behind that
the retention limit (max_slot_wal_keep_size, or wal_keep_size /
wal_keep_segments when no slot is used) is exceeded, the replica's
restore_command can take over and fetch WAL from the archive to catch the
replica up.  This assumes hot_standby_feedback is on so that WAL replay
won't be delayed by snapshot conflicts on the replica.
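
Roughly, that looks like the sketch below (the stanza name "demo", the slot
name, the host, and the retention size are placeholders for whatever your
setup uses):

    # primary, postgresql.conf
    wal_level = replica
    max_wal_senders = 10
    max_replication_slots = 10
    max_slot_wal_keep_size = 64GB       # cap WAL retained for a lagging slot
    archive_mode = on
    archive_command = 'pgbackrest --stanza=demo archive-push %p'

    # replica, postgresql.conf (plus an empty standby.signal file)
    primary_conninfo = 'host=primary-host user=replicator'
    primary_slot_name = 'replica1'
    restore_command = 'pgbackrest --stanza=demo archive-get %f "%p"'
    hot_standby_feedback = on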

If all of the above is true, the replica should never lag behind unless
its disk I/O layer is badly undersized compared to the primary's.  S3 is
being talked about, so it makes me wonder about the disk I/O configuration
on the primary versus the replica.  I have seen this cause lag under high
load where the replica's I/O layer is the bottleneck.
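
One quick way to see where the lag is coming from is pg_stat_replication
on the primary (a sketch; the write/flush/replay lag columns need
PostgreSQL 10 or later):

    SELECT application_name, state,
           pg_wal_lsn_diff(pg_current_wal_lsn(), sent_lsn)   AS send_lag_bytes,
           pg_wal_lsn_diff(pg_current_wal_lsn(), replay_lsn) AS replay_lag_bytes,
           write_lag, flush_lag, replay_lag
    FROM pg_stat_replication;

If send_lag_bytes stays near zero while replay_lag keeps growing, the
network and WAL senders are fine and the replica's I/O is the likely
bottleneck.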

If pgBackRest can't keep up with WAL archiving then, as others have
stated, you need to configure asynchronous archiving.  The number of
workers depends on the load.  I have one server running 8 parallel workers
to archive 1 TB of WAL daily, and another server that generates around
10,000 WAL files in roughly 2 hours during maintenance tasks using 6
pgBackRest workers, all going to S3 buckets.
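
For reference, asynchronous archiving in pgBackRest is just a few settings
in pgbackrest.conf (a sketch; the stanza name, paths, and bucket are
placeholders, and the S3 credentials/region/endpoint options are omitted):

    [global]
    repo1-type=s3
    repo1-s3-bucket=example-backup-bucket
    archive-async=y
    spool-path=/var/spool/pgbackrest

    [global:archive-push]
    process-max=8           # parallel archive-push workers

    [demo]
    pg1-path=/var/lib/postgresql/16/main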

The above statement makes me wonder if there is some kind of
high-availability monitor running, such as pg_auto_failover, that is
promoting a replica and then converting the former primary into a replica
of the newly promoted node.

If that matches what is happening, it is very easy to mess up the
configuration for WAL archiving and backups.  Part of the process of
promoting a replica is making sure WAL archiving is working.  After being
promoted, the replica immediately kicks off autovacuum to rebuild things
such as the free space map (FSM), which generates a lot of WAL files.
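
A couple of sanity checks worth running right after a promotion (assuming
a stanza named "demo"): pgBackRest's check command pushes a WAL segment
through archive_command and verifies it landed in the repository, and
pg_stat_archiver shows whether the server itself is seeing archive
failures:

    pgbackrest --stanza=demo check

    SELECT archived_count, last_archived_wal,
           failed_count, last_failed_wal, last_failed_time
    FROM pg_stat_archiver;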

If you are losing WAL files, the configuration is wrong somewhere.

There is just not enough information on the series of events and the
configuration to tell what the root cause is, other than misconfiguration.


Thanks
Justin
