Hi Hackers,

When a standby cannot connect to the primary, it switches the XLog source
from streaming to archive and stays in that state for as long as it can
fetch WAL from the archive location. On a server with high WAL activity,
fetching WAL from the archive is typically slower than streaming it from
the primary, so the standby may never exit that state. This not only
increases the lag on the standby but also adversely impacts the primary:
WAL accumulates there, and vacuum is unable to remove dead tuples. As a
mitigation, DBAs can remove or advance the slot, or remove the
restore_command on the standby, but this is manual work that I am trying
to avoid. I would like to propose the following; please let me know your
thoughts.

   - Automatically attempt to switch the source from archive to streaming
   after replaying N WAL segments from the archive, provided
   primary_conninfo is set. N is governed by the GUC
   retry_primary_conn_after_wal_segments.
   - When retry_primary_conn_after_wal_segments is set to -1, the feature
   is disabled.
   - When the retry attempt fails, switch back to the archive.
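
The three rules above can be sketched roughly as follows. This is only an
illustrative Python model of the decision made after each segment is
replayed from the archive, not the actual patch. RecoveryState,
after_segment_replayed, and try_streaming are hypothetical names, and the
sketch assumes N is a positive integer and that the segment counter is
reset after every retry attempt, whether or not it succeeds.

```python
from dataclasses import dataclass
from typing import Callable

ARCHIVE = "archive"
STREAM = "stream"


@dataclass
class RecoveryState:
    # Hypothetical bookkeeping; the field names are assumptions.
    source: str = ARCHIVE
    segments_since_retry: int = 0


def after_segment_replayed(
    state: RecoveryState,
    retry_after_segments: int,        # value of retry_primary_conn_after_wal_segments
    primary_conninfo_set: bool,
    try_streaming: Callable[[], bool],  # returns True if the primary accepted us
) -> str:
    """Decide the WAL source after one segment was replayed from the archive."""
    if state.source != ARCHIVE:
        return state.source

    # -1 disables the feature; without primary_conninfo there is nothing to retry.
    if retry_after_segments == -1 or not primary_conninfo_set:
        return ARCHIVE

    state.segments_since_retry += 1
    if state.segments_since_retry < retry_after_segments:
        return ARCHIVE

    # N segments replayed: reset the counter and attempt to stream.
    state.segments_since_retry = 0
    if try_streaming():
        state.source = STREAM
    # On failure the source is left as archive, i.e. we fall back.
    return state.source
```

With N = 3 and a reachable primary, the first two calls keep the source at
archive and the third switches to streaming. With -1, no attempt is ever
made.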

Thanks,
Satya
