Hi Hackers,

I am considering implementing an RPO (recovery point objective) enforcement feature for Postgres, where WAL writes on the primary are stalled when the WAL distance between the primary and the standby exceeds a configured threshold (replica_lag_in_bytes). This feature is particularly useful in disaster recovery setups where the primary and standby are in different regions: synchronous replication can't be set up for latency and performance reasons, yet some level of RPO enforcement is still required.

The idea here is to calculate the lag between the primary and the standby (asynchronous?) server during XLogInsert and block the caller until the lag is less than the threshold value. We can calculate the max lag by iterating over ReplicationSlotCtl->replication_slots. If this is not something we want to do in core, at least adding a hook for XLogInsert would be of great value.

A few other scenarios I can think of with the hook are:

1. Enforcing RPO as described above.
2. Enforcing a rate limit and slow throttling when a sync standby is falling behind (could be flush lag or replay lag).
3. Transactional log rate governance, useful for cloud providers to offer SKU sizes based on allowed WAL writes.

Thoughts?

Thanks,
Satya
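
P.S. To make the lag check concrete, here is a rough, standalone sketch of the threshold logic only. It is not backend code: WalPos stands in for XLogRecPtr, and SlotModel, max_replica_lag_bytes() and should_stall_wal_insert() are hypothetical names invented for illustration. The standby_pos field is an assumption as well; which position to use (flush, replay, or a slot LSN) is still an open question. I have also assumed that a threshold of 0 means the feature is disabled.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for XLogRecPtr: a byte position in the WAL stream. */
typedef uint64_t WalPos;

/* Hypothetical, simplified model of one replication slot. */
typedef struct SlotModel
{
    bool    in_use;         /* slot is allocated */
    WalPos  standby_pos;    /* assumed: WAL position the standby has reached */
} SlotModel;

/* Return the largest lag in bytes across all in-use slots, or 0 if none. */
static uint64_t
max_replica_lag_bytes(const SlotModel *slots, size_t nslots, WalPos insert_pos)
{
    uint64_t    max_lag = 0;

    for (size_t i = 0; i < nslots; i++)
    {
        uint64_t    lag;

        if (!slots[i].in_use)
            continue;
        if (slots[i].standby_pos >= insert_pos)
            continue;           /* this standby has caught up */

        lag = insert_pos - slots[i].standby_pos;
        if (lag > max_lag)
            max_lag = lag;
    }
    return max_lag;
}

/*
 * Decide whether a WAL insert should be stalled: true when the worst lag
 * exceeds the threshold. A threshold of 0 is assumed to disable the check.
 */
static bool
should_stall_wal_insert(const SlotModel *slots, size_t nslots,
                        WalPos insert_pos, uint64_t replica_lag_in_bytes)
{
    if (replica_lag_in_bytes == 0)
        return false;
    return max_replica_lag_bytes(slots, nslots, insert_pos) > replica_lag_in_bytes;
}
```

In the real backend the caller would presumably wait and re-check until this returns false, and the scan would need appropriate locking; neither is shown here.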