On Wed, Jul 30, 2025 at 12:00 AM Doruk Yilmaz <do...@mixrank.com> wrote:
>
> On Mon, Jul 29, 2025 at 8:13 AM Amit Kapila <amit.kapil...@gmail.com> wrote:
> > That is true but I still feel there has to be some mechanism where we
> > can catch and give an ERROR to the user, if it doesn't follow the
> > same. For example, pg_replication_origin_advance() always allows going
> > backwards in terms of LSN which means if one doesn't follow commit
> > order, it can lead to breaking the replication as after restart the
> > client can ask to start replication from some prior point.
>
> If you have any ideas for safeguards or API changes, I'd be happy to
> help implement them or discuss them.
>
> > Can you tell us the use case? Did you also intend to use it for parallel
> > apply, if so, can you also tell at a high
> > level, how you are planning to manage origin?
>
> Yes, we use it for parallel apply. We have a custom logical
> replication system that applies changes using multiple worker
> processes, each with their own database connection.
> Our use case requires multiple connections to be able to advance the
> same replication origin.
>

How do you advance the origin? Did you use
pg_replication_origin_advance()? If so, you should be aware that it is
meant to be used only for initial setup; see the comment in that API's
code: "Can't sensibly pass a local commit to be flushed at checkpoint -
this xact hasn't committed yet. This is why this function should be
used to set up the initial replication state, but not for replay."

If you are using pg_replication_origin_advance(), doesn't its current
implementation have the potential to cause a problem for your use case?
I think the problem is that we may miss applying a transaction after
restart, because remote_lsn can be used without the corresponding
transaction (local_lsn) having been flushed on the subscriber. Ideally,
a transaction that was not successfully flushed should be replayed
after restart.

In general, I was thinking of adding a restriction to
pg_replication_origin_advance() such that it gives an ERROR when a
user tries to move remote_lsn backward unless explicitly requested.

It would be good to know the opinion of others involved in the
original change of maintaining commit order for parallel apply of
large transactions.

-- 
With Regards,
Amit Kapila.
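
To make the proposed restriction concrete, here is a toy model in
Python. It is only a sketch of the intended behaviour, not PostgreSQL
code: OriginState, BackwardAdvanceError, advance_origin() and the
allow_backward flag are all hypothetical names, and LSNs are modelled
as plain integers.

```python
from dataclasses import dataclass


@dataclass
class OriginState:
    """Toy stand-in for one replication origin's progress (hypothetical)."""
    remote_lsn: int = 0


class BackwardAdvanceError(Exception):
    """Raised when an advance would move remote_lsn backward."""


def advance_origin(state: OriginState, new_remote_lsn: int,
                   allow_backward: bool = False) -> None:
    """Advance the origin, rejecting backward moves unless requested.

    allow_backward is an assumed opt-in switch standing in for
    "unless explicitly requested"; no such parameter exists today.
    """
    if new_remote_lsn < state.remote_lsn and not allow_backward:
        raise BackwardAdvanceError(
            f"cannot move remote_lsn backward from {state.remote_lsn} "
            f"to {new_remote_lsn}"
        )
    state.remote_lsn = new_remote_lsn
```

With this guard, if one worker advances the origin to 200 and another
then tries to set it to 100, the second call raises an error instead of
silently moving the origin back. Passing allow_backward=True keeps the
current behaviour for callers that really want it.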