On Tue, Aug 12, 2025 at 9:22 PM Константин Книжник <knizh...@garret.ru> wrote:
>
> Hi,
> This is something similar to what I had in mind when starting my experiments
> with LR apply speed improvements. I think that maintaining a full
> (RelationId, ReplicaIdentity) hash may be too expensive - there can be
> hundreds of active transactions updating millions of rows.
> I thought about something like a bloom filter. But frankly speaking I didn't
> go far in thinking about all the implementation details. Your proposal is
> much more concrete.
>

We can surely investigate a different hash_key if that works for all cases.
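To make the idea a bit more concrete, here is a minimal standalone sketch of
how a per-transaction set of (RelationId, replica identity key) hashes could
be used to decide whether two in-flight transactions touch the same row. The
names (change_key, TxnKeySet, etc.) and the hash function are purely
illustrative and not taken from any existing patch:

```c
/*
 * Illustrative sketch only: detect when two in-flight transactions touch
 * the same (relation, replica identity key) and therefore must not be
 * applied in parallel. All names here are hypothetical.
 */
#include <stdint.h>
#include <stdio.h>
#include <string.h>

typedef uint32_t RelationId;    /* stand-in for a table OID */

/* FNV-1a, used only to keep the sketch self-contained */
static uint64_t
hash_bytes(uint64_t h, const void *data, size_t len)
{
    const unsigned char *p = data;

    for (size_t i = 0; i < len; i++)
    {
        h ^= p[i];
        h *= 1099511628211ULL;
    }
    return h;
}

/* Hash key = relation OID + serialized replica identity columns */
static uint64_t
change_key(RelationId relid, const char *ri_key, size_t ri_len)
{
    uint64_t h = 14695981039346656037ULL;

    h = hash_bytes(h, &relid, sizeof(relid));
    h = hash_bytes(h, ri_key, ri_len);
    return h;
}

/* One entry per in-flight transaction: which keys it has touched so far */
#define MAX_KEYS 1024
typedef struct TxnKeySet
{
    uint64_t keys[MAX_KEYS];
    int      nkeys;
} TxnKeySet;

static void
txn_record_key(TxnKeySet *txn, uint64_t key)
{
    if (txn->nkeys < MAX_KEYS)
        txn->keys[txn->nkeys++] = key;
}

/* Would applying 'key' in parallel conflict with what 'other' already did? */
static int
txn_conflicts(const TxnKeySet *other, uint64_t key)
{
    for (int i = 0; i < other->nkeys; i++)
        if (other->keys[i] == key)
            return 1;
    return 0;
}

int
main(void)
{
    TxnKeySet t1 = {0};

    /* T1 updates the row with replica identity "42" in relation 16384 */
    txn_record_key(&t1, change_key(16384, "42", 2));

    /* T2 wants to update the same row: it must wait for T1 */
    printf("conflict: %s\n",
           txn_conflicts(&t1, change_key(16384, "42", 2)) ? "yes" : "no");

    /* T2 touching a different row of the same table is fine */
    printf("conflict: %s\n",
           txn_conflicts(&t1, change_key(16384, "43", 2)) ? "yes" : "no");
    return 0;
}
```

A bloom filter, as you mention, would replace the exact per-transaction key
set with a compact, lossy structure; false positives would then only cause
unnecessary serialization, not an incorrect apply order.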
> But I decided to implement the first approach with prefetch, which is much
> simpler, similar to the prefetching currently used for physical replication,
> and still provides quite a significant improvement:
> https://www.postgresql.org/message-id/flat/84ed36b8-7d06-4945-9a6b-3826b3f999a6%40garret.ru#70b45c44814c248d3d519a762f528753
>
> There is one thing which I do not completely understand in your proposal:
> do you assume that the LR walsender at the publisher will use the reorder
> buffer to "serialize" transactions, or do you assume that streaming mode
> will be used (it is now possible to enforce parallel apply of short
> transactions using `debug_logical_replication_streaming`)?
>

The current proposal is based on the reorderbuffer serializing transactions as
we are doing now.

> It seems to be senseless to spend time and memory trying to serialize
> transactions at the publisher if we in any case want to apply them in
> parallel at the subscriber.
> But then there is another problem: at the publisher there can be hundreds of
> concurrent active transactions (limited only by `max_connections`) whose
> records are intermixed in the WAL.
> If we try to apply them concurrently at the subscriber, we need a
> corresponding number of parallel apply workers. But usually the number of
> such workers is less than 10 (and the default is 2).
> So it looks like we need to serialize transactions at the subscriber side.
>
> Assume that there are 100 concurrent transactions T1..T100, i.e. before the
> first COMMIT record there are mixed records of 100 transactions.
> And there are just two parallel apply workers W1 and W2. The main LR apply
> worker will send a T1 record to W1, a T2 record to W2, and ... there are no
> more vacant workers.
> It then has to either spawn additional ones, which is not always possible
> because the total number of background workers is limited, or serialize all
> other transactions in memory or on disk until it reaches the COMMIT of T1 or
> T2.
> I am afraid that such serialization will eliminate any advantages of
> parallel apply.
>

Right, I also think so, and we will probably end up doing something like what
we are doing now in the publisher.

> Certainly, if we do the reordering of transactions at the publisher side,
> then there is no such problem. The subscriber receives all records for T1,
> then all records for T2, ... If there are no more vacant workers, it can
> just wait until any of these transactions is completed. But I am afraid that
> in this case the reorder buffer at the publisher will be a bottleneck.
>

This is a point to investigate if we observe it. But till now, in our internal
testing, parallel apply gives a good improvement in a pgbench kind of workload.

--
With Regards,
Amit Kapila.