On 08/07/2025 2:51 pm, Amit Kapila wrote:
On Tue, Jul 8, 2025 at 12:06 PM Konstantin Knizhnik <knizh...@garret.ru> wrote:
There is a well-known Postgres problem: a logical replication subscriber
cannot catch up with the publisher simply because LR changes are applied by
a single worker, while at the publisher the changes are made by
multiple concurrent backends. The problem is not specific to logical
replication: the physical replication stream is also handled by a single
walreceiver. But for physical replication Postgres now implements
prefetch: by looking at the blocks referenced in WAL records it is quite
easy to predict which pages will be required for redo and prefetch them.
With logical replication the situation is much more complicated.
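
For contrast, the physical-replication prefetch mentioned above is already
driven by existing GUCs on a standby (PostgreSQL 15+). A minimal sketch,
shown only as a reference point; the concrete values are illustrative:

```sql
-- Prefetch blocks referenced by WAL records during recovery/redo.
ALTER SYSTEM SET recovery_prefetch = 'try';
-- Upper bound on the number of concurrent prefetch requests.
ALTER SYSTEM SET maintenance_io_concurrency = 32;
SELECT pg_reload_conf();
```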

My first idea was to implement parallel apply of transactions. But to do
that we need to track dependencies between transactions. Right now
Postgres can apply transactions in parallel, but only if they are
streamed (which is done only for large transactions), and it serializes
them by commits. It is possible to force parallel apply of short
transactions using `debug_logical_replication_streaming`, but then
performance is about 2x slower than sequential apply by a single worker.
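
For reference, this is roughly how parallel apply of all (including short)
transactions can be forced today. A sketch assuming PostgreSQL 16+; the
subscription, publication and connection details are hypothetical:

```sql
-- Subscriber: request parallel apply of streamed transactions.
CREATE SUBSCRIPTION sub
    CONNECTION 'host=publisher dbname=postgres'
    PUBLICATION pub
    WITH (streaming = parallel);

-- Publisher: developer GUC that makes logical decoding stream every
-- transaction immediately, not only large ones, so all of them go
-- through the parallel apply path.
ALTER SYSTEM SET debug_logical_replication_streaming = 'immediate';
SELECT pg_reload_conf();
```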

What is the reason for such a large slowdown? Is it because the amount
of network transfer has increased without giving any significant
advantage, because of the serialization of commits?


It is not directly related to the subject, but I do not understand this code:

```
    /*
     * Stop the worker if there are enough workers in the pool.
     *
     * XXX Additionally, we also stop the worker if the leader apply worker
     * serialize part of the transaction data due to a send timeout. This is
     * because the message could be partially written to the queue and there
     * is no way to clean the queue other than resending the message until it
     * succeeds. Instead of trying to send the data which anyway would have
     * been serialized and then letting the parallel apply worker deal with
     * the spurious message, we stop the worker.
     */
    if (winfo->serialize_changes ||
        list_length(ParallelApplyWorkerPool) >
        (max_parallel_apply_workers_per_subscription / 2))
    {
        logicalrep_pa_worker_stop(winfo);
        pa_free_worker_info(winfo);

        return;
    }
```

It stops the worker if the number of workers in the pool is more than half of `max_parallel_apply_workers_per_subscription`. What I see is that `pa_launch_parallel_worker` spawns a new worker and, after completion of the transaction, the worker is immediately terminated.
Actually this leads to an awful slowdown of the apply process.
If I just disable this check, so that all `max_parallel_apply_workers_per_subscription` workers are actually used for applying transactions, then the time of parallel apply with 4 workers is 6 minutes, compared with 10 minutes for applying all transactions by the main apply worker. It is still not such a large improvement, but at least it is an improvement and not a degradation.
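
For completeness, the subscriber-side setting assumed in the timing
comparison above; a minimal sketch, with the value 4 matching the
experiment described here:

```sql
-- Allow up to 4 parallel apply workers per subscription (PostgreSQL 16+).
-- These workers are taken from the max_logical_replication_workers pool.
ALTER SYSTEM SET max_parallel_apply_workers_per_subscription = 4;
SELECT pg_reload_conf();
```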


