On Mon, 7 Oct 2024 at 11:05, vignesh C <vignes...@gmail.com> wrote:
>
> Hi,
>
> By default, currently streaming of in-progress transactions for
> subscriptions is disabled. All transactions are fully decoded on the
> publisher before being sent to the subscriber. This approach can lead
> to increased latency and reduced performance, particularly under heavy
> load. By default, we could enable the parallel streaming option for
> subscriptions. By doing this, incoming changes will be directly
> applied by one of the available parallel apply workers. This method
> significantly improves the performance of commit operations.
>
> I conducted a series of tests using logical replication, comparing SQL
> execution times with streaming set to both parallel and off. The tests
> varied the logical_decoding_work_mem setting and included the
> following scenarios: a) Insert, b) Delete, c) Update, d) rollback 5%
> records, e) rollback 10% records, f) rollback 20% records, g) rollback
> 50% records. I have written tap tests for the same, the attached files
> can be copied to src/test/subscription/t and the
> logical_decoding_work_mem configuration and streaming option in create
> subscription command should be changed accordingly before running the
> tests. The tests were executed 5 times and the average of them was
> taken.
> The execution time is in seconds.
>
> Insert 5kk records
> Logical Decoding mem |  Parallel |  off            |  % Improvement
> -------------------------------|-------------|---------------|------------------------
>  64 KB                           | 37.304   | 69.465      |     46.298
> 256 KB                          | 36.327   | 70.671      |     48.597
>  64 MB                          | 41.173   | 69.228      |     40.526
>
> Delete 5kk records
> Logical Decoding mem |  Parallel |  off            |  % Improvement
> -------------------------------|-------------|---------------|------------------------
>  64 KB                           | 42.322   | 69.404      |     39.021
> 256 KB                          | 43.250   | 66.973      |     35.422
>  64 MB                           | 44.183   | 67.873      |     34.903
>
> Update 5kk records
> Logical Decoding mem |  Parallel |  off            |  % Improvement
> -------------------------------|-------------|---------------|------------------------
> 64 KB                           | 93.953    | 127.691    |     26.422
> 256 KB                         | 94.166    | 128.541    |     26.743
>  64 MB                         | 93.367    | 134.275    |     30.465
>
> Rollback 05% records
> Logical Decoding mem |  Parallel |  off            |  % Improvement
> -------------------------------|-------------|---------------|------------------------
> 64 KB                           | 36.968    | 67.161      |     44.957
> 256 KB                         | 38.059    | 68.021      |     44.047
>  64 MB                         | 39.431    | 66.878      |     41.041
>
> Rollback 10% records
> Logical Decoding mem |  Parallel |  off            |  % Improvement
> -------------------------------|-------------|---------------|------------------------
> 64 KB                           | 35.966    | 63.968      |     43.775
> 256 KB                         | 36.597    | 64.836      |     43.554
>  64 MB                         | 39.069    | 64.357      |     39.292
>
> Rollback 20% records
> Logical Decoding mem |  Parallel |  off            |  % Improvement
> -------------------------------|-------------|---------------|------------------------
> 64 KB                           | 37.616    | 58.903      |     36.139
> 256 KB                         | 37.330    | 58.606      |     36.303
>  64 MB                         | 38.720    | 60.236      |     35.720
>
> Rollback 50% records
> Logical Decoding mem |  Parallel |  off            |  % Improvement
> -------------------------------|-------------|---------------|------------------------
> 64 KB                           | 38.999    | 44.776      |     12.902
> 256 KB                         | 36.567    | 44.530      |     17.882
>  64 MB                         | 38.592     | 45.346      |     14.893
>
> The machine configuration that was used is also attached.
>
> The tests demonstrate a significant performance improvement when using
> the parallel streaming option, insert shows 40-48 %improvement, delete
> shows 34-39 %improvement, update shows 26-30 %improvement. In the case
> of rollback the improvement is between 12-44%, the improvement
> slightly reduces with larger amounts of data being rolled back in this
> case. If there's a significant amount of data to roll back, the
> performance of streaming in parallel may be comparable to or slightly
> lower in some instances. However, this is acceptable since commit
> operations are generally more frequent than rollback operations.
>
> One key point to consider is that the lock on transaction objects will
> be held for a longer duration when using streaming in parallel. This
> occurs because the parallel apply worker initiates the transaction as
> soon as streaming begins, maintaining the lock until the transaction
> is fully completed. As a result, for long-running transactions, this
> extended lock can hinder concurrent access that requires a lock.
>
> Since there is a significant percentage improvement, we should make
> the default subscription streaming option parallel. Attached patch has
> the change for the same.
> Thoughts?
>
> All of these tests were conducted with both the publisher and
> subscriber on the same host. I will perform additional tests with one
> of the logical replication nodes on a different host and share the
> results later.

I have run the tests with publisher and subscriber running on
different hosts in synchronous replication mode. The tests were
executed 5 times and the average of them was taken. The scripts that
were used for the tests are attached.

The test results for the same are:
Operation/Streaming  |      Parallel   |      Off      |  % improvement
-----------------------------|------------------|--------------|-------------------------
Insert                          |  30.44          |  50.28      |  39.45
Delete                        |  27.66           |  46.80      |  40.89
Update                       |  57.37           |  90.16      |  36.37
Rollback 5%               |  28.57           |  47.49      |  39.82
Rollback 10%             |  28.63           |  46.25      |  38.09
Rollback 20%             |  28.16           |  42.66      |  33.99
Rollback 50%             |  28.33           |  34.78      |  18.53

The tests indicate a notable performance boost with the parallel
streaming option: inserts improved by 39%, deletes by 41%, and updates
by 36%. For rollback operations, the improvement ranges from 18% to
39%. Similar to the results observed in logical replication when both
publisher and subscriber are on the same host (as mentioned in [1]),
performance results show that running the publisher and subscriber on
different hosts with the parallel streaming option outperforms the off
option.

[1] - 
https://www.postgresql.org/message-id/CALDaNm1%3DMedhW23NuoePJTmonwsMSp80ddsw%2BsEJs0GUMC_kqQ%40mail.gmail.com

Regards,
Vignesh

<<attachment: scripts.zip>>

Reply via email to