On Mon, 7 Oct 2024 at 11:05, vignesh C <vignes...@gmail.com> wrote: > > Hi, > > By default, currently streaming of in-progress transactions for > subscriptions is disabled. All transactions are fully decoded on the > publisher before being sent to the subscriber. This approach can lead > to increased latency and reduced performance, particularly under heavy > load. By default, we could enable the parallel streaming option for > subscriptions. By doing this, incoming changes will be directly > applied by one of the available parallel apply workers. This method > significantly improves the performance of commit operations. > > I conducted a series of tests using logical replication, comparing SQL > execution times with streaming set to both parallel and off. The tests > varied the logical_decoding_work_mem setting and included the > following scenarios: a) Insert, b) Delete, c) Update, d) rollback 5% > records, e) rollback 10% records, f) rollback 20% records, g) rollback > 50% records. I have written tap tests for the same, the attached files > can be copied to src/test/subscription/t and the > logical_decoding_work_mem configuration and streaming option in create > subscription command should be changed accordingly before running the > tests. The tests were executed 5 times and the average of them was > taken. > The execution time is in seconds. > > Insert 5kk records > Logical Decoding mem | Parallel | off | % Improvement > -------------------------------|-------------|---------------|------------------------ > 64 KB | 37.304 | 69.465 | 46.298 > 256 KB | 36.327 | 70.671 | 48.597 > 64 MB | 41.173 | 69.228 | 40.526 > > Delete 5kk records > Logical Decoding mem | Parallel | off | % Improvement > -------------------------------|-------------|---------------|------------------------ > 64 KB | 42.322 | 69.404 | 39.021 > 256 KB | 43.250 | 66.973 | 35.422 > 64 MB | 44.183 | 67.873 | 34.903 > > Update 5kk records > Logical Decoding mem | Parallel | off | % Improvement > -------------------------------|-------------|---------------|------------------------ > 64 KB | 93.953 | 127.691 | 26.422 > 256 KB | 94.166 | 128.541 | 26.743 > 64 MB | 93.367 | 134.275 | 30.465 > > Rollback 05% records > Logical Decoding mem | Parallel | off | % Improvement > -------------------------------|-------------|---------------|------------------------ > 64 KB | 36.968 | 67.161 | 44.957 > 256 KB | 38.059 | 68.021 | 44.047 > 64 MB | 39.431 | 66.878 | 41.041 > > Rollback 10% records > Logical Decoding mem | Parallel | off | % Improvement > -------------------------------|-------------|---------------|------------------------ > 64 KB | 35.966 | 63.968 | 43.775 > 256 KB | 36.597 | 64.836 | 43.554 > 64 MB | 39.069 | 64.357 | 39.292 > > Rollback 20% records > Logical Decoding mem | Parallel | off | % Improvement > -------------------------------|-------------|---------------|------------------------ > 64 KB | 37.616 | 58.903 | 36.139 > 256 KB | 37.330 | 58.606 | 36.303 > 64 MB | 38.720 | 60.236 | 35.720 > > Rollback 50% records > Logical Decoding mem | Parallel | off | % Improvement > -------------------------------|-------------|---------------|------------------------ > 64 KB | 38.999 | 44.776 | 12.902 > 256 KB | 36.567 | 44.530 | 17.882 > 64 MB | 38.592 | 45.346 | 14.893 > > The machine configuration that was used is also attached. > > The tests demonstrate a significant performance improvement when using > the parallel streaming option, insert shows 40-48 %improvement, delete > shows 34-39 %improvement, update shows 26-30 %improvement. In the case > of rollback the improvement is between 12-44%, the improvement > slightly reduces with larger amounts of data being rolled back in this > case. If there's a significant amount of data to roll back, the > performance of streaming in parallel may be comparable to or slightly > lower in some instances. However, this is acceptable since commit > operations are generally more frequent than rollback operations. > > One key point to consider is that the lock on transaction objects will > be held for a longer duration when using streaming in parallel. This > occurs because the parallel apply worker initiates the transaction as > soon as streaming begins, maintaining the lock until the transaction > is fully completed. As a result, for long-running transactions, this > extended lock can hinder concurrent access that requires a lock. > > Since there is a significant percentage improvement, we should make > the default subscription streaming option parallel. Attached patch has > the change for the same. > Thoughts? > > All of these tests were conducted with both the publisher and > subscriber on the same host. I will perform additional tests with one > of the logical replication nodes on a different host and share the > results later.
I have run the tests with publisher and subscriber running on different hosts in synchronous replication mode. The tests were executed 5 times and the average of them was taken. The scripts that were used for the tests are attached. The test results for the same are: Operation/Streaming | Parallel | Off | % improvement -----------------------------|------------------|--------------|------------------------- Insert | 30.44 | 50.28 | 39.45 Delete | 27.66 | 46.80 | 40.89 Update | 57.37 | 90.16 | 36.37 Rollback 5% | 28.57 | 47.49 | 39.82 Rollback 10% | 28.63 | 46.25 | 38.09 Rollback 20% | 28.16 | 42.66 | 33.99 Rollback 50% | 28.33 | 34.78 | 18.53 The tests indicate a notable performance boost with the parallel streaming option: inserts improved by 39%, deletes by 41%, and updates by 36%. For rollback operations, the improvement ranges from 18% to 39%. Similar to the results observed in logical replication when both publisher and subscriber are on the same host (as mentioned in [1]), performance results show that running the publisher and subscriber on different hosts with the parallel streaming option outperforms the off option. [1] - https://www.postgresql.org/message-id/CALDaNm1%3DMedhW23NuoePJTmonwsMSp80ddsw%2BsEJs0GUMC_kqQ%40mail.gmail.com Regards, Vignesh
<<attachment: scripts.zip>>