On Wed, Jul 6, 2022 at 9:06 AM Amit Kapila <amit.kapil...@gmail.com> wrote:
>
> How would you choose the slot name for the table sync, right now it
> contains the relid of the table for which it needs to perform sync?
> Say, if we ignore to include the appropriate identifier in the slot
> name, we won't be able to reuse/drop the slot after restart of table
> sync worker due to an error.

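To make the naming question concrete, here is a minimal sketch comparing a slot name keyed by relid with one keyed by a stable per-worker identifier. Both name formats, the helper functions, and the OIDs are hypothetical and chosen only for illustration; they are not taken from the patch or from PostgreSQL's actual naming code.

```python
# Illustrative only: the name formats and OIDs below are assumptions,
# not the formats used by PostgreSQL or by the patch under discussion.

def slot_name_by_relid(suboid: int, relid: int) -> str:
    """Slot name tied to the table being synced (hypothetical format)."""
    return f"pg_{suboid}_sync_rel_{relid}"


def slot_name_by_worker_index(suboid: int, worker_index: int) -> str:
    """Slot name tied to the worker's position in the worker array
    (hypothetical format)."""
    return f"pg_{suboid}_sync_worker_{worker_index}"


SUBOID = 16394                  # made-up subscription OID
tables = [16400, 16401, 16402]  # made-up relids synced one after another

# One worker (index 0) syncing three tables in turn.
relid_names = [slot_name_by_relid(SUBOID, relid) for relid in tables]
index_names = [slot_name_by_worker_index(SUBOID, 0) for _ in tables]

print(relid_names)  # a different slot per table
print(index_names)  # the same slot every time
```

With the relid-keyed name, a restarted worker has to know which table it was syncing in order to find or drop its slot. With the index-keyed name, the index alone is enough to recompute the name, which is what would allow the slot to be reused across tables.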
I had a quick look at the patch, and it seems to use the worker array
index instead of the relid when forming the slot name. I think that
makes sense, because whichever worker is using that worker index can
now reuse the slot created for that index.

> >
> > With those changes, I did some benchmarking to see if it improves anything.
> > These results compare this patch with the latest version of master branch.
> > "max_sync_workers_per_subscription" is set to 2 as default.
> > Got some results simply averaging timings from 5 consecutive runs for each
> > branch.
> >
> > First, tested logical replication with empty tables.
> > 10 tables
> > ----------------
> > - master: 286.964 ms
> > - the patch: 116.852 ms
> >
> > 100 tables
> > ----------------
> > - master: 2785.328 ms
> > - the patch: 706.817 ms
> >
> > 10K tables
> > ----------------
> > - master: 39612.349 ms
> > - the patch: 12526.981 ms
> >
> >
> > Also tried replicating tables with some data
> > 10 tables loaded with 10MB data
> > ----------------
> > - master: 1517.714 ms
> > - the patch: 1399.965 ms
> >
> > 100 tables loaded with 10MB data
> > ----------------
> > - master: 16327.229 ms
> > - the patch: 11963.696 ms
> >
> >
> > Then loaded more data
> > 10 tables loaded with 100MB data
> > ----------------
> > - master: 13910.189 ms
> > - the patch: 14770.982 ms
> >
> > 100 tables loaded with 100MB data
> > ----------------
> > - master: 146281.457 ms
> > - the patch: 156957.512
> >
> >
> > If tables are mostly empty, the improvement can be significant - up to 3x
> > faster logical replication.
> > With some data loaded, it can still be faster to some extent.
> >
>
> These results indicate that it is a good idea, especially for very small
> tables.
>
> > When the table size increases more, the advantage of reusing workers
> > becomes insignificant.
> >
>
> It seems from your results that performance degrades for large
> relations. Did you try to investigate the reasons for the same?

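As a rough aid for reading the quoted numbers, the sketch below computes the master/patch timing ratio for each case. The timings are copied from the quoted benchmark; the last patch value had no unit in the mail and is assumed here to be in milliseconds like the rest. The dictionary keys and variable names are illustrative only.

```python
# Timings in ms, copied from the quoted benchmark: (master, patch).
# Assumption: the unitless "156957.512" is also in milliseconds.
timings_ms = {
    "10 empty tables": (286.964, 116.852),
    "100 empty tables": (2785.328, 706.817),
    "10K empty tables": (39612.349, 12526.981),
    "10 tables x 10MB": (1517.714, 1399.965),
    "100 tables x 10MB": (16327.229, 11963.696),
    "10 tables x 100MB": (13910.189, 14770.982),
    "100 tables x 100MB": (146281.457, 156957.512),
}

# Ratio > 1 means the patch is faster than master; < 1 means it is slower.
speedups = {case: master / patch for case, (master, patch) in timings_ms.items()}

# Cases where the patch is slower than master.
regressions = [case for case, ratio in speedups.items() if ratio < 1.0]

for case, ratio in speedups.items():
    print(f"{case:>20}: {ratio:.2f}x")
print("slower with the patch:", regressions)
```

Only the two 100MB cases come out below 1.0, which is the drop being asked about.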
Yeah, it would be interesting to know why there is a drop in some cases.

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com