Hi,

I spent a little bit of time trying to explain the problem we are facing 
clearly, and provide a reproducible benchmark.

So here it is.

What triggered our investigation is that we have a PostgreSQL cluster 
containing about 15 databases, most of them being used as sources for logical 
replication. This means we have about as many WAL senders active on the cluster 
at the same time.

What we saw was that we had very high spikes of CPU activity, with very high 
level of SYS (up to 50% of the whole system was SYS, with 100% load on all 
CPUs) on the server, which is a dual Xeon Silver 4110, so 16 cores, 32 threads. 
That seemed insane as our usual load isn't that high on average (like 20% total 
cpu use, with ~ 1% of SYS), and mostly on 2 of those 15 databases. This mostly 
occured when creating an index, or doing batch updates, copys, etc. And all WAL 
senders consumed about the same amount, even those connected to databases where 
nothing happened. And while testing, we noticed things were worse with PG 12 
than with PG 10 (that was the cause for the first patch Pierre posted, which 
was more of a way to get PostmasterIsAlive out of the way for PG 10 to get the 
same behaviour on both databases and confirm what was different between the two 
versions).

So that was what the first benchmark we did, what Pierre posted a few days ago. 
With the second patch (reducing calls to GetFlushRecPtr), on PostgreSQL 12, 
with statements affecting lots of records at once, we managed to reduce the WAL 
senders' consumption by a factor of 15 (if the patch is correct of course). SYS 
was down to more sensible (near 0) values. WAL senders for databases which had 
no decoding to do didn't consume that much anymore, only the one connected to 
the database doing the work used a lot of CPU, but that's expected. This solves 
the problem we are facing. Without this, we won't be able to upgrade to PG 12, 
as the impact of GetFlushRecPtr is even worse than with PG 10.


I've now tried to measure the impact of the patch on a more evenly balanced 
activity on several databases, where the contention on GetFlushRecPtr is less 
severe, to see if there are wins in all cases. Simple scripts to test this are 
provided as an attachment.

Just set the "max" environment variable to the amount of databases/WAL senders 
you want, and then run create_logical.sh (creates the databases and the slots), then 
connect_logical.sh (connects pg_recvlogical processes to these slots), and run_stress.sh 
(runs a pgbench on each database, each doing 100000 transactions, all in parallel). 
drop_logical.sh does the cleanup. Sorry, they are very basic...

Here are the results on patched/unpatched PG 12. You have the messages from all 
pgbenchs, then the consumption of all WAL senders at the end of a run.

As you can see, pgbench runs are ~ 20% faster. That's because with unpatched, 
we are CPU starved (the server is 100% CPU), which we aren't with patched. WAL 
senders needed about half the CPU in the patched case as shown by pidstat.

UNPATCHED:

transaction type: <builtin: TPC-B (sort of)>
scaling factor: 1
query mode: simple
number of clients: 1
number of threads: 1
number of transactions per client: 100000
number of transactions actually processed: 100000/100000
latency average = 1.943 ms
tps = 514.796454 (including connections establishing)
tps = 514.805610 (excluding connections establishing)
transaction type: <builtin: TPC-B (sort of)>
scaling factor: 1
query mode: simple
number of clients: 1
number of threads: 1
number of transactions per client: 100000
number of transactions actually processed: 100000/100000
latency average = 1.949 ms
tps = 513.130790 (including connections establishing)
tps = 513.168135 (excluding connections establishing)
transaction type: <builtin: TPC-B (sort of)>
scaling factor: 1
query mode: simple
number of clients: 1
number of threads: 1
number of transactions per client: 100000
number of transactions actually processed: 100000/100000
latency average = 1.950 ms
tps = 512.946425 (including connections establishing)
tps = 512.961746 (excluding connections establishing)
transaction type: <builtin: TPC-B (sort of)>
scaling factor: 1
query mode: simple
number of clients: 1
number of threads: 1
number of transactions per client: 100000
number of transactions actually processed: 100000/100000
latency average = 1.951 ms
tps = 512.643065 (including connections establishing)
tps = 512.678765 (excluding connections establishing)
transaction type: <builtin: TPC-B (sort of)>
scaling factor: 1
query mode: simple
number of clients: 1
number of threads: 1
number of transactions per client: 100000
number of transactions actually processed: 100000/100000
latency average = 1.953 ms
tps = 512.159794 (including connections establishing)
tps = 512.178075 (excluding connections establishing)
transaction type: <builtin: TPC-B (sort of)>
scaling factor: 1
query mode: simple
number of clients: 1
number of threads: 1
number of transactions per client: 100000
number of transactions actually processed: 100000/100000
latency average = 1.953 ms
tps = 512.024962 (including connections establishing)
tps = 512.034590 (excluding connections establishing)
transaction type: <builtin: TPC-B (sort of)>
scaling factor: 1
query mode: simple
number of clients: 1
number of threads: 1
number of transactions per client: 100000
number of transactions actually processed: 100000/100000
latency average = 1.953 ms
tps = 512.016413 (including connections establishing)
tps = 512.034438 (excluding connections establishing)
transaction type: <builtin: TPC-B (sort of)>
scaling factor: 1
query mode: simple
number of clients: 1
number of threads: 1
number of transactions per client: 100000
number of transactions actually processed: 100000/100000
latency average = 1.954 ms
tps = 511.728080 (including connections establishing)
tps = 511.760138 (excluding connections establishing)
transaction type: <builtin: TPC-B (sort of)>
scaling factor: 1
query mode: simple
number of clients: 1
number of threads: 1
number of transactions per client: 100000
number of transactions actually processed: 100000/100000
latency average = 1.954 ms
tps = 511.655046 (including connections establishing)
tps = 511.678533 (excluding connections establishing)
transaction type: <builtin: TPC-B (sort of)>
scaling factor: 1
query mode: simple
number of clients: 1
number of threads: 1
number of transactions per client: 100000
number of transactions actually processed: 100000/100000
latency average = 1.955 ms
tps = 511.604767 (including connections establishing)
tps = 511.617562 (excluding connections establishing)
transaction type: <builtin: TPC-B (sort of)>
scaling factor: 1
query mode: simple
number of clients: 1
number of threads: 1
number of transactions per client: 100000
number of transactions actually processed: 100000/100000
latency average = 1.955 ms
tps = 511.525593 (including connections establishing)
tps = 511.558150 (excluding connections establishing)
transaction type: <builtin: TPC-B (sort of)>
scaling factor: 1
query mode: simple
number of clients: 1
number of threads: 1
number of transactions per client: 100000
number of transactions actually processed: 100000/100000
latency average = 1.955 ms
tps = 511.496498 (including connections establishing)
tps = 511.505871 (excluding connections establishing)
transaction type: <builtin: TPC-B (sort of)>
scaling factor: 1
query mode: simple
number of clients: 1
number of threads: 1
number of transactions per client: 100000
number of transactions actually processed: 100000/100000
latency average = 1.956 ms
tps = 511.334434 (including connections establishing)
tps = 511.363141 (excluding connections establishing)
transaction type: <builtin: TPC-B (sort of)>
scaling factor: 1
query mode: simple
number of clients: 1
number of threads: 1
number of transactions per client: 100000
number of transactions actually processed: 100000/100000
latency average = 1.956 ms
tps = 511.256908 (including connections establishing)
tps = 511.284577 (excluding connections establishing)
transaction type: <builtin: TPC-B (sort of)>
scaling factor: 1
query mode: simple
number of clients: 1
number of threads: 1
number of transactions per client: 100000
number of transactions actually processed: 100000/100000
latency average = 1.957 ms
tps = 511.021219 (including connections establishing)
tps = 511.041905 (excluding connections establishing)
transaction type: <builtin: TPC-B (sort of)>
scaling factor: 1
query mode: simple
number of clients: 1
number of threads: 1
number of transactions per client: 100000
number of transactions actually processed: 100000/100000
latency average = 1.957 ms
tps = 510.977176 (including connections establishing)
tps = 511.004730 (excluding connections establishing)
transaction type: <builtin: TPC-B (sort of)>
scaling factor: 1
query mode: simple
number of clients: 1
number of threads: 1
number of transactions per client: 100000
number of transactions actually processed: 100000/100000
latency average = 1.958 ms
tps = 510.818341 (including connections establishing)
tps = 510.870735 (excluding connections establishing)
transaction type: <builtin: TPC-B (sort of)>
scaling factor: 1
query mode: simple
number of clients: 1
number of threads: 1
number of transactions per client: 100000
number of transactions actually processed: 100000/100000
latency average = 1.958 ms
tps = 510.719611 (including connections establishing)
tps = 510.741600 (excluding connections establishing)
transaction type: <builtin: TPC-B (sort of)>
scaling factor: 1
query mode: simple
number of clients: 1
number of threads: 1
number of transactions per client: 100000
number of transactions actually processed: 100000/100000
latency average = 1.959 ms
tps = 510.460165 (including connections establishing)
tps = 510.504649 (excluding connections establishing)
transaction type: <builtin: TPC-B (sort of)>
scaling factor: 1
query mode: simple
number of clients: 1
number of threads: 1
number of transactions per client: 100000
number of transactions actually processed: 100000/100000
latency average = 1.961 ms
tps = 509.962481 (including connections establishing)
tps = 509.978043 (excluding connections establishing)

~$ pidstat -C postgres  -l -u -T CHILD
Linux 4.9.0-11-amd64 (hardy)    01/06/2020      _x86_64_        (32 CPU)

02:39:30 PM   UID       PID    usr-ms system-ms  guest-ms  Command
02:39:30 PM   111     78644   2232190    284870         0  
/tmp/pg-12-unpatched/bin/postgres
02:39:30 PM   111     78646       400       340         0  postgres: 
checkpointer
02:39:30 PM   111     78647       530      1330         0  postgres: background 
writer
02:39:30 PM   111     78648       300       150         0  postgres: walwriter
02:39:30 PM   111     78649        30        40         0  postgres: autovacuum 
launcher
02:39:30 PM   111     78650       130       440         0  postgres: stats 
collector
02:39:30 PM   111     78790     86340     20560         0  postgres: walsender 
postgres [local] idle
02:39:30 PM   111     78792     89170     23270         0  postgres: walsender 
postgres [local] idle
02:39:30 PM   111     78794     86990     20740         0  postgres: walsender 
postgres [local] idle
02:39:30 PM   111     78795     91900     22540         0  postgres: walsender 
postgres [local] idle
02:39:30 PM   111     78797     92000     23190         0  postgres: walsender 
postgres [local] idle
02:39:30 PM   111     78799     94060     22520         0  postgres: walsender 
postgres [local] idle
02:39:30 PM   111     78801     95300     21500         0  postgres: walsender 
postgres [local] idle
02:39:30 PM   111     78803     93120     21360         0  postgres: walsender 
postgres [local] idle
02:39:30 PM   111     78805     95420     21920         0  postgres: walsender 
postgres [local] idle
02:39:30 PM   111     78807     94400     21350         0  postgres: walsender 
postgres [local] idle
02:39:30 PM   111     78809     88850     20390         0  postgres: walsender 
postgres [local] idle
02:39:30 PM   111     78811     90030     20690         0  postgres: walsender 
postgres [local] idle
02:39:30 PM   111     78812     94310     22660         0  postgres: walsender 
postgres [local] idle
02:39:30 PM   111     78813     94080     22470         0  postgres: walsender 
postgres [local] idle
02:39:30 PM   111     78814     95370     21520         0  postgres: walsender 
postgres [local] idle
02:39:30 PM   111     78815     94780     21470         0  postgres: walsender 
postgres [local] idle
02:39:30 PM   111     78816     92440     21080         0  postgres: walsender 
postgres [local] idle
02:39:30 PM   111     78817     94360     20700         0  postgres: walsender 
postgres [local] idle
02:39:30 PM   111     78818     92230     20760         0  postgres: walsender 
postgres [local] idle
02:39:30 PM   111     78819     90280     20780         0  postgres: walsender 
postgres [local] idle

PATCHED:

transaction type: <builtin: TPC-B (sort of)>
scaling factor: 1
query mode: simple
number of clients: 1
number of threads: 1
number of transactions per client: 100000
number of transactions actually processed: 100000/100000
latency average = 1.680 ms
tps = 595.090858 (including connections establishing)
tps = 595.131449 (excluding connections establishing)
transaction type: <builtin: TPC-B (sort of)>
scaling factor: 1
query mode: simple
number of clients: 1
number of threads: 1
number of transactions per client: 100000
number of transactions actually processed: 100000/100000
latency average = 1.683 ms
tps = 594.156492 (including connections establishing)
tps = 594.198624 (excluding connections establishing)
transaction type: <builtin: TPC-B (sort of)>
scaling factor: 1
query mode: simple
number of clients: 1
number of threads: 1
number of transactions per client: 100000
number of transactions actually processed: 100000/100000
latency average = 1.684 ms
tps = 593.927387 (including connections establishing)
tps = 593.946829 (excluding connections establishing)
transaction type: <builtin: TPC-B (sort of)>
scaling factor: 1
query mode: simple
number of clients: 1
number of threads: 1
number of transactions per client: 100000
number of transactions actually processed: 100000/100000
latency average = 1.686 ms
tps = 593.209506 (including connections establishing)
tps = 593.257556 (excluding connections establishing)
transaction type: <builtin: TPC-B (sort of)>
scaling factor: 1
query mode: simple
number of clients: 1
number of threads: 1
number of transactions per client: 100000
number of transactions actually processed: 100000/100000
latency average = 1.686 ms
tps = 593.144977 (including connections establishing)
tps = 593.162018 (excluding connections establishing)
transaction type: <builtin: TPC-B (sort of)>
scaling factor: 1
query mode: simple
number of clients: 1
number of threads: 1
number of transactions per client: 100000
number of transactions actually processed: 100000/100000
latency average = 1.686 ms
tps = 593.084403 (including connections establishing)
tps = 593.122178 (excluding connections establishing)
transaction type: <builtin: TPC-B (sort of)>
scaling factor: 1
query mode: simple
number of clients: 1
number of threads: 1
number of transactions per client: 100000
number of transactions actually processed: 100000/100000
latency average = 1.686 ms
tps = 593.134657 (including connections establishing)
tps = 593.199432 (excluding connections establishing)
transaction type: <builtin: TPC-B (sort of)>
scaling factor: 1
query mode: simple
number of clients: 1
number of threads: 1
number of transactions per client: 100000
number of transactions actually processed: 100000/100000
latency average = 1.687 ms
tps = 592.908760 (including connections establishing)
tps = 592.923386 (excluding connections establishing)
transaction type: <builtin: TPC-B (sort of)>
scaling factor: 1
query mode: simple
number of clients: 1
number of threads: 1
number of transactions per client: 100000
number of transactions actually processed: 100000/100000
latency average = 1.687 ms
tps = 592.802027 (including connections establishing)
tps = 592.814300 (excluding connections establishing)
transaction type: <builtin: TPC-B (sort of)>
scaling factor: 1
query mode: simple
number of clients: 1
number of threads: 1
number of transactions per client: 100000
number of transactions actually processed: 100000/100000
latency average = 1.687 ms
tps = 592.678874 (including connections establishing)
tps = 592.769759 (excluding connections establishing)
transaction type: <builtin: TPC-B (sort of)>
scaling factor: 1
query mode: simple
number of clients: 1
number of threads: 1
number of transactions per client: 100000
number of transactions actually processed: 100000/100000
latency average = 1.687 ms
tps = 592.642501 (including connections establishing)
tps = 592.723261 (excluding connections establishing)
transaction type: <builtin: TPC-B (sort of)>
scaling factor: 1
query mode: simple
number of clients: 1
number of threads: 1
number of transactions per client: 100000
number of transactions actually processed: 100000/100000
latency average = 1.688 ms
tps = 592.249597 (including connections establishing)
tps = 592.262962 (excluding connections establishing)
transaction type: <builtin: TPC-B (sort of)>
scaling factor: 1
query mode: simple
number of clients: 1
number of threads: 1
number of transactions per client: 100000
number of transactions actually processed: 100000/100000
latency average = 1.690 ms
tps = 591.867795 (including connections establishing)
tps = 591.958672 (excluding connections establishing)
transaction type: <builtin: TPC-B (sort of)>
scaling factor: 1
query mode: simple
number of clients: 1
number of threads: 1
number of transactions per client: 100000
number of transactions actually processed: 100000/100000
latency average = 1.690 ms
tps = 591.835950 (including connections establishing)
tps = 591.908940 (excluding connections establishing)
transaction type: <builtin: TPC-B (sort of)>
scaling factor: 1
query mode: simple
number of clients: 1
number of threads: 1
number of transactions per client: 100000
number of transactions actually processed: 100000/100000
latency average = 1.690 ms
tps = 591.799816 (including connections establishing)
tps = 591.824497 (excluding connections establishing)
transaction type: <builtin: TPC-B (sort of)>
scaling factor: 1
query mode: simple
number of clients: 1
number of threads: 1
number of transactions per client: 100000
number of transactions actually processed: 100000/100000
latency average = 1.690 ms
tps = 591.738978 (including connections establishing)
tps = 591.780258 (excluding connections establishing)
transaction type: <builtin: TPC-B (sort of)>
scaling factor: 1
query mode: simple
number of clients: 1
number of threads: 1
number of transactions per client: 100000
number of transactions actually processed: 100000/100000
latency average = 1.691 ms
tps = 591.530490 (including connections establishing)
tps = 591.570876 (excluding connections establishing)
transaction type: <builtin: TPC-B (sort of)>
scaling factor: 1
query mode: simple
number of clients: 1
number of threads: 1
number of transactions per client: 100000
number of transactions actually processed: 100000/100000
latency average = 1.691 ms
tps = 591.452142 (including connections establishing)
tps = 591.498424 (excluding connections establishing)
transaction type: <builtin: TPC-B (sort of)>
scaling factor: 1
query mode: simple
number of clients: 1
number of threads: 1
number of transactions per client: 100000
number of transactions actually processed: 100000/100000
latency average = 1.693 ms
tps = 590.674305 (including connections establishing)
tps = 590.708951 (excluding connections establishing)
transaction type: <builtin: TPC-B (sort of)>
scaling factor: 1
query mode: simple
number of clients: 1
number of threads: 1
number of transactions per client: 100000
number of transactions actually processed: 100000/100000
latency average = 1.693 ms
tps = 590.517240 (including connections establishing)
tps = 590.563531 (excluding connections establishing)


~$ pidstat -C postgres  -l -u -T CHILD
Linux 4.9.0-11-amd64 (hardy)    01/06/2020      _x86_64_        (32 CPU)

02:29:02 PM   UID       PID    usr-ms system-ms  guest-ms  Command
02:29:02 PM   111     75810   2185430    294190         0  
/tmp/pg-12-patched/bin/postgres
02:29:02 PM   111     75812       410       320         0  postgres: 
checkpointer
02:29:02 PM   111     75813       410      1230         0  postgres: background 
writer
02:29:02 PM   111     75814       270        80         0  postgres: walwriter
02:29:02 PM   111     75815        30        20         0  postgres: autovacuum 
launcher
02:29:02 PM   111     75816       130       360         0  postgres: stats 
collector
02:29:02 PM   111     75961     35890     27240         0  postgres: walsender 
postgres [local] idle
02:29:02 PM   111     75963     37390     27950         0  postgres: walsender 
postgres [local] idle
02:29:02 PM   111     75965     38360     28110         0  postgres: walsender 
postgres [local] idle
02:29:02 PM   111     75966     38350     28160         0  postgres: walsender 
postgres [local] idle
02:29:02 PM   111     75968     38370     28160         0  postgres: walsender 
postgres [local] idle
02:29:02 PM   111     75970     37820     28110         0  postgres: walsender 
postgres [local] idle
02:29:02 PM   111     75972     38250     27330         0  postgres: walsender 
postgres [local] idle
02:29:02 PM   111     75974     36870     27640         0  postgres: walsender 
postgres [local] idle
02:29:02 PM   111     75976     36890     26850         0  postgres: walsender 
postgres [local] idle
02:29:02 PM   111     75979     36920     26330         0  postgres: walsender 
postgres [local] idle
02:29:02 PM   111     75980     37090     27240         0  postgres: walsender 
postgres [local] idle
02:29:02 PM   111     75981     38040     28210         0  postgres: walsender 
postgres [local] idle
02:29:02 PM   111     75982     36530     27460         0  postgres: walsender 
postgres [local] idle
02:29:02 PM   111     75983     37560     27330         0  postgres: walsender 
postgres [local] idle
02:29:02 PM   111     75984     36660     27170         0  postgres: walsender 
postgres [local] idle
02:29:02 PM   111     75985     36370     27020         0  postgres: walsender 
postgres [local] idle
02:29:02 PM   111     75986     36960     27000         0  postgres: walsender 
postgres [local] idle
02:29:02 PM   111     75987     36460     26820         0  postgres: walsender 
postgres [local] idle
02:29:02 PM   111     75988     36290     27140         0  postgres: walsender 
postgres [local] idle
02:29:02 PM   111     75989     36320     26750         0  postgres: walsender 
postgres [local] idle


Regard,

Marc

Attachment: perf_test_logical.tar.gz
Description: application/gzip

Reply via email to