> All the changes mentioned above are included in the v13 patch. Since the
> patch status is 'Ready for Committer,' I believe it is now better for
> upstream inclusion, with improved details in tests and documentation. Do
> you have any further suggestions?

I am not quite clear on sample_1.out. I do like the idea of separating
the sample tests, but I was thinking of something a bit simpler. What do
you think of the attached test, sampling.sql? It tests the sample rate in
both the simple and extended query protocols, at both the top level and
nested levels.

> If anyone has the capability to run this benchmark on machines with more
> CPUs or with different queries, it would be nice. I'd appreciate any
> suggestions or feedback.

I wanted to share some additional benchmarks I ran on an r8g.48xlarge
(192 vCPUs, 1,536 GiB of memory) configured with 16GB of shared_buffers.
I have also attached the benchmark.sh script used to generate the output.

The benchmark runs the select-only pgbench workload, so there is a single
heavily contended entry, which is the worst case.

The test shows that the spinlock (SpinDelay waits) becomes an issue at
high connection counts and will become worse on larger machines. Going
from a sample_rate of 1 to .75 shows a 60% improvement, but this is on a
single contended entry. Most workloads will likely not see this type of
improvement. As expected, I also could not really observe this type of
difference on smaller machines (i.e. 32 vCPUs).

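To make the runs easier to compare, here is a minimal sketch in Python
that computes the relative tps change of each run against the
sample_rate = 1 baseline. The tps values are copied from the pgbench
output in this message; the names pct_change and summarize are
hypothetical helpers for illustration only and are not part of the patch
or of benchmark.sh.

```python
# tps values copied from the pgbench output in this message,
# keyed by sample_rate. Nothing here is measured by this script.
tps_192 = {
    1.0: 484338.769799,
    0.75: 909547.562124,
    0.5: 1028594.555273,
    0.25: 1019507.126313,
    0.0: 1015425.288538,
}
tps_32 = {
    1.0: 620667.049565,
    0.75: 620663.131347,
    0.5: 624094.688239,
    0.25: 628638.538204,
    0.0: 630483.464912,
}


def pct_change(baseline, value):
    """Relative change of value against baseline, in percent."""
    return (value - baseline) / baseline * 100.0


def summarize(tps, baseline_rate=1.0):
    """Map each sample_rate to its tps change versus the baseline run."""
    base = tps[baseline_rate]
    return {rate: round(pct_change(base, v), 1) for rate, v in tps.items()}


if __name__ == "__main__":
    for label, tps in (("192 connections", tps_192), ("32 connections", tps_32)):
        print(label)
        for rate, change in summarize(tps).items():
            print(f"  sample_rate = {rate}: {change:+.1f}% tps vs sample_rate = 1")
```

This only looks at tps. The wait counts would need the same treatment
separately, since the total number of samples differs between runs.
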
## init
pgbench -i -s500

### 192 connections
pgbench -c192 -j20 -S -Mprepared -T120 --progress 10

sample_rate = 1
tps = 484338.769799 (without initial connection time)
waits
-----
  11107 SpinDelay
   9568 CPU
    929 ClientRead
     13 DataFileRead
      3 BufferMapping

sample_rate = .75
tps = 909547.562124 (without initial connection time)
waits
-----
  12079 CPU
   4781 SpinDelay
   2100 ClientRead

sample_rate = .5
tps = 1028594.555273 (without initial connection time)
waits
-----
  13253 CPU
   3378 ClientRead
    174 SpinDelay

sample_rate = .25
tps = 1019507.126313 (without initial connection time)
waits
-----
  13397 CPU
   3423 ClientRead

sample_rate = 0
tps = 1015425.288538 (without initial connection time)
waits
-----
  13106 CPU
   3502 ClientRead

### 32 connections
pgbench -c32 -j20 -S -Mprepared -T120 --progress 10

sample_rate = 1
tps = 620667.049565 (without initial connection time)
waits
-----
   1782 CPU
    560 ClientRead

sample_rate = .75
tps = 620663.131347 (without initial connection time)
waits
-----
   1736 CPU
    554 ClientRead

sample_rate = .5
tps = 624094.688239 (without initial connection time)
waits
-----
   1741 CPU
    648 ClientRead

sample_rate = .25
tps = 628638.538204 (without initial connection time)
waits
-----
   1702 CPU
    576 ClientRead

sample_rate = 0
tps = 630483.464912 (without initial connection time)
waits
-----
   1638 CPU
    574 ClientRead

Regards,

Sami

Attachments:
sampling.sql (Description: Binary data)
benchmark.sh (Description: Bourne shell script)