Hi all. I've been configuring and tuning a new server running PostgreSQL 15.3, but I'm struggling with latency that I consistently see on this new server when running fast, short queries, compared to our other server.
We're running two different versions of PostgreSQL:

- Server A: PostgreSQL 9.3
- Server B: PostgreSQL 15.3

Server B is the new server and is considerably more powerful than server A:

- Server A: 1x Intel Xeon E3-1270 3.5GHz, 2x 8GB DDR3, RAID0
- Server B: 2x Intel Xeon Platinum 8260 2.4GHz, 4x 16GB DDR4, RAID1

Server B runs Ubuntu 20.04, and I've been tweaking settings in both Linux and PostgreSQL 15.3. With the current setup, PostgreSQL 15.3 can process more than 1 million transactions per second with pgbench:

# pgbench --username postgres --select-only --client 100 --jobs 10 --time 20 test
pgbench (15.3 (Ubuntu 15.3-1.pgdg20.04+1))
starting vacuum...end.
transaction type: <builtin: select only>
scaling factor: 1
query mode: simple
number of clients: 100
number of threads: 10
maximum number of tries: 1
duration: 20 s
number of transactions actually processed: 23039950
number of failed transactions: 0 (0.000%)
latency average = 0.087 ms
initial connection time = 62.536 ms
tps = 1155053.135317 (without initial connection time)

As pgbench shows, throughput is great. When testing individual queries, heavy queries (those taking a few ms) also run faster on server B than on server A. Unfortunately, for fast, short SELECT queries (< 1 ms), server A is consistently faster than server B, even though the query plans are equivalent:

Server A:

# EXPLAIN (ANALYZE, BUFFERS, VERBOSE) SELECT 1 AS "a" FROM "foobar" WHERE ("foobar"."id" = 1) LIMIT 1;
                                                                  QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.42..8.44 rows=1 width=0) (actual time=0.008..0.008 rows=1 loops=1)
   Output: (1)
   Buffers: shared hit=5
   ->  Index Only Scan using foobar_pkey on public.foobar  (cost=0.42..8.44 rows=1 width=0) (actual time=0.008..0.008 rows=1 loops=1)
         Output: 1
         Index Cond: (foobar.id = 1)
         Heap Fetches: 1
         Buffers: shared hit=5
 Total runtime: 0.017 ms
(9 rows)

Time: 0.281 ms

Server B:

# EXPLAIN (ANALYZE, BUFFERS, VERBOSE) SELECT 1 AS "a" FROM "foobar" WHERE ("foobar"."id" = 1) LIMIT 1;
                                                                  QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.00..1.11 rows=1 width=4) (actual time=0.019..0.021 rows=1 loops=1)
   Output: 1
   Buffers: shared hit=4
   ->  Index Only Scan using foobar_pkey on public.foobar  (cost=0.00..1.11 rows=1 width=4) (actual time=0.018..0.018 rows=1 loops=1)
         Output: 1
         Index Cond: (foobar.id = 1)
         Heap Fetches: 0
         Buffers: shared hit=4
 Planning Time: 0.110 ms
 Execution Time: 0.045 ms
(10 rows)

Time: 0.635 ms

RAID1 could add some latency on server B if it were reading from disk, but I've confirmed that these queries hit the buffer cache and therefore read from memory, not from disk. I checked the hit rate with the following query:

SELECT 'cache hit rate' AS name,
       sum(heap_blks_hit) / (sum(heap_blks_hit) + sum(heap_blks_read)) AS ratio
FROM pg_statio_user_tables;

The hit rate was over 95%, and it increased as soon as I ran those queries. The same holds for the index hit rate. I've also played with several PostgreSQL parameters, decreasing/increasing the number of workers, shared_buffers, work_mem, JIT, the cpu_*_cost variables, etc., but nothing helped reduce that latency.
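For anyone who wants to reproduce the comparison, something like this isolates the short query with pgbench (the script file name is just an example, and the database name may need adjusting; -r reports the average per-statement latency):

# echo 'SELECT 1 AS "a" FROM "foobar" WHERE ("foobar"."id" = 1) LIMIT 1;' > short_select.sql
# pgbench --username postgres --file short_select.sql -r --client 1 --time 20 test

Running it with a single client takes concurrency and connection setup out of the picture, so the per-statement latency line can be compared directly between the two servers.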
Here are the settings I'm currently using with PostgreSQL 15.3, after a lot of experimenting with different values:

checkpoint_completion_target = 0.9
checkpoint_timeout = 900
cpu_index_tuple_cost = 0.00001
cpu_operator_cost = 0.00001
effective_cache_size = 12GB
effective_io_concurrency = 200
jit = off
listen_addresses = 'localhost'
maintenance_work_mem = 1GB
max_connections = 100
max_parallel_maintenance_workers = 4
max_parallel_workers = 12
max_parallel_workers_per_gather = 4
max_wal_size = 4GB
max_worker_processes = 12
min_wal_size = 1GB
random_page_cost = 1.1
shared_buffers = 4GB
ssl = off
timezone = 'UTC'
wal_buffers = 16MB
work_mem = 64MB

Some notes about those settings:

- We're running other services on this server; that's why I'm not giving PostgreSQL more resources.
- Tweaking the cpu_*_cost parameters was crucial to improving the query plans. With the default values, PostgreSQL consistently chose a slower plan.

I've been looking at some Linux settings as well:

- Swappiness is set to the lowest safe value: vm.swappiness = 1
- Huge pages are not in use, and Transparent Huge Pages (THP) is set to 'madvise'. PostgreSQL 15.3 is using the default value for the huge_pages parameter: 'try'.
- The memory overcommit policy is set to 1: vm.overcommit_memory = 1

I also experimented with huge pages, to try to force PostgreSQL to use them. I manually allocated the number of huge pages reported by this query:

SHOW shared_memory_size_in_huge_pages;

I confirmed that huge pages were being used by PostgreSQL, but unfortunately I didn't see any improvement in latency or performance, so I set everything back to the previous state (the exact steps are in the P.S. below).

Conclusion: the latency is quite low on both servers, but when you run dozens or hundreds of fast, short queries concurrently, the difference adds up, and server A ends up 0.1-1.0 seconds faster than server B on aggregate. As you can see, server B has two CPU sockets, so Linux treats it as a NUMA system, and its CPU clock is slower than server A's. Could either of those be causing the extra latency?

Any suggestions or ideas on where to look? I'd really appreciate your help. Thank you.
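P.S. For reference, here is a sketch of the huge pages experiment (later reverted); <N> is a placeholder for the value reported by SHOW shared_memory_size_in_huge_pages;:

# sysctl -w vm.nr_hugepages=<N>

After that, set huge_pages = on in postgresql.conf, restart the cluster, and verify the allocation with:

# grep Huge /proc/meminfo

And regarding NUMA, a test I'm considering but haven't run yet: pin PostgreSQL to a single socket and its local memory, to rule out cross-node memory access. The data directory below is a placeholder, and on Ubuntu the cluster is normally started via pg_ctlcluster/systemd rather than plain pg_ctl:

# numactl --hardware
# numactl --cpunodebind=0 --membind=0 pg_ctl -D /path/to/data start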