Hi,

On 2025-01-31 03:30:35 -0800, Dmitry Koterov wrote:
> Debugging some replication lag on a replica when the master node
> experiences heavy writes.
>
> PG "startup recovering" eats up a lot of CPU (like 65 %user and 30 %sys),
> which is a little surprising (what is it doing with all those CPU cycles?
> it looked like WAL replay should be more IO bound than CPU bound?).
>
> Running "perf top -p <pid>", it shows this:
>
> Samples: 1M of event 'cycles:P', 4000 Hz, Event count (approx.):
> 18178814660 lost: 0/0 drop: 0/0
> Overhead  Shared Object  Symbol
>   16.63%  postgres       [.] hash_search_with_hash_value

It'd be interesting to see what the paths towards
hash_search_with_hash_value are. You said it's a COPY workload, which
surprises me a bit, because that should normally be a bit less sensitive to
it. Perhaps you have triggers or such that prevent use of the multi-insert
path?

>    5.38%  postgres       [.] __aarch64_ldset4_sync
>    4.42%  postgres       [.] __aarch64_cas4_acq_rel

These two suggest that it might be worth compiling with an -march value that
provides native atomics (armv8.1-a and above, I think).

> Maybe it's a red herring though, but it looks pretty suspicious.

It's unfortunately not too surprising - our buffer mapping table is a pretty
big bottleneck, both because a hash table is just not a good fit for the
buffer mapping table due to the lack of locality and because dynahash is
a really poor hash table implementation.

Greetings,

Andres Freund
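
To illustrate the -march point, here is a minimal C sketch of the two kinds of atomic operation that the __aarch64_ldset4_* and __aarch64_cas4_* symbols correspond to: an atomic fetch-or and a compare-and-swap on a 32-bit word. It is not PostgreSQL code. DemoBufferState, DEMO_FLAG_LOCKED and the demo_* functions are hypothetical names made up for this example, and it uses C11 <stdatomic.h> rather than PostgreSQL's own atomics wrappers. The assumption is a recent GCC on AArch64 Linux, where a build without an LSE-capable -march typically routes these operations through out-of-line helper functions, while -march=armv8.1-a lets the compiler emit the native instructions inline.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a 32-bit state word shared between processes. */
typedef struct DemoBufferState
{
    _Atomic uint32_t state;
} DemoBufferState;

#define DEMO_FLAG_LOCKED 0x1u

/* Atomically OR flag bits into the state word; returns the previous value. */
static inline uint32_t
demo_fetch_or(DemoBufferState *buf, uint32_t flags)
{
    return atomic_fetch_or(&buf->state, flags);
}

/*
 * Compare-and-swap: store newval only if the word still equals *expected.
 * On failure, *expected is overwritten with the value actually found.
 */
static inline bool
demo_compare_exchange(DemoBufferState *buf, uint32_t *expected,
                      uint32_t newval)
{
    return atomic_compare_exchange_strong(&buf->state, expected, newval);
}

/* Small driver: set the flag, clear it again via CAS, return final state. */
static inline uint32_t
demo_run(void)
{
    DemoBufferState buf;
    uint32_t        old;
    uint32_t        expected = DEMO_FLAG_LOCKED;
    bool            ok;

    atomic_init(&buf.state, 0);

    old = demo_fetch_or(&buf, DEMO_FLAG_LOCKED);
    assert(old == 0);           /* flag was not set before */

    ok = demo_compare_exchange(&buf, &expected, 0);
    assert(ok);                 /* nobody else changed the word */

    return atomic_load(&buf.state);
}
```

One way to compare the two builds, assuming the file is saved as demo.c, is to compile it once with "gcc -O2 -c demo.c" and once with "gcc -O2 -march=armv8.1-a -c demo.c", then inspect each object file with "objdump -d". The exact helper names and instructions depend on the compiler version and on the memory ordering requested, so treat the output as indicative only.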