Backpatch critical performance fixes to pgarch.c

This backpatches commits beb4e9ba1652 and 1fb17b190341 (originally
appearing in REL_15_STABLE) to REL_14_STABLE.  Performance of the WAL
archiver can become pretty critical at times, and reports exist of users
getting in serious trouble (hours of downtime, loss of replicas) for
lack of this optimization.

We'd like to backpatch these to REL_13_STABLE too, but because of the
very invasive changes made by commit d75288fb27b8 in the 14 timeframe,
we deem it too risky :-(

Original commit messages appear below.

Discussion: https://postgr.es/m/202411131605.m66syq5i5ucl@alvherre.pgsql

commit beb4e9ba1652a04f66ff20261444d06f678c0b2d
Author:     Robert Haas <rh...@postgresql.org>
AuthorDate: Thu Nov 11 15:02:53 2021 -0500

    Improve performance of pgarch_readyXlog() with many status files.

    Presently, the archive_status directory was scanned for each file to
    archive.  When there are many status files, say because archive_command
    has been failing for a long time, these directory scans can get very
    slow.  With this change, the archiver remembers several files to archive
    during each directory scan, speeding things up.

    To ensure timeline history files are archived as quickly as possible,
    XLogArchiveNotify() forces the archiver to do a new directory scan as
    soon as the .ready file for one is created.

    Nathan Bossart, per a long discussion involving many people. It is
    not clear to me exactly who out of all those people reviewed this
    particular patch.

    Discussion: http://postgr.es/m/CA+TgmobhAbs2yabTuTRkJTq_kkC80-+jw=pfpypdoj7+gab...@mail.gmail.com
    Discussion: http://postgr.es/m/620f3ce1-0255-4d66-9d87-0eade8669...@amazon.com

commit 1fb17b1903414676bd371068739549cd2966fe87
Author:     Tom Lane <t...@sss.pgh.pa.us>
AuthorDate: Wed Dec 29 17:02:50 2021 -0500

    Fix issues in pgarch's new directory-scanning logic.

    The arch_filenames[] array elements were one byte too small, so that
    a maximum-length filename would get corrupted if another entry
    were made after it.  (Noted by Thomas Munro, fix by Nathan Bossart.)

    Move these arrays into a palloc'd struct, so that we aren't wasting
    a few kilobytes of static data in each non-archiver process.

    Add a binaryheap_reset() call to make it plain that we start the
    directory scan with an empty heap.
    I don't think there's any live bug of that sort, but it seems fragile,
    and this is very cheap insurance.

    Cleanup for commit beb4e9ba1, so no back-patch needed.

    Discussion: https://postgr.es/m/ca+hukglhajhukuwtzsw7umjf4bvpcqrl-umzg_hm-g0y7yl...@mail.gmail.com

Branch
------
REL_14_STABLE

Details
-------
https://git.postgresql.org/pg/commitdiff/4abf615cc8a8ca80430b5a0bfa18be6efcea96b2

Modified Files
--------------
src/backend/access/transam/xlogarchive.c |  14 +++
src/backend/postmaster/pgarch.c          | 208 +++++++++++++++++++++++++++----
src/include/postmaster/pgarch.h          |   1 +
3 files changed, 197 insertions(+), 26 deletions(-)
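
For readers who want a feel for the technique described in the quoted
commit messages, here is a rough standalone sketch of "remember several
files per directory scan".  It is not the code in pgarch.c: the names
ArchiveBatch, batch_scan, batch_next, MAX_NAME_CHARS and BATCH_SIZE are
made up for illustration, the directory is simulated by an array of
strings, and a simple linear search for the lowest-priority entry stands
in for the binary heap that the real patch uses.  The ordering rule
(history files first, then names in strcmp order) is likewise an
assumption.  Note the "+ 1" on the name buffers, which is the kind of
sizing the second commit message is about.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical limits chosen for the sketch, not taken from the patch. */
#define MAX_NAME_CHARS 40       /* longest file name we accept */
#define BATCH_SIZE 4            /* files remembered per directory scan */

typedef struct ArchiveBatch
{
    /* + 1 leaves room for the NUL terminator of a maximum-length name */
    char        names[BATCH_SIZE][MAX_NAME_CHARS + 1];
    int         count;          /* entries filled in by the last scan */
    int         next;           /* next entry to hand out */
} ArchiveBatch;

static bool
is_history_file(const char *name)
{
    size_t      len = strlen(name);

    return len > 8 && strcmp(name + len - 8, ".history") == 0;
}

/*
 * Negative result means "a" should be archived before "b".  Assumed
 * rule: history files first, then plain strcmp order.
 */
static int
priority_cmp(const void *pa, const void *pb)
{
    const char *a = pa;
    const char *b = pb;
    bool        ah = is_history_file(a);
    bool        bh = is_history_file(b);

    if (ah != bh)
        return ah ? -1 : 1;
    return strcmp(a, b);
}

/* One "directory scan": keep only the BATCH_SIZE best candidates. */
static void
batch_scan(ArchiveBatch *batch, const char *const *dir, int nfiles)
{
    assert(batch != NULL && nfiles >= 0);

    batch->count = 0;           /* always start from an empty batch */
    batch->next = 0;

    for (int i = 0; i < nfiles; i++)
    {
        int         slot;

        if (strlen(dir[i]) > MAX_NAME_CHARS)
            continue;           /* would not fit in the buffer */

        if (batch->count < BATCH_SIZE)
            slot = batch->count++;
        else
        {
            /* locate the lowest-priority entry kept so far */
            slot = 0;
            for (int j = 1; j < BATCH_SIZE; j++)
                if (priority_cmp(batch->names[j], batch->names[slot]) > 0)
                    slot = j;
            if (priority_cmp(dir[i], batch->names[slot]) >= 0)
                continue;       /* candidate is no better; drop it */
        }
        strcpy(batch->names[slot], dir[i]);     /* length checked above */
    }

    /* hand files out in priority order */
    qsort(batch->names, (size_t) batch->count,
          sizeof(batch->names[0]), priority_cmp);
}

/* Next remembered file, or NULL when a fresh scan is needed. */
static const char *
batch_next(ArchiveBatch *batch)
{
    return batch->next < batch->count ? batch->names[batch->next++] : NULL;
}
```

A caller would loop on batch_next() and call batch_scan() again only when
it returns NULL, so a backlog of N files costs roughly N / BATCH_SIZE
scans instead of N.  Forcing an early rescan when a history file shows
up, as the first commit message describes, would amount to discarding
the remembered batch; that part is left out here.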