pg_dump: Retrieve attribute statistics in batches.

Currently, pg_dump gathers attribute statistics with a query per
relation, which can cause pg_dump to take significantly longer,
especially when there are many relations.  This commit addresses this
by teaching pg_dump to gather attribute statistics for 64 relations
at a time.  Some simple tests showed this was the optimal batch size,
but performance may vary depending on the workload.
Our lookahead code determines the next batch of relations by searching
the TOC sequentially for relevant entries.  This approach assumes that
we will dump all such entries in TOC order, which unfortunately isn't
true for dump formats that use RestoreArchive().  RestoreArchive()
does multiple passes through the TOC and selectively dumps certain
groups of entries each time.  This is particularly problematic for
index stats and a subset of matview stats; both are in
SECTION_POST_DATA, but matview stats that depend on matview data are
dumped in RESTORE_PASS_POST_ACL, while all other stats are dumped in
RESTORE_PASS_MAIN.  To handle this, this commit moves all statistics
data entries in SECTION_POST_DATA to RESTORE_PASS_POST_ACL, which
ensures that we always dump them in TOC order.  A convenient side
effect of this change is that we can revert a decent chunk of commit
a0a4601765, but that is left for a follow-up commit.

Author: Corey Huinker <corey.huin...@gmail.com>
Co-authored-by: Nathan Bossart <nathandboss...@gmail.com>
Reviewed-by: Jeff Davis <pg...@j-davis.com>
Discussion: https://postgr.es/m/CADkLM%3Dc%2Br05srPy9w%2B-%2BnbmLEo15dKXYQ03Q_xyK%2BriJerigLQ%40mail.gmail.com

Branch
------
master

Details
-------
https://git.postgresql.org/pg/commitdiff/9c02e3a986daa865ecdc2e3d8183e2d83b8f4824

Modified Files
--------------
src/bin/pg_dump/pg_backup.h          |   5 +-
src/bin/pg_dump/pg_backup_archiver.c |  29 +++----
src/bin/pg_dump/pg_dump.c            | 147 ++++++++++++++++++++++++++++++-----
3 files changed, 142 insertions(+), 39 deletions(-)