On Saturday, July 15, 2023 7:47:12 PM CEST Tom Lane wrote:
> I'm not sure how big a deal this is in practice: in most situations
> the individual jobs are larger than they are in this toy example,
> plus the initial non-parallelizable part of the restore is a bigger
> bottleneck anyway with this many tables. Still, we do have one
> real-world complaint, so maybe we should look into improving it.
Hi,

For what it's worth, at my current job it's kind of a big deal. I was about to start looking into the bad pg_restore performance I get on some databases with over 50k tables (in 200 namespaces) when I found this thread.

The dump weighs in at about 2.8 GB, the toc.dat file is 230 MB, with 50,120 tables, 142,069 constraints and 73,669 indexes.

HEAD pg_restore duration: 30 minutes
pg_restore with the latest patch from Nathan Bossart: 23 minutes

This is indeed better, but there is still a lot of room for improvement. With such use cases, I was able to go much faster using the patched pg_restore driven by a script that parallelizes over schemas instead of relying on the choices made by pg_restore itself. It seems that parallelizing only the data loading leaves nice speedup opportunities on the table when there is a huge number of objects.

patched pg_restore + parallel restore of schemas: 10 minutes

Anyway, the patch works really fine as is, and I will certainly keep trying future iterations.

Regards,
Pierre
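For readers curious what "parallelize on each schema" might look like, here is a minimal sketch (not Pierre's actual script; the dump path, database name, and helper names are all assumptions): it runs one `pg_restore -n <schema>` per schema, a few at a time.

```python
# Hypothetical sketch: dispatch one pg_restore per schema, `jobs` at a time,
# instead of relying on pg_restore's internal parallel scheduling.
import subprocess
from concurrent.futures import ThreadPoolExecutor


def restore_cmd(dump, db, schema):
    # Restrict a single pg_restore invocation to one schema with -n.
    return ["pg_restore", "-d", db, "-n", schema, dump]


def restore_schemas(dump, db, schemas, jobs=8, run=subprocess.run):
    # `run` is injectable so the dispatch logic can be exercised without a
    # real database; by default it execs pg_restore for each schema.
    with ThreadPoolExecutor(max_workers=jobs) as pool:
        return list(pool.map(lambda s: run(restore_cmd(dump, db, s)), schemas))
```

The schema list could come from `pg_restore -l` output or from `pg_namespace` on the source cluster; error handling and cross-schema ordering constraints (foreign keys between schemas, shared extensions) are deliberately left out of this sketch.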