Nathan Bossart <nathandboss...@gmail.com> writes:
> On Thu, Mar 06, 2025 at 01:47:34PM -0500, Tom Lane wrote:
>> ... I wonder if we could just rip out pg_upgrade's support
>> for DB-level parallelism, which is not terribly pretty anyway, and
>> simply pass the -j switch straight to pg_dump and pg_restore.
> That would certainly help for clusters with one big database with many LOs
> or something, but I worry it would hurt the many database case quite a bit.

I'm very skeptical of that.  How many DBs do you know with just one
table?  I think most have enough that they could keep a reasonable
number of CPUs busy with pg_dump's internal parallelism.

> Maybe we could add a --jobs-per-db option that indicates how to parallelize
> dump/restore.  If you set --jobs=8 --jobs-per-db=8, the databases would be
> dumped serially, but pg_dump would get -j8.  If you set --jobs=8 and
> --jobs-per-db=2, we'd process 4 databases at a time, each with -j2.

I specifically didn't propose such a thing because I think it will be
a sucky user experience.  In the first place, users are unlikely to
take the time to puzzle out exactly how they should slice that up;
in the second place, if they try they won't necessarily find that
there's a good solution with those knobs; in the third place,
pg_upgrade is commonly invoked through packager-supplied scripts that
might not give access to those switches anyway.

In the short term I think repurposing -j as meaning within-DB
parallelism rather than cross-DB parallelism would be a win for the
vast majority of users.  We could imagine some future feature that
lets pg_upgrade try to slice up the available jobs on its own (say,
based on a preliminary survey of how many tables in each DB).  But
I don't want to build that today, and maybe we won't ever.

regards, tom lane
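
For readers following the arithmetic in the quoted --jobs-per-db proposal, here is a minimal sketch of how the two knobs might divide the job budget. It is not pg_upgrade code: --jobs-per-db is only a proposed switch, and the function name split_jobs and its rounding behaviour are assumptions made for illustration.

```python
def split_jobs(jobs: int, jobs_per_db: int) -> tuple[int, int]:
    """Hypothetical helper: split a total job budget between databases.

    Returns (number of databases processed concurrently,
             -j value passed to each pg_dump/pg_restore).
    The rounding rule (integer division) is an assumption, not
    anything specified in the thread.
    """
    if jobs < 1 or jobs_per_db < 1:
        raise ValueError("jobs and jobs_per_db must both be >= 1")
    # A single database can never get more workers than the total budget.
    per_db = min(jobs_per_db, jobs)
    # Whatever is left over after integer division simply goes unused.
    concurrent_dbs = max(1, jobs // per_db)
    return concurrent_dbs, per_db


if __name__ == "__main__":
    for jobs, per_db in [(8, 8), (8, 2), (8, 3)]:
        dbs, j = split_jobs(jobs, per_db)
        print(f"--jobs={jobs} --jobs-per-db={per_db}: "
              f"{dbs} database(s) at a time, each with -j{j}")
```

Under these assumptions, (8, 8) gives one database at a time with -j8 and (8, 2) gives four databases with -j2, matching the quoted examples. A setting such as (8, 3) would leave two of the eight jobs idle, which is one way the knobs can fail to yield a good split.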