Hi all,

With the help of Andrew and Dilip Kumar, I made a PoC patch to dump all the databases in archive format and then restore them using pg_restore.
Brief about the patch:

New option to pg_dumpall:
  -F, --format=d|p         output file format (directory, plain text (default))

Ex: ./pg_dumpall --format=directory --file=dumpDirName

The dump directory is laid out as:
  global.dat  ::: global SQL commands in plain-text format
  map.dat     ::: "dboid dbname" entries for all databases, in plain text
  databases/  ::: one subdirectory per database
    dboid1/   -> toc.dat and data files in archive format
    dboid2/   -> toc.dat and data files in archive format
    ...

---------------------------------------------------------------------------

New options to pg_restore:
  -g, --globals-only           restore only global objects, no databases
  --exclude-database=PATTERN   exclude databases whose name matches PATTERN

With -g/--globals-only, only globals are restored; no databases are restored.

*Design*: when --format=directory is specified and there is no toc.dat file in the main directory, check for global.dat and map.dat. If both files exist in the directory, first restore all globals from global.dat, then restore the databases one by one from the map.dat list, skipping any databases given with --exclude-database.

---------------------------------------------------------------------------

NOTE: if needed, a single database can be restored from its particular subdirectory.
Ex: ./pg_restore --format=directory -d postgres dumpDirName/databases/5
-- here, 5 is the dboid of the postgres database
-- to get the dboid, look up the dbname in map.dat

---------------------------------------------------------------------------

Please let me know your feedback on the attached patch.
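To make the proposed usage concrete, here is a rough sketch of the workflow. The pg_dumpall/pg_restore invocations assume the patched binaries from this thread; `dboid_for_db` is a hypothetical helper I am adding here purely to illustrate the map.dat lookup, it is not part of the patch:

```shell
# Dump the whole cluster into a directory (patched pg_dumpall, as proposed):
#   ./pg_dumpall --format=directory --file=dumpDirName
# Restore globals only, then restore everything (patched pg_restore):
#   ./pg_restore --format=directory -g -d postgres dumpDirName
#   ./pg_restore --format=directory -d postgres dumpDirName

# map.dat holds "dboid dbname" lines, so the subdirectory for a single
# database can be found with a small lookup helper (hypothetical):
dboid_for_db() {
    # $1 = dump directory, $2 = database name; prints the dboid
    awk -v db="$2" '$2 == db { print $1 }' "$1/map.dat"
}

# e.g. restore just the postgres database by its subdirectory:
#   ./pg_restore --format=directory -d postgres \
#       "dumpDirName/databases/$(dboid_for_db dumpDirName postgres)"
```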
On Tue, 11 Jun 2024 at 01:06, Magnus Hagander <mag...@hagander.net> wrote:
> On Mon, Jun 10, 2024 at 6:21 PM Tom Lane <t...@sss.pgh.pa.us> wrote:
>> Magnus Hagander <mag...@hagander.net> writes:
>>> On Mon, Jun 10, 2024 at 5:03 PM Nathan Bossart <nathandboss...@gmail.com> wrote:
>>>> Is there a particular advantage to that approach as opposed to just
>>>> using "directory" mode for everything?
>>
>>> A gazillion files to deal with? Much easier to work with individual
>>> custom files if you're moving databases around and things like that.
>>> Much easier to monitor eg sizes/dates if you're using it for backups.
>>
>> You can always tar up the directory tree after-the-fact if you want
>> one file. Sure, that step's not parallelized, but I think we'd need
>> some non-parallelized copying to create such a file anyway.
>
> That would require double the disk space.
>
> But you can also just run pg_dump manually on each database and a
> pg_dumpall -g like people are doing today -- I thought this whole thing
> was about making it more convenient :)
>
> --
> Magnus Hagander
> Me: https://www.hagander.net/
> Work: https://www.redpill-linpro.com/

--
Thanks and Regards
Mahendra Singh Thalor
EnterpriseDB: http://www.enterprisedb.com
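(Side note on the single-file point raised in the quoted thread: tarring the directory dump after the fact is trivial, though, as noted, it is not parallelized and temporarily needs space for both the directory and the tarball. A purely illustrative helper, not part of the patch:)

```shell
archive_dump_dir() {
    # $1 = dump directory produced by pg_dumpall --format=directory;
    # writes $1.tar.gz next to it
    tar -czf "$1.tar.gz" "$1"
}

# e.g.: archive_dump_dir dumpDirName && rm -rf dumpDirName
```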
v01_poc_pg_dumpall_with_directory_31dec.patch
Description: Binary data