Hi all,
With the help of Andrew and Dilip Kumar, I made a POC patch to dump all
the databases in archive format and then restore them using pg_restore.

Brief about the patch:
New option to pg_dumpall:
-F, --format=d|p             output file format: d (directory) or
                             p (plain text, the default)

Ex: ./pg_dumpall --format=directory --file=dumpDirName

The dump layout is:
global.dat ::: global SQL commands in plain-text format
map.dat    ::: "dboid dbname" entries for all databases, in plain text
databases/ :::
      subdir dboid1 -> toc.dat and data files in archive format
      subdir dboid2 -> toc.dat and data files in archive format
      etc.
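
For illustration, map.dat entries would look like this (the dboids here
are made up; actual oids will differ):
      5     postgres
      16384 mydb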
---------------------------------------------------------------------------

New options to pg_restore:
-g, --globals-only           restore only global objects, no databases
--exclude-database=PATTERN   exclude databases whose name matches PATTERN

When the -g/--globals-only option is given, only globals are restored; no
databases are restored.
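
For example, to restore only the globals from the dump, connecting
through the postgres database (a usage sketch based on the options
above):

Ex: ./pg_restore --format=directory -g -d postgres dumpDirName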

*Design*:
When --format=directory is specified and there is no toc.dat file in the
main directory, pg_restore checks for global.dat and map.dat to restore
all databases. If both files exist in the directory, it first restores
all globals from global.dat and then restores the databases one by one
from the map.dat list. While restoring, databases given with
--exclude-database are skipped.
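
For example, a full restore that skips databases matching a pattern
(the pattern here is only an illustration):

Ex: ./pg_restore --format=directory -d postgres --exclude-database='test*' dumpDirName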

---------------------------------------------------------------------------
NOTE:
If needed, a single database can be restored from its particular
subdirectory.

Ex: ./pg_restore --format=directory -d postgres dumpDirName/databases/5
   -- here, 5 is the dboid of the postgres database
   -- to find the dboid for a given dbname, refer to map.dat
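
To look up the dboid for a given dbname (assuming the "dboid dbname"
format of map.dat described above), a one-liner like the following can
be used:

Ex: awk '$2 == "postgres" {print $1}' dumpDirName/map.dat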
--------------------------------------------------------------------------

Please let me know your feedback on the attached patch.

On Tue, 11 Jun 2024 at 01:06, Magnus Hagander <mag...@hagander.net> wrote:

> On Mon, Jun 10, 2024 at 6:21 PM Tom Lane <t...@sss.pgh.pa.us> wrote:
>
>> Magnus Hagander <mag...@hagander.net> writes:
>> > On Mon, Jun 10, 2024 at 5:03 PM Nathan Bossart <nathandboss...@gmail.com>
>> > wrote:
>> >> Is there a particular advantage to that approach as opposed to just
>> >> using "directory" mode for everything?
>>
>> > A gazillion files to deal with? Much easier to work with individual
>> > custom files if you're moving databases around and things like that.
>> > Much easier to monitor eg sizes/dates if you're using it for backups.
>>
>> You can always tar up the directory tree after-the-fact if you want
>> one file.  Sure, that step's not parallelized, but I think we'd need
>> some non-parallelized copying to create such a file anyway.
>>
>
> That would require double the disk space.
>
> But you can also just run pg_dump manually on each database and a
> pg_dumpall -g like people are doing today -- I thought this whole thing was
> about making it more convenient :)
>
> --
>  Magnus Hagander
>  Me: https://www.hagander.net/
>  Work: https://www.redpill-linpro.com/
>


-- 
Thanks and Regards
Mahendra Singh Thalor
EnterpriseDB: http://www.enterprisedb.com

Attachment: v01_poc_pg_dumpall_with_directory_31dec.patch
Description: Binary data
