I'm given a database dump file daily and have been asked to restore it.
I've tried everything I could to speed up the process, including using -j 40.
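
For reference, the restore invocation looks roughly like this (database and
file names are placeholders rather than my exact ones):

  pg_restore -j 40 -d <database> <dumpfile>.dump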

I discovered that at a later stage of the restore process, the following
behaviour repeated a few times:
40 x pg_restore processes at 100% CPU
40 x postgres processes doing COPY but using 0% CPU
..... and zero disk write activity
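
For anyone who wants to check the same thing on their side, something along
these lines shows the picture (commands are approximate; <pid> is any of the
stuck pg_restore workers):

  # per-process CPU and state for the restore workers and backends
  ps -C pg_restore,postgres -o pid,pcpu,stat,wchan,args

  # syscall summary for one worker - expect lots of lseek/read, few writes
  strace -p <pid> -e trace=lseek,read,write -c

  # confirm the lack of disk write activity
  iostat -x 1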

I don't see this behaviour when restoring a database that was dumped with
-Fd.
Also, with an un-piped backup file, I can restore a specific table without
having to wait for hours.
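
For comparison, the directory-format path that behaves well for me looks
roughly like this (names and thread counts are illustrative):

  # parallel dump to a directory, then parallel restore
  pg_dump -Fd -j 8 -f <dumpdir> -d <database>
  pg_restore -j 40 -d <database> <dumpdir>

  # and from a -Fc dump written straight to a file (not piped),
  # a single table comes back quickly
  pg_restore -t <table> -d <database> <dumpfile>.dump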


--





On Fri, 19 Sept 2025 at 01:54, Adrian Klaver <adrian.kla...@aklaver.com>
wrote:

> On 9/18/25 05:58, R Wahyudi wrote:
> > Hi All,
> >
> > Thanks for the quick and accurate response! I've never been so happy to
> > see IOwait on my system!
>
> Because?
>
> What did you find?
>
> >
> > I might be blind, as I can't find information about 'offset' in the
> > pg_dump documentation.
> > Where can I find more info about this?
>
> It is not in the user documentation.
>
>  From the thread Ron referred to, there is an explanation here:
>
> https://www.postgresql.org/message-id/366773.1756749256%40sss.pgh.pa.us
>
> I believe the actual code, for the -Fc format, is in pg_backup_custom.c
> here:
>
>
> https://github.com/postgres/postgres/blob/master/src/bin/pg_dump/pg_backup_custom.c#L723
>
> Per comment at line 755:
>
> "
>   If possible, re-write the TOC in order to update the data offset
> information.  This is not essential, as pg_restore can cope in most
> cases without it; but it can make pg_restore significantly faster
> in some situations (especially parallel restore).  We can skip this
> step if we're not dumping any data; there are no offsets to update
> in that case.
> "
>
> >
> > Regards,
> > Rianto
> >
> > On Wed, 17 Sept 2025 at 13:48, Ron Johnson <ronljohnso...@gmail.com> wrote:
> >
> >
> >     PG 17 has integrated zstd compression, while --format=directory lets
> >     you do multi-threaded dumps.  That's much faster than a single-
> >     threaded pg_dump into a multi-threaded compression program.
> >
> >     (If for _Reasons_ you require a single-file backup, then tar the
> >     directory of compressed files using the --remove-files option.)
> >
> >     On Tue, Sep 16, 2025 at 10:50 PM R Wahyudi <rwahy...@gmail.com> wrote:
> >
> >         Sorry for not including the full command - yes, it's piping to a
> >         compression command:
> >           | lbzip2 -n <threadsforbzipgoeshere> --best > <filenamegoeshere>
> >
> >
> >         I think we found the issue! I'll do further testing and see how
> >         it goes !
> >
> >
> >
> >
> >
> >         On Wed, 17 Sept 2025 at 11:02, Ron Johnson
> >         <ronljohnso...@gmail.com> wrote:
> >
> >             So, piping or redirecting to a file?  If so, then that's the
> >             problem.
> >
> >             pg_dump directly to a file puts file offsets in the TOC.
> >
> >             This is how I do custom dumps:
> >             cd $BackupDir
> >             pg_dump -Fc --compress=zstd:long -v -d${db} -f ${db}.dump
> >               2> ${db}.log
> >
> >             On Tue, Sep 16, 2025 at 8:54 PM R Wahyudi
> >             <rwahy...@gmail.com> wrote:
> >
> >                 pg_dump was done using the following command :
> >                 pg_dump -Fc -Z 0 -h <host> -U <user> -w -d <database>
> >
> >                 On Wed, 17 Sept 2025 at 08:36, Adrian Klaver
> >                 <adrian.kla...@aklaver.com> wrote:
> >
> >                     On 9/16/25 15:25, R Wahyudi wrote:
> >                      >
> >                      > I'm trying to troubleshoot the slowness issue
> >                     with pg_restore and
> >                      > stumbled across a recent post about pg_restore
> >                     scanning the whole file :
> >                      >
> >                      >  > "scanning happens in a very inefficient way,
> >                     with many seek calls and
> >                      > small block reads. Try strace to see them. This
> >                     initial phase can take
> >                      > hours in a huge dump file, before even starting
> >                     any actual restoration."
> >                      > see :
> >                      > https://www.postgresql.org/message-id/E48B611D-7D61-4575-A820-B2C3EC2E0551%40gmx.net
> >
> >                     This was for pg_dump output that was streamed to a
> >                     Borg archive and as a result had no object offsets in
> >                     the TOC.
> >
> >                     How are you doing your pg_dump?
> >
> >
> >
> >                     --
> >                     Adrian Klaver
> >                     adrian.kla...@aklaver.com
> >
> >
> >
> >             --
> >             Death to <Redacted>, and butter sauce.
> >             Don't boil me, I'm still alive.
> >             <Redacted> lobster!
> >
> >
> >
> >     --
> >     Death to <Redacted>, and butter sauce.
> >     Don't boil me, I'm still alive.
> >     <Redacted> lobster!
> >
>
>
> --
> Adrian Klaver
> adrian.kla...@aklaver.com
>
