On Fri, Nov 9, 2012 at 3:06 PM, Bruce Momjian <br...@momjian.us> wrote:
> On Thu, Nov  8, 2012 at 08:59:21PM -0800, Jeff Janes wrote:
>> On Thu, Nov 8, 2012 at 4:33 PM, Bruce Momjian <br...@momjian.us> wrote:
>> >
>> > I am actually now dumping git head/9.3, so I assume all the problems we
>> > know about should be fixed.
>>
Are you sure the server you are dumping out of is head?
>
> Well, I tested again with 9.2 dumping/loading 9.2 and the same for git
> head, and got these results:
>
>                        pg_dump            restore
>        tables      9.2       git       9.2       git
>
>             1      0.13      0.11      0.07      0.07
>          1000      4.37      3.98      4.32      5.28
>          2000     12.98     12.19     13.64     14.25
>          4000     47.85     50.14     61.31     70.97
>          8000    210.39    183.00    302.67    294.20
>         16000    901.53    769.83   1399.25   1359.09
>
>        (times in seconds)

For pg_dump, there are four possible combinations, not just two: you
can use 9.2's pg_dump to dump from a 9.2 server, use git's pg_dump to
dump from a 9.2 server, use git's pg_dump to dump from a git server,
or use 9.2's pg_dump to dump from a git server (although that last
one isn't very relevant).
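
A minimal sketch of timing all four, assuming hypothetical install
paths and ports for the two builds:

    # Paths and ports are hypothetical stand-ins for the two installs.
    for dumper in /opt/pg92/bin/pg_dump /opt/pghead/bin/pg_dump; do
        for port in 5492 5493; do    # one 9.2 server, one git-head server
            echo "== $dumper against port $port =="
            time "$dumper" --schema-only -p "$port" test > /dev/null
        done
    done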

>
> As you can see, there is very little difference between 9.2 and git
> head, except maybe at the 16k level for pg_dump.
>
> Is there some slowdown with a mismatched version dump/reload?  I am
> attaching my test script.

Sorry, from the script I can't really tell what versions are being
used for what.
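
One way to make it unambiguous is to have the script print the
versions in play before each timed run, e.g. (variable names
hypothetical, standing in for whatever the script uses):

    # Print client and server versions before each timed run.
    "$PG_DUMP" --version
    "$PSQL" -p "$PORT" -c 'SELECT version();'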


>
>> Using head's pg_dump, but 9.2.1 server, it takes me 179.11 seconds to
>> dump 16,000 tables (schema only) like your example, and it is
>> definitely quadratic.
>
> Are you using a SERIAL column for the tables?  I am, and Peter
> Eisentraut reported that was a big slowdown.

Yes, I'm using the same table definition as your example.
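
For reference, a generator in that spirit (hypothetical; the real
definition is whatever your script creates):

    # Create 16,000 tables, each with a SERIAL column; each SERIAL
    # also creates one sequence, so pg_dump has 16,000 sequences to read.
    for i in $(seq 1 16000); do
        echo "CREATE TABLE t$i (x SERIAL);"
    done | psql -q test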


>
>> But using head's pg_dump to dump tables out of head's server, it only
>> took 24.95 seconds, and the quadratic term is not yet important,
>> things still look linear.
>
> Again, using SERIAL?

Yep.

>> Are the next value, increment, etc. for a sequence stored in a catalog,
>> or are they stored in the 8kB file associated with each sequence?  If
>
> Each sequence is stored in its own 1-row 8k table:
>
>         test=> CREATE SEQUENCE seq;
>         CREATE SEQUENCE
>
>         test=> SELECT * FROM seq;
>         -[ RECORD 1 ]-+--------------------
>         sequence_name | seq
>         last_value    | 1
>         start_value   | 1
>         increment_by  | 1
>         max_value     | 9223372036854775807
>         min_value     | 1
>         cache_value   | 1
>         log_cnt       | 0
>         is_cycled     | f
>         is_called     | f
>
>> they are stored in the file, then it is a shame that pg_dump goes to
>> the effort of extracting that info if pg_upgrade is just going to
>> overwrite it anyway.
>
> Actually, pg_upgrade needs pg_dump to restore all those sequence values.

I did an experiment where I had pg_dump just output dummy values
rather than hitting the database.  Once pg_upgrade moves the relation
files over, the dummy values disappear and the original values are
back.  So I think that pg_upgrade depends on pg_dump only in a
trivial way: the sequence values need to be present in the dump, but
it doesn't matter what they are.
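
For reference, what pg_dump emits per sequence looks roughly like the
comment below (a sketch; exact output varies by version).  The setval
arguments are the values my patched pg_dump faked:

    # Roughly what a schema-only dump contains for one sequence:
    #
    #   CREATE SEQUENCE seq
    #       START WITH 1
    #       INCREMENT BY 1
    #       NO MINVALUE
    #       NO MAXVALUE
    #       CACHE 1;
    #
    #   SELECT pg_catalog.setval('seq', 1, false);
    pg_dump --schema-only -t seq test | grep setval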

Cheers,

Jeff

